You only need to store 0.5 sqrt(N) keys (the baby steps) : from 1*G to 0.5*sqrt(N)*G
And next step is to calculate required memory for 0.5*sqrt(2^134) points.
After this calculation you will lose all your interest in BSGS

Or may be you are going to break a lot of <64bit points? Even so, kang is faster because you can have precomputed db of DPs and reduce time in 10 times or even more.
If the goal is to retrieve a private key in a 2^134 interval, there is no doubt: BSGS is not the algorithm to use.
But in a 2^64 interval, BSGS can be faster.
For one 64bit point without precomputing - yes. For many points - kang with precomputed DPs is faster.
Of course you can do the same for BSGS, precompute more points to speedup it, but it won't be so effective as for kang, to speedup BSGS in 10 times you will have to precompute (and store) much more points than for kang (because BSGS does not use birthday paradox sqrt).
Stop promoting garbage. If you precompute certain values, you might lose the effect of the birthday paradox in the sense that you are reducing randomness and possible combinations. Precomputation can make the problem more deterministic and less dependent on the random coincidences that make the birthday paradox so surprising. You are leading people into a void.