Unfortunately, that post also contains some significant errors, even in its basic explanation of how scrypt works. If that explanation were correct, it would not be possible to store less than the full 128 kB buffer and exploit the obvious time-memory tradeoff (TMTO), which mtrlt nicknamed the "lookup gap", a term anyone mining Litecoin on GPUs will probably recognize. Every GPU scrypt miner exploits exactly that TMTO, speeding up the process by not storing the full 128 kB.
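
To make the lookup gap concrete, here is a rough Python sketch of the idea, not code from any actual miner. It assumes a simplified ROMix-style loop: `mix` is a SHA-256 stand-in for scrypt's real BlockMix (Salsa20/8), blocks are 32 bytes instead of Litecoin's 128 bytes, and `integerify` is simplified. The names `romix_full` and `romix_gap` are made up for illustration. The point is only that keeping every `gap`-th entry and recomputing the rest gives the same result as keeping the whole buffer.

```python
import hashlib


def mix(block: bytes) -> bytes:
    # Stand-in for scrypt's BlockMix (Salsa20/8). SHA-256 is an assumption
    # made purely to keep this sketch short and self-contained.
    return hashlib.sha256(block).digest()


def integerify(block: bytes, n: int) -> int:
    # Simplified: turn part of the block into an index in [0, n).
    return int.from_bytes(block[:8], "little") % n


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))


def romix_full(x: bytes, n: int) -> bytes:
    # Phase 1: fill the whole buffer V[0..n-1].
    v = []
    for _ in range(n):
        v.append(x)
        x = mix(x)
    # Phase 2: n data-dependent lookups into V.
    for _ in range(n):
        j = integerify(x, n)
        x = mix(xor(x, v[j]))
    return x


def romix_gap(x: bytes, n: int, gap: int) -> bytes:
    # Phase 1: store only every gap-th entry, so memory shrinks by ~gap.
    v = []
    for i in range(n):
        if i % gap == 0:
            v.append(x)
        x = mix(x)
    # Phase 2: rebuild a missing V[j] from the nearest stored entry below it.
    for _ in range(n):
        j = integerify(x, n)
        vj = v[j // gap]
        for _ in range(j % gap):
            vj = mix(vj)  # extra computation paid for the saved memory
        x = mix(xor(x, vj))
    return x


seed = hashlib.sha256(b"example block header").digest()
full = romix_full(seed, 1024)
half = romix_gap(seed, 1024, 2)
```

In this sketch `full` and `half` are identical, but `romix_gap` with `gap=2` holds 512 entries instead of 1024 and pays on average half an extra `mix` call per lookup. With Litecoin's parameters (N=1024, 128-byte blocks) the full buffer is the 128 kB mentioned above, and a gap of 2 would cut it to 64 kB.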