A GPU with a 32MB cache would then be limited to its cache bandwidth instead of the external GDDR5 bandwidth.
Sorry about the necropost here, but isn't the dataset for ZEC more than 32MB? You wrote:
Specifically, 2 million pseudo-random numbers are generated using blake2b (see
http://blake2.net/). Each of these numbers is 200 bits (25 bytes), and they are sorted
200 bits * 2 million = 400 Mbit = 50 MB, right?
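For anyone who wants to double-check the arithmetic, here is a quick sketch in Python. It only uses the figures from the quote (2 million values, 200 bits each) and assumes decimal megabytes for the 32MB cache; the constant names are my own, not from any miner code.

```python
# Back-of-the-envelope check of the dataset size from the quoted description.
NUM_VALUES = 2_000_000       # "2 million pseudo-random numbers"
BITS_PER_VALUE = 200         # "200 bits (25 bytes)"
CACHE_BYTES = 32 * 10**6     # 32MB cache, assuming decimal megabytes

total_bits = NUM_VALUES * BITS_PER_VALUE
total_bytes = total_bits // 8

print(total_bits / 1e6, "Mbit")                     # size in megabits
print(total_bytes / 1e6, "MB")                      # size in megabytes
print("fits in cache:", total_bytes <= CACHE_BYTES)
```

This counts only the raw values as described in the quote; it does not account for any packing or truncation tricks an actual solver might use.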
If I got the math wrong, or if there is some trick for halving the storage requirements, I would be happy to be corrected. Thanks!