Everything I wrote in this post is about GDDR5 independently of any particular GPU or even all of the GPUs on the market taken together. GPU memory controllers are not optimized for random/scattered reads like you find in most cryptocurrency mining PoWs. I would not be surprised if no GPU is actually able to do scattered full-bandwidth reads at the minimum 256-bit granularity allowed by the GDDR5 spec; that's just not something that's a top priority for rendering video games.

I agree.
Graphics rendering mostly operates as embarrassingly parallel convolutions on texture/framebuffer patches. Relatively random memory access (in VRAM) is much slower than on a CPU.
In addition, I find that in my OpenCL projects, random reads often degrade into serialized fetches, which stall the CU while it waits for a few threads in the wavefront to catch up with the rest.
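
To put rough numbers on that, here is a toy model in Python (not a measurement of any real GPU). It counts how many 256-bit (32-byte) segments one wavefront touches under two access patterns. The 64-work-item wavefront, the 4-byte read per work-item, and the helper names are my assumptions for illustration. Real memory controllers and caches are more complicated than this.

```python
import random

SEGMENT_BYTES = 32   # 256-bit minimum access granularity mentioned above
WAVEFRONT_SIZE = 64  # assumption: 64 work-items per wavefront
WORD_BYTES = 4       # assumption: each work-item reads one aligned 32-bit word


def segments_touched(byte_addresses):
    """Count the distinct 32-byte segments covering the given word reads."""
    return len({addr // SEGMENT_BYTES for addr in byte_addresses})


def efficiency(byte_addresses):
    """Fraction of fetched bytes that the work-items actually requested."""
    useful = len(byte_addresses) * WORD_BYTES
    fetched = segments_touched(byte_addresses) * SEGMENT_BYTES
    return useful / fetched


# Coalesced: work-item i reads word i of a contiguous buffer.
coalesced = [i * WORD_BYTES for i in range(WAVEFRONT_SIZE)]

# Scattered: each work-item reads a random aligned word in a 1 GiB buffer.
rng = random.Random(0)
scattered = [rng.randrange(0, 1 << 30, WORD_BYTES) for _ in range(WAVEFRONT_SIZE)]

print("coalesced segments:", segments_touched(coalesced))
print("scattered segments:", segments_touched(scattered))
print("coalesced efficiency:", efficiency(coalesced))
print("scattered efficiency:", efficiency(scattered))
```

In this model the coalesced pattern touches 8 segments and uses every byte it fetches, while the scattered pattern can touch up to 64 segments and throws away most of each one. If the hardware issues those segment fetches one after another, that is the serialization I see stalling the wavefront.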