I agree with Ethash being a bonus / using extra memory - but your HMC costs more
than a GPU that gives you the same bandwidth.
I've never been able to find any reliable data on memory-bandwidth utilization efficiency in GPU mining. If only 50% of the bandwidth is actually being utilized with GPU GDDR/HBM versus 90% on HMC or HBM with an FPGA, the FPGA could achieve hashrates roughly 80% higher than the GPU (0.9 / 0.5 = 1.8x), assuming memory bandwidth is the biggest limiting factor (even more if it's not). Is it even possible to measure the memory-bandwidth utilization efficiency with GPU miner code? That's the one piece of information I've been missing in my hashrate estimates for algos using these memory types.
Sure, we can say a GPU is more cost-effective, but it's not clear that it actually is.
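Rough sketch of that comparison, assuming a purely bandwidth-bound algo like Ethash with 8192 bytes of DAG reads per hash (the figure used further down the thread). The 256 GB/s nominal bandwidth is just an illustrative number, and the 50% / 90% utilization values are the hypothetical ones from above, not measurements:

Code:
# Hashrate implied by nominal bandwidth and a utilization factor, assuming the
# algorithm is purely memory-bandwidth bound (Ethash-style random DAG reads).
BYTES_PER_HASH = 64 * 128  # 8192 bytes of DAG reads per Ethash hash

def hashrate_mhs(nominal_gbs, utilization):
    """MH/s implied by a nominal bandwidth (GB/s) times a utilization factor."""
    return nominal_gbs * 1e9 * utilization / BYTES_PER_HASH / 1e6

gpu = hashrate_mhs(256, 0.50)   # hypothetical GPU: 50% of 256 GB/s
fpga = hashrate_mhs(256, 0.90)  # hypothetical FPGA + HMC/HBM: 90% of 256 GB/s
print(f"GPU ~{gpu:.1f} MH/s, FPGA ~{fpga:.1f} MH/s, "
      f"FPGA advantage ~{(fpga / gpu - 1) * 100:.0f}%")  # ~80%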
Can you experts please stop blabbering and start coding! The crypto community needs you to provide viable and usable bitstreams that early adopters can use. Whoever releases the first solution will most likely make back their development time, 6-love...
Lol, I don't code bitstreams myself

-- I've got a basic understanding, best case.
Luckily, since Ethash is perfectly randomly distributed reads from a large dataset, cache effects are almost entirely negligible. That means you can estimate memory speed almost perfectly from the hashrate.
Each hash takes 8192 bytes of memory access (64 * 128). So 30 MH/s = 228 GB/s. The 570 is advertised at 224 GB/s IO @ 1750 clock, and 30 MH/s is usually a 15% overclock, so 256 GB/s theoretical. That's 89% efficiency. The ROCm guys will confirm numbers up to 90% for Ethash. I don't know how it translates to other algorithms.
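Here's that back-of-the-envelope in Python, for anyone who wants to plug in their own numbers. Only the 8192 bytes/hash and the 30 MH/s figure come from above; the rest is unit conversion:

Code:
# Effective DAG bandwidth implied by an Ethash hashrate; with perfectly random
# reads and negligible caching, this is essentially the memory traffic.
BYTES_PER_HASH = 64 * 128  # 64 accesses of 128 bytes per hash

def effective_bandwidth(hashrate_mhs):
    """Return (GB/s, GiB/s) of DAG reads implied by a hashrate in MH/s."""
    bytes_per_sec = hashrate_mhs * 1e6 * BYTES_PER_HASH
    return bytes_per_sec / 1e9, bytes_per_sec / 2**30

gb, gib = effective_bandwidth(30)
print(f"30 MH/s -> {gb:.0f} GB/s ({gib:.0f} GiB/s)")  # ~246 GB/s (~229 GiB/s)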