RTX 3060 (non Ti) = 2,400 MKey/s (2.4 GKey/s)
RTX 3060 Ti = 2,800 MKey/s (2.8 GKey/s)
RTX 3070 = 3,100 MKey/s (3.1 GKey/s)
RTX 3090 = 5,250 MKey/s (5.25 GKey/s)
RTX 4090 = 7,750 MKey/s (7.75 GKey/s) (yes, you read that right lol)
Great work

What is your speed without power limits?
I made a few fixes and now see a small speedup.
Now RTX 3070 (PL 170 W) = 3,217 MKey/s
I only calculate DP 32 for #125.

I always cut power to 70% and adjust clocks.
Didn’t try without power limits.
I am running DP 32 as well, but trailing bits versus leading.
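
For anyone following along, here is a rough Python sketch of what "trailing versus leading" DP bits could look like. It is not taken from either poster's code: the function names are hypothetical, and it assumes the value being tested is a 256-bit x-coordinate. With DP 32, either check accepts roughly one random value in 2^32, so the choice mainly affects which bits the kernel has to inspect.

```python
DP_BITS = 32       # "DP 32" from the posts above
FIELD_BITS = 256   # assumption: x-coordinates are 256-bit values

def is_dp_trailing(x: int, dp_bits: int = DP_BITS) -> bool:
    """True if the lowest dp_bits bits of x are all zero."""
    mask = (1 << dp_bits) - 1
    return (x & mask) == 0

def is_dp_leading(x: int, dp_bits: int = DP_BITS, width: int = FIELD_BITS) -> bool:
    """True if the highest dp_bits bits of a width-bit x are all zero."""
    return (x >> (width - dp_bits)) == 0
```

A value such as `1 << 32` passes the trailing check, while any value below `1 << 224` passes the leading check.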