Is there any more gain to be made with more warps per block?
The increase from 8 to 16 made a difference of a few khash/sec, but I find that with -i0 my system is still responsive. I would expect it to be really laggy if the card were being pushed to its limits.
I would try it myself, but my coding skills are still very limited when it comes to CUDA and crypto.
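
For what it's worth, here is a rough sketch of the arithmetic behind my question. It assumes a warp is 32 threads and that a block can hold at most 1024 threads (which, as far as I know, is the limit on compute capability 2.0 and later). The constant and function names are my own, not taken from the miner's source.

```cpp
// Assumptions (not from the miner's source): 32 threads per warp and
// a 1024-thread-per-block limit, as on compute capability 2.0 and later.
constexpr int kWarpSize = 32;
constexpr int kMaxThreadsPerBlock = 1024;

// Number of threads in a block that holds the given number of warps.
constexpr int threads_per_block(int warps_per_block) {
    return warps_per_block * kWarpSize;
}

// True if a block of that many warps stays within the assumed limit.
constexpr bool fits_in_block(int warps_per_block) {
    return warps_per_block > 0 &&
           threads_per_block(warps_per_block) <= kMaxThreadsPerBlock;
}
```

Under those assumptions, threads_per_block(8) is 256 and threads_per_block(16) is 512, so there would be room to go to 32 warps per block, but not to 64. Whether that actually gives more khash/sec is exactly what I can't tell.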