Current groestlcoin performance on 3 GTX 780Ti cards (only a mild overclock):
[2014-04-11 01:54:41] accepted: 111/111 (100.00%), 29119 khash/s (yay!!!)
[2014-04-11 01:54:46] accepted: 112/112 (100.00%), 29190 khash/s (yay!!!)
[2014-04-11 01:54:50] accepted: 113/113 (100.00%), 29117 khash/s (yay!!!)
[2014-04-11 01:54:54] accepted: 114/114 (100.00%), 29186 khash/s (yay!!!)
I am sure we can do better.

Christian
I was wondering: couldn't you compile a module and have cudaminer/ccminer check whether it is present? If it is, the miner uses your custom adaptation of the algorithm; if not, it uses the sph version. Wouldn't that satisfy the GPL, since the miner would still be fully functional without the precompiled module? The example I have in mind is the GPL open-source Android kernel, which has precompiled drivers for proprietary hardware, etc. The precompiled module could then confirm that it is your software calling it. Just a thought, instead of rewriting the base code. A simple option would be to offer a precompiled version, as you do now, and have your module perform an MD5 checksum confirmation.
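To make the idea concrete, here is a rough sketch of the "use the optional module if it is there and its checksum matches, otherwise fall back to sph" logic. It is written in Python only for brevity; it is not how cudaminer/ccminer actually works. The names `choose_backend` and `md5_of_file`, and the idea of passing in a module path and an expected checksum, are assumptions made up for illustration.

```python
import hashlib
import os


def md5_of_file(path):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def choose_backend(module_path, expected_md5):
    """Pick which implementation to use (hypothetical helper).

    Returns "custom" only if the optional precompiled module exists and
    its MD5 matches the expected value; otherwise returns "sph", the
    open-source fallback that keeps the miner fully functional.
    """
    if not os.path.isfile(module_path):
        return "sph"  # module not shipped: use the sph version
    if md5_of_file(module_path) != expected_md5.lower():
        return "sph"  # module present but checksum does not match
    return "custom"
```

A caller would run `choose_backend(...)` once at startup and dispatch on the result. Note that MD5 only detects accidental or casual modification; it would not stop someone determined to swap the module.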
Assuming I understand correctly, you just need to write a small bash script that checks with lsmod whether your module is loaded, and then sends cudaminer or not.
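The lsmod check could look roughly like the sketch below, in Python rather than bash. Keep in mind that lsmod only lists kernel modules, so this assumes the module in question is loaded as one. The function names are hypothetical, as is whatever module name you pass in.

```python
import subprocess


def module_in_lsmod_output(lsmod_output, module_name):
    """Return True if `module_name` appears in the first column of lsmod output."""
    # The first line of lsmod output is the header ("Module Size Used by").
    for line in lsmod_output.splitlines()[1:]:
        fields = line.split()
        if fields and fields[0] == module_name:
            return True
    return False


def module_is_loaded(module_name):
    """Run lsmod and report whether the named kernel module is loaded."""
    try:
        result = subprocess.run(
            ["lsmod"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return False  # lsmod missing or failed: treat as "not loaded"
    return module_in_lsmod_output(result.stdout, module_name)
```

The parsing is kept separate from the call to lsmod so it can be tried on pasted output without running anything.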