1060 6GB card, 25 seconds total time, start to finish.
Start-to-finish tests are not a good way to compare two programs on a simple problem. My program spends time setting up a good grid in order to solve the harder problem.
If I remove PTX and support for cards other than compute 6.1, I gain 10 seconds or more.
Well, if you have a "good" grid, then you would finish solving faster. By grid, do you mean how the points are spread out across the range?
"Harder" problem? Don't come out with a tiny 34-bit range test and then say yours is set up for something "harder".
Secondly, your program/BitCrack takes way too long to distribute the points across the range. It doesn't support multiple GPUs, and it doesn't work with 30xx cards.
I'm waiting to see your "harder" problem... and the results.