Keep up the good work!
PS. Do you take Yacoin donations?
Yeah, you can donate to YBQ4hrUQqEb2EDip1NFwMAgZbvK8hJx5Tn
Good idea about starting a new thread for the scrypt-jane enabled cudaminer, once it is released.
I have made some changes to autotune reliability and speed. It will no longer assign fewer blocks than half the multiprocessor count of your card. For example, on a GTX 780 it now starts autotuning at 6 blocks (the card has 12 SMX).
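To make the lower bound concrete, here is a minimal sketch of the rule as described above. It is not the actual cudaminer code: the function name is hypothetical, and rounding up for cards with an odd SMX count is my assumption, since the post only gives the 12 SMX example.

```python
def min_autotune_blocks(sm_count: int) -> int:
    """Smallest block count autotune would try: half the multiprocessor count.

    Hypothetical helper. Rounding up for odd counts is an assumption.
    """
    if sm_count < 1:
        raise ValueError("sm_count must be at least 1")
    # Ceiling division without floats: (n + 1) // 2
    return (sm_count + 1) // 2


# GTX 780 with 12 SMX, as in the example above
print(min_autotune_blocks(12))
```

For the 12 SMX case this gives 6 blocks, matching the GTX 780 example.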
Also I made changes to how memory is allocated. The backoff value on Windows is currently 12% of the largest allocation it was able to make. On Linux it is a mere 2%. If I don't back off, autotune will crash pretty badly. It can still occasionally crash with launch timeouts though.
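The backoff arithmetic can be sketched roughly like this. Again, this is only an illustration and not the real implementation: the function name, the platform strings, and the integer rounding are assumptions, and only the 12% and 2% figures come from the description above.

```python
# Backoff percentages as stated above; the dictionary keys are assumed labels.
BACKOFF_PERCENT = {"windows": 12, "linux": 2}


def backed_off_allocation(largest_bytes: int, platform: str) -> int:
    """Return the allocation size after backing off from the largest
    allocation that succeeded. Hypothetical helper for illustration.
    """
    percent = BACKOFF_PERCENT[platform]
    # Integer arithmetic avoids float rounding on large byte counts.
    backoff = largest_bytes * percent // 100
    return largest_bytes - backoff


# Example: the largest successful allocation was 2000 MiB
largest = 2000 * 1024 * 1024
print(backed_off_allocation(largest, "windows") // (1024 * 1024), "MiB on Windows")
print(backed_off_allocation(largest, "linux") // (1024 * 1024), "MiB on Linux")
```

With a 2000 MiB maximum this leaves 1760 MiB on Windows and 1960 MiB on Linux.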
I find that my GTX 660 Ti is a better investment than my new GTX 780 card (3 GB each, but 7 vs 12 SMX). At -L 2 the 660 Ti totally beats my 780. Meh.
My GTX 660 Ti uses -L 2 -l K64x2 -C 1 -b 32768 -i 0 and gets 3.7 kHash/s.
Christian