I guess that's good to know, now that I'm finding out more about the card's behavior. Is there any reason the Titan can't benefit from the advances in the other kernels, though? Shouldn't it use the best kernel that isn't compiled for compute 3.5 until it's understood why the nvcc compiler seems to break 3.5?
If I don't compile for compute 3.5, you don't get to use the funnel shifter. If I add my memory optimizations into compute 3.5 code, you get a crash. I think the funnel shifter may outweigh the memory optimization benefits.
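For anyone wondering what the funnel shifter buys you: 32-bit rotates are common in hashing kernels, and on compute 3.5 a rotate can be done with the `__funnelshift_l` intrinsic instead of two shifts and an OR. Below is a minimal sketch of that idea, not the actual kernel code from this project. The names `rotl32` and `HOST_DEVICE` are made up for illustration, and whether the speedup outweighs the memory optimizations is exactly the open question above.

```cpp
#include <cstdint>

// Hypothetical helper macro: lets the same function build with nvcc
// (host + device) and with a plain C++ compiler (host only).
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Rotate x left by n bits (n is taken modulo 32).
// Illustrative name; not taken from the miner's source.
HOST_DEVICE inline uint32_t rotl32(uint32_t x, uint32_t n)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
    // Device code built for compute 3.5 or newer: the funnel shift
    // concatenates hi:lo (here x:x), shifts left by n & 31, and
    // returns the upper 32 bits, which is a rotate left.
    return __funnelshift_l(x, x, n);
#else
    // Host code and older devices: portable shift-and-OR rotate.
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
#endif
}
```

The compute 3.5 branch only exists if the file is actually compiled for that target, which is why skipping the 3.5 build also means losing the funnel shifter.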
Oh, and by the way, I have a Windows build that runs at 1/4 of the CPU load it used before.
Christian
I'll be sure to throw a few more coins your way. If I could, I'd just permanently donate a few percent of my earnings.