Did the new NVIDIA drivers and the new cudaminer give any increases?
On Keccak I seem to be getting 168 MH/s (before updating the NVIDIA drivers). Is this good on a factory 780?
I am not getting more either. I will experiment with the way the 64-bit arithmetic is done. It may be faster to use the uint2 type instead of uint64_t and to do the 64-bit arithmetic with inline PTX. I don't trust the compiler, in particular because the performance on the T kernel is actually worse than on the K kernel, despite the T kernel having more registers available and the funnel shifter feature. Also, seemingly small changes to the K kernel have a catastrophic performance impact, for example manually unrolling the first and last loop iterations and removing a few variables known to be zero or irrelevant for the result.
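To make the uint2 idea concrete, here is a rough host-side C++ sketch of a 64-bit rotate (the operation Keccak uses most heavily) done only with 32-bit halves. This is not the cudaminer code: the names U32Pair, funnel_shl, rotl64_pair, to_pair and from_pair are made up for illustration, U32Pair merely stands in for CUDA's uint2, and funnel_shl is a software model of what a funnel shift left computes (on the device, the __funnelshift_l intrinsic plays that role). Whether this actually beats uint64_t on the GPU is exactly what remains to be measured.

```cpp
#include <cstdint>

// Hypothetical stand-in for CUDA's uint2: one 64-bit value as two 32-bit halves.
struct U32Pair {
    uint32_t lo;  // bits 0..31
    uint32_t hi;  // bits 32..63
};

// Split a 64-bit value into halves, and join halves back together.
static inline U32Pair to_pair(uint64_t x) {
    return U32Pair{ static_cast<uint32_t>(x), static_cast<uint32_t>(x >> 32) };
}
static inline uint64_t from_pair(U32Pair v) {
    return (static_cast<uint64_t>(v.hi) << 32) | v.lo;
}

// Software model of a funnel shift left: the upper 32 bits of (hi:lo) << n,
// with n taken modulo 32.
static inline uint32_t funnel_shl(uint32_t lo, uint32_t hi, unsigned n) {
    n &= 31;
    return n == 0 ? hi : (hi << n) | (lo >> (32 - n));
}

// Rotate left by n (mod 64) using only 32-bit operations.
static inline U32Pair rotl64_pair(U32Pair v, unsigned n) {
    n &= 63;
    if (n >= 32) {
        // A rotation by 32 is just a swap of the two halves.
        const uint32_t t = v.lo;
        v.lo = v.hi;
        v.hi = t;
        n -= 32;
    }
    // Each output half is one funnel shift over the two input halves.
    return U32Pair{ funnel_shl(v.hi, v.lo, n), funnel_shl(v.lo, v.hi, n) };
}

// Plain 64-bit reference rotate, for comparison.
static inline uint64_t rotl64(uint64_t x, unsigned n) {
    n &= 63;
    return n == 0 ? x : (x << n) | (x >> (64 - n));
}
```

For any x and n, from_pair(rotl64_pair(to_pair(x), n)) should match rotl64(x, n). The point of the pair form is that every rotate becomes two funnel shifts, or a plain register swap when n is 32, instead of whatever sequence the compiler picks for uint64_t.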
Christian