Board: Mining (Altcoins)
Topic: Re: [ANN] cudaMiner - a new litecoin mining application [Windows/Linux]
Post by Thirtybird on 25/01/2014, 22:33:01 UTC
The fact that my scrypt-jane kernels use 4 threads per hash is actually the key to the decent performance on NVIDIA cards. It lets me get 4 times the occupancy of the shader hardware, given the tight memory constraints. That, and the lookup_gap implementation I recently added.

In contrast, the currently fastest scrypt (non-jane) kernels run 1 thread per hash: there is no overhead for inter-thread communication, and shader occupancy is not an issue given the 128 KB requirement per hash.

Because you are working on the AMD miner code, I think this information might be useful to you.

Christian
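
For reference, the 128 KB figure quoted above follows from scrypt's scratchpad size of 128 * r * N bytes with N = 1024 and r = 1. Below is a rough Python sketch of that arithmetic, including the effect of a lookup gap. The function names are made up for illustration, and the N-factor mapping N = 2^(Nfactor + 1) is my assumption about the convention scrypt-jane coins use, not something taken from the cudaMiner source.

```python
def n_from_nfactor(nfactor: int) -> int:
    # Assumed scrypt-jane convention: N = 2^(Nfactor + 1).
    return 1 << (nfactor + 1)


def scratchpad_bytes(n: int, r: int = 1, lookup_gap: int = 1) -> int:
    # scrypt keeps N entries of 128 * r bytes each.
    # With a lookup gap, only every lookup_gap-th entry is stored.
    stored_entries = -(-n // lookup_gap)  # ceiling division
    return 128 * r * stored_entries


if __name__ == "__main__":
    # Classic scrypt (N = 1024, r = 1), no lookup gap
    print(scratchpad_bytes(1024) // 1024, "KiB per hash")
    # Same parameters with lookup_gap = 2
    print(scratchpad_bytes(1024, lookup_gap=2) // 1024, "KiB per hash")
    # A higher N-factor, with and without a lookup gap
    n = n_from_nfactor(14)
    print(scratchpad_bytes(n) // 1024, "KiB per hash")
    print(scratchpad_bytes(n, lookup_gap=4) // 1024, "KiB per hash")
```

The per-hash footprint grows linearly with N, which is why the memory constraint gets much tighter at high N-factors than it is for plain scrypt.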

Interesting to know. Those settings are already available as command-line options, and I've done a lot of testing with different thread counts and lookup gaps. The extra threads don't really add performance on the AMD cards when doing scrypt-chacha at high N-factors. In fact, allocating a single thread with twice the buffer size and running at a higher intensity sometimes yields better results, because there are fewer in-memory collisions. If I were running a higher-end AMD card with a high shader count and the memory to go with it, that might absolutely be the way to go. The R7 250 has only 384 shaders, and the 240 only 320.
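
For anyone following along who hasn't looked at what the lookup gap actually trades off, here is a toy Python sketch. It is not the miner's kernel code: `mix` is a SHA-256 stand-in for the real Salsa20/8 or ChaCha block mixing, and the names `mix` and `romix` are just for illustration. It only shows the idea of storing every lookup_gap-th scratchpad entry and recomputing the skipped ones on demand.

```python
import hashlib


def mix(block: bytes) -> bytes:
    # Stand-in for scrypt's block mixing (NOT the real Salsa20/8 or ChaCha).
    return hashlib.sha256(block).digest()


def romix(seed: bytes, n: int, lookup_gap: int = 1) -> bytes:
    x = mix(seed)

    # Phase 1: walk the chain of n entries, keeping only every
    # lookup_gap-th one (memory use shrinks by roughly that factor).
    stored = []
    for i in range(n):
        if i % lookup_gap == 0:
            stored.append(x)
        x = mix(x)

    # Phase 2: n data-dependent lookups into the (sparse) scratchpad.
    for _ in range(n):
        j = int.from_bytes(x[:4], "little") % n
        v = stored[j // lookup_gap]
        # Recompute the skipped entries between the stored one and entry j.
        for _ in range(j % lookup_gap):
            v = mix(v)
        x = mix(bytes(a ^ b for a, b in zip(x, v)))
    return x


if __name__ == "__main__":
    full = romix(b"block header", 64, lookup_gap=1)
    half = romix(b"block header", 64, lookup_gap=2)
    print(full == half)
```

The result is identical for any gap; what changes is memory per hash (smaller) versus extra mixing work per lookup (larger), which is the balance being tuned against thread count and buffer size.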