but I had tweaked the way DPs were found.
Can we have your code?
+1 on this. Can we at least see the logic? If you do not want to share the full work you did to change that, perhaps a code snippet?
And did you reach that speed without any other change? Going from 2500 Mk/s to almost 7750 Mk/s is something.
Also, I do not see GPUGroup.h used in any class of JLP's Kangaroo. Is it generated separately, or was it migrated from VanitySearch and kept in the project?
Why use only 128 for GPU_GRP_SIZE, when in KeyHuntCuda it was 2048?
What's the relation between these constants?
// Number of random jumps
// Max 512 for the GPU
#define NB_JUMP 32
// GPU group size
#define GPU_GRP_SIZE 128
// GPU number of run per kernel call
#define NB_RUN 64
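
Not an authoritative answer, just a sketch of how I read these three constants. It assumes that each GPU thread owns GPU_GRP_SIZE kangaroos, that every kangaroo makes NB_RUN jumps per kernel call, and that NB_JUMP is only the size of the jump table. LaunchConfig, kangaroosOnGpu and jumpsPerKernelCall are hypothetical names made up for the illustration, not taken from the project.

```cpp
#include <cassert>
#include <cstdint>

// Constants exactly as quoted above.
#define NB_JUMP 32
#define GPU_GRP_SIZE 128
#define NB_RUN 64

// Hypothetical launch configuration (illustration only, not from the source).
struct LaunchConfig {
  uint64_t gridSize;         // assumed number of thread blocks
  uint64_t threadsPerBlock;  // assumed number of threads per block
};

// Assumption: every GPU thread handles GPU_GRP_SIZE kangaroos.
constexpr uint64_t kangaroosOnGpu(LaunchConfig c) {
  return c.gridSize * c.threadsPerBlock * GPU_GRP_SIZE;
}

// Assumption: every kangaroo makes NB_RUN jumps in one kernel call.
constexpr uint64_t jumpsPerKernelCall(LaunchConfig c) {
  return kangaroosOnGpu(c) * NB_RUN;
}

// The quoted comment caps the jump table at 512 entries on the GPU.
inline void checkJumpTable() { assert(NB_JUMP <= 512); }
```

If that reading is right, a hypothetical grid of 136 blocks with 128 threads each would hold 136 * 128 * 128 = 2228224 kangaroos, and going from 128 to 2048 in GPU_GRP_SIZE would multiply that herd by 16 without changing the jump table at all.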