All of JLP's programs are already heavily optimized, and if it were possible to improve the result, the author would definitely have done it.
To "double the speed" there is only one option.
That is to use not only the addition of the kangaroo jump, but also the subtraction.
This does not require many resources, but it doubles the number of tested points and thus the number of distinguished points.
But it also slows down the progress of all kangaroos.
Maybe this helps, or maybe it just overloads the hash table with extra distinguished points.
With this modification you also need to turn off the check for dead kangaroos, because a kangaroo landing on the same position that was left
after the subtraction does not mean that the kangaroo is following the trail of another kangaroo.
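To illustrate why the subtraction is cheap, here is a minimal Python sketch. It is not JLP's code; the function names (add, mul, add_and_sub) are made up for this example. It assumes affine coordinates on secp256k1 and relies on the fact that -J has the same x as J, so P + J and P - J need the same modular inverse of (xJ - xP). The expensive inversion is done once and both points come out of it.

```python
# Toy sketch (hypothetical helper names, not taken from the Kangaroo source).
P_FIELD = 2**256 - 2**32 - 977  # secp256k1 field prime
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Generic affine add/double; ignores the point at infinity."""
    (x1, y1), (x2, y2) = a, b
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P_FIELD, -1, P_FIELD) % P_FIELD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_FIELD, -1, P_FIELD) % P_FIELD
    x3 = (lam * lam - x1 - x2) % P_FIELD
    return x3, (lam * (x1 - x3) - y1) % P_FIELD

def mul(k, pt):
    """Naive double-and-add for small k >= 1 (only used for checking)."""
    acc = None
    while k:
        if k & 1:
            acc = pt if acc is None else add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def add_and_sub(p, j):
    """Return (p + j, p - j) using a single inversion of (xj - xp)."""
    (xp, yp), (xj, yj) = p, j
    inv = pow((xj - xp) % P_FIELD, -1, P_FIELD)  # shared by both results
    out = []
    for y in (yj, -yj):  # +J is (xj, yj), -J is (xj, -yj)
        lam = (y - yp) * inv % P_FIELD
        x3 = (lam * lam - xp - xj) % P_FIELD
        out.append((x3, (lam * (xp - x3) - yp) % P_FIELD))
    return out[0], out[1]

# Check against the generic routines: 5G + 2G = 7G and 5G - 2G = 3G.
plus, minus = add_and_sub(mul(5, G), mul(2, G))
assert plus == mul(7, G)
assert minus == mul(3, G)
```

Both candidate points cost one inversion plus a few multiplications, which matches the "not many resources" remark above. Whether the extra points actually speed up the search is a separate question.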
But as I said above, the author would have done it if it had worked.
I agree with you! I just want to know what speed Kangaroo reaches on a 4090 at stock settings, without a power limit. Maybe I am doing something wrong, but my speed on the 4090 is close to Tesla V100 32GB speeds, around 1420 MKey/s.