With the latest PTX optimisation (I had forgotten about the symmetry in batch point addition),
it solves the 16 pubkeys from JLP in 58 s:
...
GPU#0 Cnt:0000000000000000000000000000000000000000000000000000000000000001
GPU#0 Cnt:0000000000000000000000000000000000000000000000004673f00000000001 1121MKey/s x1073741824 2^30.13 x2^31=2^61.13
***********GPU#0************
KEY!!>49dccfd96dc5df56487436f5a1b18c4f5d34f65ddb48cb5e7ad38337c7f173c7
Pub: 55b95bef84a6045a505d015ef15e136e0a31cc2aa00fa4bca62e5df215ee981b3b4d6bce33718dc6cf59f28b550648d7e8b2796ac36f25ff0c01f8bc42a16fd9
****************************
Found in 4 seconds
GPU#0 job finished
Working time 00:00:58s
Total time 00:06:33s
GPU#0 thread finished
cuda finished ok
Press Enter to exit
This seems to be the maximum I can achieve on a single 2080 Ti.
Of course, JLP would probably have done it even faster.
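
For readers wondering what the symmetry in batch point addition refers to, here is a minimal Python sketch. It is not the PTX kernel from this post, only an illustration of the general idea, assuming affine coordinates on secp256k1: P+Q and P-Q use the same denominator (x2 - x1), so one modular inversion yields both points. The helper names (add, double, add_and_sub) are made up for this example.

```python
# secp256k1 field prime and generator (standard curve parameters)
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Reference affine addition of two points with different x coordinates."""
    (x1, y1), (x2, y2) = a, b
    lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P

def double(a):
    """Reference affine doubling (curve coefficient a = 0)."""
    x1, y1 = a
    lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    x3 = (lam * lam - 2 * x1) % P
    return x3, (lam * (x1 - x3) - y1) % P

def add_and_sub(a, b):
    """Return (a + b, a - b) while paying for only one modular inversion."""
    (x1, y1), (x2, y2) = a, b
    inv = pow((x2 - x1) % P, -1, P)   # shared by both results
    results = []
    for y in (y2, P - y2):            # first b, then -b = (x2, -y2)
        lam = (y - y1) * inv % P
        x3 = (lam * lam - x1 - x2) % P
        results.append((x3, (lam * (x1 - x3) - y1) % P))
    return tuple(results)

# Example: from 3G and 2G, get 5G and G with a single inversion.
G2 = double(G)
G3 = add(G2, G)
total, diff = add_and_sub(G3, G2)
print(diff == G, total == add(G3, G2))
```

In a batched kernel the inversions themselves are usually grouped as well, so the saving shows up as roughly twice as many points per inverted denominator; how much that helps in practice depends on the kernel.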
