Currently I get about 12.8 GKeys/s on a 4090. The 5090 is a shame; I'll skip it and wait for the next generation.
Perhaps I will make all my sources public when #135 is solved, though I'm not sure. People are not interested in what I do, and I see zero good discussions about EC on this forum, so I would rather spend my time on more interesting things.

Yes, there are surely many people intrigued by your code; it's just that not all of us have thousands of dollars to experiment with or to buy a high-end PC. What's more unfortunate is that those who do have the means offer nothing but theories backed by zero code, which makes for vague and empty arguments. I admit that I plan to add the different kangaroo methods to your final version of Rckangaroo, to verify whether SOTA is the main factor or whether it is the optimization of the CUDA code.