Board: Development & Technical Discussion
Topic: Re: Solving ECDLP with Kangaroos: Part 1 + 2 + RCKangaroo
by kTimesG on 10/02/2025, 09:23:41 UTC
Hypothetical scenario: an RTX 5090 can do at least 13.0 G jumps/s at DP 32. Are there plans to improve RCKangaroo or is 9.3 Gk/s still a "very good" speed, compared to an optimized version?
In general, I'm not interested in further support/improvements of RCKangaroo, but since it's open-source, someone else can do it.

Thank you for taking time to respond!

While I do admire that you had the skills and resources to break three ECDLP problems in a row, judging by your expertise you know very well that everything is a tradeoff when it comes to programming. I still stand by all my previous comments regarding this: cycle handling slows down the jumps. Another way to view this: even a very fast, optimized cycle-handling kernel such as yours (much faster than any JLP reference fork) can be made to run faster if we trade the resources spent on cycle handling for more jumps. The question at the end of the day is: from what point on is it worth having a low "K" with slow jumps rather than a higher "K" with faster jumps? And yes, I did manage to reach 13.4 G/s on an RTX 5090 without even compiling natively for compute capability 12.0, so the question is even more interesting now.
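To make the tradeoff concrete, here is a minimal sketch under the usual assumption that the expected work is about K * sqrt(range size) jumps, so the expected time is that figure divided by the jump rate. The function names are hypothetical, DP overhead is ignored, and the K value of 1.15 in the usage note is only an illustrative placeholder, not a measured figure for any particular kernel. The two speeds are the ones mentioned in this thread.

```python
import math


def expected_seconds(k, range_bits, jumps_per_second):
    """Rough expected solve time for an interval of 2**range_bits keys.

    Assumes expected work = k * sqrt(2**range_bits) jumps and ignores
    distinguished-point overhead.
    """
    expected_jumps = k * math.sqrt(2.0 ** range_bits)
    return expected_jumps / jumps_per_second


def break_even_k(k_slow, speed_slow, speed_fast):
    """K at which the faster kernel takes exactly as long as the slower one.

    The faster kernel wins whenever its own K is below this value.
    """
    return k_slow * speed_fast / speed_slow


if __name__ == "__main__":
    # Hypothetical K for the slower, cycle-handling kernel (illustrative only).
    k_slow = 1.15
    limit = break_even_k(k_slow, 9.3e9, 13.4e9)
    print(f"faster kernel wins while its K stays below {limit:.3f}")
```

With these inputs, a kernel running at 13.4 G/s could tolerate a K up to roughly 1.44 times larger than the 9.3 G/s kernel's K before it stops being the better choice, since the ratio depends only on the two speeds.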

I was more interested in one of your older replies, about an optimized version not even being twice as fast as RCKang. That hinted that maybe you somehow managed to reach 14 G/s on an RTX 4090, which would have been fascinating, considering that the public version can't reach 8 G/s.