Board Bitcoin Discussion
Re: Bitcoin puzzle transaction ~32 BTC prize to who solves it
by FrozenThroneGuy on 04/09/2025, 05:06:36 UTC
Hello, guys!
Ultra-lightweight CUDACyclone is ready; speed is 1.3 Gkeys/s on an RTX 4060.
Key feature: extremely low VRAM usage, which suits rented GPUs. Less than 500 MB of VRAM on an RTX 4090.
It will work even if Vanity or Keyhunt doesn't start.
It is also a good study sample for your education (why not?). 7 small files in total.
Link: https://github.com/Dookoo2/CUDACyclone

Great work, thanks for sharing!
Do you have any idea why there's such a big performance difference between the old and new versions? For example, the old one hits ~1048.8 Mkeys/s and finds the key in 9 s using only 512 MB of VRAM, while the new one runs at ~933.5 Mkeys/s and takes 71 s using 3 GB of VRAM on the same RTX 3060.

Code:
./CUDACyclone_old --range 2000000000:3FFFFFFFFF --address 1HBtApAFA9B2YZw3G2YKSMCtb3dVnjuNe2 --grid 256,512
======== PrePhase: GPU Information ====================
Device               : NVIDIA GeForce RTX 3060 (compute 8.6)
SM                   : 28
ThreadsPerBlock      : 256
Blocks               : 8192
Points batch size    : 256
Batches/SM           : 512
Memory utilization   : 4.3% (512.2 MB / 11.6 GB)
-------------------------------------------------------
Total threads        : 2097152

======== Phase-1: Brooteforce =========================
Time: 9.0 s | Speed: 1048.8 Mkeys/s | Count: 8897329760 | Progress: 6.47 %

======== FOUND MATCH! =================================
Private Key   : 00000000000000000000000000000000000000000000000000000022382FACD0
Public Key    : 03C060E1E3771CBECCB38E119C2414702F3F5181A89652538851D2E3886BDD70C6

Code:
./CUDACyclone --range 2000000000:3FFFFFFFFF --address 1HBtApAFA9B2YZw3G2YKSMCtb3dVnjuNe2 --grid 256,512
======== PrePhase: GPU Information ====================
Device               : NVIDIA GeForce RTX 3060 (compute 8.6)
SM                   : 28
ThreadsPerBlock      : 256
Blocks               : 8192
Points batch size    : 256
Batches/SM           : 512
Batches/launch       : 64 (per thread)
Memory utilization   : 26.5% (3.08 GB / 11.6 GB)
-------------------------------------------------------
Total threads        : 2097152

======== Phase-1: BruteForce (sliced) =================
Time: 71.2 s | Speed: 933.5 Mkeys/s | Count: 70322919104 | Progress: 51.17 %%

================================= FOUND MATCH! =================================
Private Key   : 00000000000000000000000000000000000000000000000000000022382FACD0
Public Key    : 03C060E1E3771CBECCB38E119C2414702F3F5181A89652538851D2E3886BDD70C6
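
A quick sanity check of the figures in the two logs above, as a minimal sketch: it recomputes the range size, the thread count, the printed progress values, and the average rate (Count divided by Time) for each run. All constants are copied from the logs; the helper names are made up for illustration and are not part of CUDACyclone, and the times are rounded as printed, so the averages are only approximate.

```python
# Sanity check of the numbers printed in the two logs above.
# Constants come straight from the logs; helper names are illustrative only.

RANGE_START = 0x2000000000
RANGE_END = 0x3FFFFFFFFF
FOUND_KEY = 0x22382FACD0


def range_size(start: int, end: int) -> int:
    """Number of keys in the inclusive range [start, end]."""
    return end - start + 1


def progress_percent(count: int, total: int) -> float:
    """Share of the range covered after `count` keys, in percent."""
    return 100.0 * count / total


def average_mkeys_per_s(count: int, seconds: float) -> float:
    """Average rate over the whole run, in Mkeys/s."""
    return count / seconds / 1e6


total_keys = range_size(RANGE_START, RANGE_END)   # 2**37 keys
total_threads = 256 * 8192                        # ThreadsPerBlock * Blocks

# (Count, Time in seconds) as printed by each run.
old_count, old_time = 8897329760, 9.0
new_count, new_time = 70322919104, 71.2

old_progress = progress_percent(old_count, total_keys)
new_progress = progress_percent(new_count, total_keys)
old_avg = average_mkeys_per_s(old_count, old_time)
new_avg = average_mkeys_per_s(new_count, new_time)

# Where the found key sits if the range were walked strictly from the start.
key_offset = FOUND_KEY - RANGE_START
key_offset_percent = progress_percent(key_offset, total_keys)

print(f"Range size      : {total_keys}")
print(f"Total threads   : {total_threads}")
print(f"Old progress    : {old_progress:.2f} %  avg {old_avg:.1f} Mkeys/s")
print(f"New progress    : {new_progress:.2f} %  avg {new_avg:.1f} Mkeys/s")
print(f"Key offset      : {key_offset_percent:.2f} % of the range")
```

By this arithmetic both runs average roughly the same rate over the whole run; what differs is how much of the range had been counted when the key turned up. That would be consistent with the "(sliced)" phase visiting keys in a different order rather than being much slower, but that is only a guess from the logs, not something confirmed from the source.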


Interesting that the new version is FASTER on a top-tier GPU like the 5090. I have seen 9 Gkeys/s on a 5090 with --grid 1024,512. But another version of the 5090 is slower, around 8.2-8.3 Gkeys/s. It depends on the power consumption limit.
But I will check the speed difference between the versions. On the 4060 the speed is the same in both versions.