Hello, guys!
The ultra-lightweight CUDACyclone is ready. Speed is 1.3 Gkeys/s on an RTX 4060.
Key feature: extremely low VRAM usage, which suits rented GPUs. It needs less than 500 MB of VRAM on an RTX 4090.
It will work even if Vanity or Keyhunt won't start.
It is also a good study sample for your own education (why not?). 7 small files in total.
Link:
https://github.com/Dookoo2/CUDACyclone
Great work, thanks for sharing!
Do you have any idea why there’s such a big performance difference between the old and new versions? For example, the old one hits ~1048.8 Mkeys/s in 9s using only 512MB VRAM, while the new one runs ~933.5 Mkeys/s in 71s using 3GB VRAM on the same RTX 3060.
./CUDACyclone_old --range 2000000000:3FFFFFFFFF --address 1HBtApAFA9B2YZw3G2YKSMCtb3dVnjuNe2 --grid 256,512
======== PrePhase: GPU Information ====================
Device : NVIDIA GeForce RTX 3060 (compute 8.6)
SM : 28
ThreadsPerBlock : 256
Blocks : 8192
Points batch size : 256
Batches/SM : 512
Memory utilization : 4.3% (512.2 MB / 11.6 GB)
-------------------------------------------------------
Total threads : 2097152
======== Phase-1: Brooteforce =========================
Time: 9.0 s | Speed: 1048.8 Mkeys/s | Count: 8897329760 | Progress: 6.47 %
======== FOUND MATCH! =================================
Private Key : 00000000000000000000000000000000000000000000000000000022382FACD0
Public Key : 03C060E1E3771CBECCB38E119C2414702F3F5181A89652538851D2E3886BDD70C6
./CUDACyclone --range 2000000000:3FFFFFFFFF --address 1HBtApAFA9B2YZw3G2YKSMCtb3dVnjuNe2 --grid 256,512
======== PrePhase: GPU Information ====================
Device : NVIDIA GeForce RTX 3060 (compute 8.6)
SM : 28
ThreadsPerBlock : 256
Blocks : 8192
Points batch size : 256
Batches/SM : 512
Batches/launch : 64 (per thread)
Memory utilization : 26.5% (3.08 GB / 11.6 GB)
-------------------------------------------------------
Total threads : 2097152
======== Phase-1: BruteForce (sliced) =================
Time: 71.2 s | Speed: 933.5 Mkeys/s | Count: 70322919104 | Progress: 51.17 %%
================================= FOUND MATCH! =================================
Private Key : 00000000000000000000000000000000000000000000000000000022382FACD0
Public Key : 03C060E1E3771CBECCB38E119C2414702F3F5181A89652538851D2E3886BDD70C6
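As a rough sanity check on the two logs above, the average throughput can be recomputed from the printed Count and Time values. This is only a sketch: the numbers are copied from the logs, the helper names are made up for illustration, and it assumes the --range end is inclusive and that the printed Speed may be a momentary reading rather than an average.

```python
# Figures copied from the two logs above (RTX 3060, --grid 256,512).
RANGE_START = 0x2000000000
RANGE_END = 0x3FFFFFFFFF  # assumed inclusive

def average_mkeys(count, seconds):
    """Average speed in Mkeys/s over the whole run."""
    return count / seconds / 1e6

def percent_scanned(count, start=RANGE_START, end=RANGE_END):
    """Share of the range covered when the log line was printed."""
    return 100.0 * count / (end - start + 1)

# (Count, Time in seconds) from the last status line of each run.
runs = {
    "old": (8897329760, 9.0),
    "new": (70322919104, 71.2),
}

for name, (count, seconds) in runs.items():
    print(f"{name}: {average_mkeys(count, seconds):.1f} Mkeys/s average, "
          f"{percent_scanned(count):.2f} % of range scanned")
```

By this measure both runs average about 988 Mkeys/s (the Time values are rounded to 0.1 s, so the last digit is not reliable). Most of the 9 s vs 71 s gap comes from how much of the range was scanned before the match, 6.47 % vs 51.17 %, which agrees with the Progress values in the logs. The logs do not show why the new version reaches the key later or why it uses more VRAM.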
Interestingly, the new version is FASTER on a top-tier GPU like the 5090. I have seen 9 Gkeys/s on a 5090 with --grid 1024,512. Another 5090 model is slower, though, at around 8.2-8.3 Gkeys/s. It depends on the power consumption limit.
I will check the speed difference between the versions anyway. On the 4060 the speed is the same in both versions.