Search Posts

Post

Topic

Board Bitcoin Discussion

Re: == Bitcoin challenge transaction: ~1000 BTC total bounty to solvers! ==UPDATED==

678AFDB0

on 13/08/2023, 10:57:40 UTC

Quote from: momofukku on August 11, 2023, 12:36:18 AM

Quote from: 678AFDB0 on August 10, 2023, 08:08:01 PM

Hey,

Thanks for the video, really informative!

Quote from: momofukku on August 10, 2023, 10:00:51 AM

In the video I use an Nvidia H100 and it floats around 3500 mk/s

Is that card really that slow ? For 35k USD i expected much more.

Quote from: momofukku on August 10, 2023, 10:00:51 AM

Without the proper mathematics all the computer power in the world is just wasted energy, so if you're gonna shell out the money to run clusters make sure your math is on point =D

Also one other thing not mentioned often: when you run large clusters, you need to make sure no mistakes can happen, so you either underclock/undervolt the chips/GPUs/FPGAs/whatever, or repeat chunks, etc. Imagine running a whole floor of machines for a few months and one machine craps out on the chunk with the winning key ;(

You're welcome =D I hope it can help get more people to test the integrity of the blockchain & bring a bit of awareness to how it functions.

I think there are a number of huge optimizations that could be made to increase the mk/s for the H100's but it would require some changes to the .cu files & headers & whatnot which is over my head haha. I'm definitely gonna play around anyways. All of that bandwidth & memory is not being leveraged properly most dolphinately.

So many questions came to my mind when you brought up failures when running clusters. Would you be able to use the save work function to recover? I wonder what types of redundancies could be put in place, like the underclocking/undervolting you mentioned. From what I was just reading, when they trained the BLOOM 176B LLM they had some failure issues too. I was just reading their paper on the model so I'll copy-paste what they were saying:

"During training, we faced issues with hardware failures: on average, 1–2 GPU failures occurred each week. As backup nodes were available and automatically used, and checkpoints were saved every three hours, this did not affect training throughput significantly."

They were using 384 NVIDIA A100 80GB GPUs (48 nodes) with 32 spare GPUs for about 3 & a half months! Trained on nuclear energy which is awesome too.

Really crazy what they did with AI and those top-end video cards! Still, not much has changed in the last 15 years: the ATI 5970 had 3200 cores at 800 MHz, and today's top card has only 5-6 times that, with double the frequency, at 20 times the cost. OK, it has a crazy amount of RAM and bandwidth, and I am sure that is important for harvesting people's private data; too bad it is not much use for us. Some 20 years ago, when trying to brute-force a TEA key (only 8 bytes), I honestly believed that in the future we would be able to flip 64 bits in a second with ease.
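To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the ~3500 Mk/s H100 figure quoted above means 3.5 billion keys per second, and that testing one key is comparable to one "flip"; both are simplifications, and the function name is only for illustration.

```python
# Rough estimate: how long does it take to walk an n-bit space at a fixed rate?
# Assumption: 3500 Mk/s = 3.5e9 candidates per second on a single card.
RATE_KEYS_PER_S = 3500 * 10**6
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(bits: int, rate: float = RATE_KEYS_PER_S) -> float:
    """Years needed to try every value of a `bits`-bit space at `rate` per second."""
    return 2**bits / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    for bits in (40, 56, 64):
        print(f"2^{bits}: {years_to_exhaust(bits):.3g} years")
```

At that rate a full 64-bit space works out to roughly 167 years on a single card, so "in a second" is still several billion times out of reach.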

The biggest cluster I have personally run was only 20 video cards (10 machines); a friend ran something like 1500 cards, and that was a nightmare to maintain.

As for scaling up, the only option is to have some kind of server/arbiter/job manager, written in a higher-level language, that splits the keyspace into very small chunks, serves them to the nodes for crunching, and then verifies the results rather than trusting them. Hardware tricks are simply not enough if you'll be drawing megawatts from the power grid and paying people to maintain your racks.
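A minimal sketch of that idea, under stated assumptions: the function names, the round-robin assignment, and the notion that each node sends back some comparable report for a chunk (a digest of its work, for example) are all hypothetical, not taken from any existing tool.

```python
from typing import Iterator, List, Sequence, Tuple

Chunk = Tuple[int, int]  # half-open range [lo, hi)


def split_keyspace(start: int, end: int, chunk_size: int) -> Iterator[Chunk]:
    """Yield half-open [lo, hi) chunks that together cover [start, end)."""
    lo = start
    while lo < end:
        hi = min(lo + chunk_size, end)
        yield lo, hi
        lo = hi


def assign_redundant(chunks: Sequence[Chunk], nodes: Sequence[str],
                     copies: int = 2) -> List[Tuple[Chunk, List[str]]]:
    """Give every chunk to `copies` distinct nodes, round-robin."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    plan = []
    for i, chunk in enumerate(chunks):
        owners = [nodes[(i + j) % len(nodes)] for j in range(copies)]
        plan.append((chunk, owners))
    return plan


def accept_chunk(reports: Sequence[str]) -> bool:
    """Trust a chunk only if at least two nodes reported the same result."""
    return len(reports) >= 2 and all(r == reports[0] for r in reports)
```

Small chunks keep the cost of redoing a disputed range low; the price of this kind of redundancy is that every chunk is crunched at least twice.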

Post

Topic

Board Bitcoin Discussion

Re: == Bitcoin challenge transaction: ~1000 BTC total bounty to solvers! ==UPDATED==

678AFDB0

on 10/08/2023, 20:08:01 UTC

Hey,

Thanks for the video, really informative!

Quote from: momofukku on Today at 10:00:51 AM

In the video I use an Nvidia H100 and it floats around 3500 mk/s

Is that card really that slow ? For 35k USD i expected much more.

Quote from: momofukku on Today at 10:00:51 AM

Without the proper mathematics all the computer power in the world is just wasted energy, so if you're gonna shell out the money to run clusters make sure your math is on point =D

Post

Topic

Board Bitcoin Discussion

Re: == Bitcoin challenge transaction: ~1000 BTC total bounty to solvers! ==UPDATED==

678AFDB0

on 30/07/2023, 10:07:17 UTC

Quote from: nomachine on Today at 06:59:46 AM

When we look at the differences, we can observe that they are roughly consistent, hovering around 0.4 to 0.6. :D

hmm, who knew that random generator would be governed by the laws of normal distribution.

https://en.wikipedia.org/wiki/Random_variable

Post

Topic

Board Bitcoin Discussion

Re: == Bitcoin challenge transaction: ~1000 BTC total bounty to solvers! ==UPDATED==

678AFDB0

on 11/07/2023, 20:01:05 UTC

Quote from: rosengold on July 10, 2023, 01:35:11 PM

EDIT: These guys don't need money, the same team solved puzzle #120 and the money of both remains untouched at 3Emiwzxme7Mrj4d89uqohXNncnRM15YESs

:P

It makes sense. If @Etar is right in his calculation, that is at least 6 x 42U racks that draw some 80-90 kW. Probably somebody that already has some kind of business (industrial warehouse, proper power supply, etc.) that makes good money, so all the expenses from the puzzle operation could be offset against company income and reduce taxes. No need to even touch the puzzle money.

BTW, congrats on the win! Well done.

Post

Topic

Board Bitcoin Discussion

Re: == Bitcoin challenge transaction: ~100 BTC total bounty to solvers! ==UPDATED==

678AFDB0

on 17/04/2023, 16:53:35 UTC

I think the real question is who withdrew 2600 USD from key 0x01 in under six minutes ;)

Post

Topic

Board Development & Technical Discussion

Re: What algorithm is this ?

678AFDB0

on 08/02/2023, 09:55:10 UTC

Quote from: ymgve2 on February 06, 2023, 04:41:56 PM

The way you parallelize these kinds of searches is to work with several different private key starting points in parallel, not trying to do several things at once at the low level. A fast FPGA implementation would have several of these curve adders alongside each other, each set up to work on different inputs.

Hello,

Thanks for all the tips! Yes, I also think this is the way. I am currently running 9 in parallel per chip and need a huge heatsink, even at a slow 50 MHz clock.

Post

Topic

Board Development & Technical Discussion

Re: What algorithm is this ?

678AFDB0

on 06/02/2023, 08:53:22 UTC

Quote from: BlackHatCoiner on February 05, 2023, 05:18:38 PM

Quote from: 678AFDB0 on February 05, 2023, 01:18:51 PM

Is generating public key x point from private key more costly than the euclidean inversion used in this type of point addition ?

Yes. Elliptic curve multiplication is generally considered more computationally expensive than any point addition, because it involves lots of double-and-add operations.

Quote from: 678AFDB0 on February 05, 2023, 01:18:51 PM

It seems to take absolutely forever compared to the multiplication.

Unless I misunderstood the question, neither the former nor the latter are very complex algorithmically.

Hey,

Thanks for your input! My point was that it seems to me that the inversion is some type of search algorithm and is well suited for a CPU, where the instructions are executed in serial fashion, and maybe can't be executed in parallel to take full advantage of the FPGA architecture.

In the diagram i have posted, i don't see much room for changing the algorithm to execute in parallel, except maybe running the inversion logic in parallel with the last multiplication, but that gives only ~10% efficiency boost due to inversion being much slower than multiplication.

Is the inversion part of the scalar multiplication for generating x from k as well?

According to our dear AI friend, this is what the process is:

Code:

// Initial point (x1,y1) and scalar value (k)
reg [255:0] x1, y1, k;

// Resulting point (x2,y2)
reg [255:0] x2, y2;

// Loop through each bit in the scalar value
for (int i = 0; i < 256; i++)
{
    // If the current bit in the scalar is 1, add the initial point to the result
    if (k[i])
    {
        // Use the secp256k1 elliptic curve equation to calculate the new x and y coordinates
        x2 = (x1^2 * 3 + 7) % p;
        y2 = (y1^2) % p;
    }

    // Double the initial point
    x1 = (x1^2) % p;
    y1 = (y1^2 * 2) % p;
}

So which part is the inversion ? '% p' ?

Sorry for the lame questions.
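A minimal Python sketch of textbook double-and-add on secp256k1 in affine coordinates may help here. It is not the code from the repo or the snippet above; the helper names (`inv`, `add`, `mul`) are made up for illustration, and the inverse is computed with Fermat's little theorem for brevity rather than with a Euclidean-style algorithm. In this form `% P` is only modular reduction; the inversion is the modular division hidden in the slope `s`, and it happens once per addition or doubling.

```python
# secp256k1 domain parameters (public constants): y^2 = x^3 + 7 over GF(P)
P = 2**256 - 2**32 - 977
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
G = (GX, GY)


def inv(a):
    """Modular inverse of a mod P (Fermat; an illustrative shortcut)."""
    return pow(a % P, P - 2, P)


def add(p1, p2):
    """Add two affine points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # P + (-P) = infinity
    if p1 == p2:
        s = 3 * x1 * x1 * inv(2 * y1) % P      # doubling: tangent slope
    else:
        s = (y2 - y1) * inv(x2 - x1) % P       # addition: chord slope
    x3 = (s * s - x1 - x2) % P
    y3 = (s * (x1 - x3) - y1) % P
    return (x3, y3)


def mul(k, pt=G):
    """Scalar multiplication k*pt by double-and-add, least significant bit first."""
    result = None
    addend = pt
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result
```

For example, `mul(k)[0]` gives the x coordinate of the public key for a private key `k`, at the cost of one inversion for every doubling and every addition in this affine form.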

Post

Topic

Board Development & Technical Discussion

Re: What algorithm is this ?

678AFDB0

on 05/02/2023, 13:18:51 UTC

Quote from: ymgve2 on February 02, 2023, 04:18:52 PM

It seems to be plain standard point addition between a point and the base point (basically the same as doing +1 to the private key): https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication#Point_addition

And most vanity generators do work in the same way as this - they take a random private key, generate the public key (a costly operation), then continuously add the base point to the public key (a cheap operation) and hash the result.

Thank you!

Is generating the public key x point from the private key more costly than the Euclidean inversion used in this type of point addition? The inversion seems to take absolutely forever compared to the multiplication.

There doesn't seem to be a way to pipeline the operation, as it is more or less a serial state machine (the next inversion is needed for each new point). With private key to x and lots of logic, at least high throughput could be achieved.
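The cost difference ymgve2 describes can be sketched by counting modular inversions in a toy affine implementation. This is an illustration under assumptions, not the repo's method: the names (`COUNT`, `inv`, `add`, `mul`, `walk`) are hypothetical, and real implementations often use projective coordinates to defer inversions.

```python
# Toy secp256k1 arithmetic that counts modular inversions.
P = 2**256 - 2**32 - 977
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
G = (GX, GY)
COUNT = {"inv": 0}


def inv(a):
    """Modular inverse mod P; every call is counted."""
    COUNT["inv"] += 1
    return pow(a % P, P - 2, P)


def add(p1, p2):
    """Affine point addition; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        s = 3 * x1 * x1 * inv(2 * y1) % P
    else:
        s = (y2 - y1) * inv(x2 - x1) % P
    x3 = (s * s - x1 - x2) % P
    return (x3, (s * (x1 - x3) - y1) % P)


def mul(k, pt=G):
    """Double-and-add scalar multiplication (the costly step)."""
    result, addend = None, pt
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


def walk(k, steps):
    """Return [(k+1)G, ..., (k+steps)G]: one scalar multiplication, then only +G."""
    q = mul(k)
    out = []
    for _ in range(steps):
        q = add(q, G)  # one inversion per new public key
        out.append(q)
    return out
```

In this sketch the initial `mul(k)` costs one inversion per doubling and addition, while each later step costs exactly one, and each step needs the previous point before it can start, which matches the serial behaviour described above.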

Post

Topic

Board Development & Technical Discussion

Re: What algorithm is this ?

678AFDB0

on 02/02/2023, 13:26:45 UTC

Quote from: mausuv on Today at 05:14:00 AM

Quote from: 678AFDB0 on February 01, 2023, 09:06:54 AM

Quote from: mausuv on February 01, 2023, 06:22:04 AM

Quote from: 678AFDB0 on January 31, 2023, 08:22:05 PM

It synthesizes OK on an Altera Cyclone 4 and takes around 15k gates, but it is a bit slow. At a 50 MHz clock, it takes ~18 µs to add a point, and hashing takes 3 µs.

https://github.com/fpgaminer/fpgaminer-vanitygen
how to run .v # fpgaminer_vanitygen_top.v
how to compile on Linux
tell me

You need to install Quartus Prime, then follow the 13 step guide.

But keep in mind this is mostly useful for learning Verilog and playing with your FPGA board. For practical purposes it is best to run VanitySearch on your CPU; it will be 10,000x faster.

I tried to install Quartus Prime, lots of errors on my Linux

send me your fpgaminer-vanitygen compile file please :-[

# upload your file to this site https://www.transfernow.net/en and send the link please, or gdrive, or another link

Hey,

Sorry, but this is not my project. I just found it on GitHub and was wondering what the name of the algorithm is, that's it.

Post

Topic

Board Development & Technical Discussion

Re: What algorithm is this ?

678AFDB0

on 01/02/2023, 09:06:54 UTC

Quote from: mausuv on Today at 06:22:04 AM

Quote from: 678AFDB0 on January 31, 2023, 08:22:05 PM

It synthesizes OK on an Altera Cyclone 4 and takes around 15k gates, but it is a bit slow. At a 50 MHz clock, it takes ~18 µs to add a point, and hashing takes 3 µs.

https://github.com/fpgaminer/fpgaminer-vanitygen
how to run .v # fpgaminer_vanitygen_top.v
how to compile on Linux
tell me

Post

Topic

Board Development & Technical Discussion

Merits 1 from 1 user

Re: What algorithm is this ?

678AFDB0

on 31/01/2023, 20:22:05 UTC

⭐ Merited by krashfire (1)

It synthesizes OK on an Altera Cyclone 4 and takes around 15k gates, but it is a bit slow. At a 50 MHz clock, it takes ~18 µs to add a point, and hashing takes 3 µs.
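Those timings can be turned into cycle counts and a rough per-adder throughput. This sketch assumes a single adder that performs the point addition and the hash back to back with no overlap, which the figures above do not confirm; the constant and function names are only for illustration.

```python
# Convert the reported timings into clock cycles and keys per second.
CLOCK_HZ = 50_000_000   # 50 MHz
ADD_NS = 18_000         # ~18 µs per point addition
HASH_NS = 3_000         # ~3 µs per hash


def cycles(duration_ns: int, clock_hz: int = CLOCK_HZ) -> int:
    """Number of clock cycles that fit in `duration_ns`."""
    return duration_ns * clock_hz // 1_000_000_000


def keys_per_second(per_key_ns: int) -> float:
    """Throughput if every key takes `per_key_ns` nanoseconds."""
    return 1_000_000_000 / per_key_ns


if __name__ == "__main__":
    print(cycles(ADD_NS), "cycles per point addition")
    print(cycles(HASH_NS), "cycles per hash")
    print(round(keys_per_second(ADD_NS + HASH_NS)), "keys/s, add then hash")
```

Under that assumption one point addition is about 900 cycles and the unit tests somewhere under 50 thousand keys per second.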

Post

Topic

Board Development & Technical Discussion

Merits 2 from 2 users

Topic OP

What algorithm is this ?

678AFDB0

on 31/01/2023, 17:27:33 UTC

⭐ Merited by vapourminer (1) ,ETFbitcoin (1)

Hello,

I was doing research on various Vanity search algorithms and stumbled upon this repo:

https://github.com/fpgaminer/fpgaminer-vanitygen

I did a state diagram to understand it better:

https://postimg.cc/YG010qSK

as the math behind it is well above my pay grade.

So my question is - what algorithm is that ? Is this a part of OpenSSL standard elliptic curve functions ? Or something custom ?

Unlike most vanity search programs, which usually generate a random k, then produce x and hash it, this algorithm seems to be using only public keys (x and y are needed for the next iteration) and finds the next public key without dealing with private keys.

Thank you!