Version 3
Edited on 11/06/2025, 12:55:42 UTC
The most amusing thing: using mutex locks and creating bloom filters with the same inputs two times in a row.
Code:
alexander@alexander-home:~/Documents/Test_Dir/Point_Search_GMP$ diff bloom1B.bf bloom1.bf
Binary files bloom1B.bf and bloom1.bf differ

What did you expect? I looked at your update, and you are simply creating multiple mutexes, one for each thread that runs process_chunk. And locking the entire loop. Basically protecting nothing.

That's not what mutexes are for. You only need a single mutex, and you only need to lock the "bf.insert" call, not the entire loop (or else the loops will simply run exclusively, one at a time).

I'd personally move the mutex into the bloom filter code, and further lock only the code that actually accesses potentially shared data (for example, the hashing part probably doesn't need exclusive access).
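For illustration only, here is one way the "mutex inside the bloom filter" idea could look; the BloomFilter class, its hashing scheme, and the sizes below are made up for this sketch, not the actual Point_Search_GMP code:
Code:
// Illustrative only: a bloom filter that owns a single mutex and only
// serializes the write to the shared bit array; hashing runs unlocked.
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

class BloomFilter {
public:
    BloomFilter(std::size_t bits, int k) : bits_(bits), k_(k), data_((bits + 7) / 8, 0) {}

    void insert(const std::string& key) {
        // Hashing touches no shared state, so it happens outside the lock.
        std::vector<std::size_t> idx(k_);
        for (int i = 0; i < k_; i++)
            idx[i] = std::hash<std::string>{}(key + std::to_string(i)) % bits_;

        // Only the update of the shared bit array needs exclusive access.
        std::lock_guard<std::mutex> lock(mtx_);
        for (std::size_t b : idx)
            data_[b / 8] |= static_cast<unsigned char>(1u << (b % 8));
    }

private:
    std::size_t bits_;
    int k_;
    std::vector<unsigned char> data_;
    std::mutex mtx_;  // single mutex, shared by every thread using this filter
};
The point is simply that the expensive hashing stays outside the critical section, and only the shared bit array is serialized.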

But I'm glad that at least you got to a case where you can clearly see that the output is wrong when synchronization is missing. So which one of those 2 outputs is the right one? You'll never know, since both runs had a race condition, with the threads writing in parallel under different mutexes (so, identical to not having a mutex at all).

If you wanna go fancy you can implement a multi-mutex scheme, one mutex for each fixed-size memory area, and only lock the specific mutex for the area the bloom filter writes to. This may increase throughput, or it may not; the right balance needs to be found by trial and error. But this is not a programming thread, after all. Smiley
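A rough sketch of that multi-mutex (lock striping) idea, with an arbitrary stripe count and an arbitrary way of picking the stripe; none of these names come from the actual code:
Code:
// Illustrative only: lock striping - N mutexes, each guarding a slice of the
// bit array, so threads writing different slices don't block each other.
#include <array>
#include <cstddef>
#include <mutex>
#include <vector>

constexpr std::size_t kStripes = 64;                 // tune by trial and error
std::array<std::mutex, kStripes> stripe_mtx;
std::vector<unsigned char> bloom_bits(1u << 20, 0);  // shared bit array (example size)

void set_bit(std::size_t bit) {
    std::size_t stripe = (bit / 8) % kStripes;       // stripe that owns this byte
    std::lock_guard<std::mutex> lock(stripe_mtx[stripe]);
    bloom_bits[bit / 8] |= static_cast<unsigned char>(1u << (bit % 8));
}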

LE: another option is to compute the points in parallel and queue them in a producer-consumer fashion, consuming the queue in a single thread that only does the BF insertions. This simply moves the synchronization onto the queue itself, of course, if you don't want to mess with the bloom class.
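A minimal producer/consumer sketch under those assumptions; the key format, counts, and thread count are placeholders, and the bf.insert call is left as a comment since the real filter class isn't shown here:
Code:
// Illustrative only: workers compute keys in parallel and push them to a
// queue; one consumer thread does all the insertions, so the bloom filter
// itself needs no lock.
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

std::queue<std::string> q;
std::mutex q_mtx;
std::condition_variable q_cv;
bool done = false;

void producer(int id) {
    for (int i = 0; i < 1000; i++) {
        std::string key = "pubkey_" + std::to_string(id) + "_" + std::to_string(i);
        {
            std::lock_guard<std::mutex> lock(q_mtx);
            q.push(std::move(key));
        }
        q_cv.notify_one();
    }
}

void consumer() {
    for (;;) {
        std::unique_lock<std::mutex> lock(q_mtx);
        q_cv.wait(lock, [] { return !q.empty() || done; });
        if (q.empty() && done) break;
        std::string key = std::move(q.front());
        q.pop();
        lock.unlock();
        // bf.insert(key);  // the only thread that ever touches the filter
    }
}

int main() {
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; t++) workers.emplace_back(producer, t);
    std::thread sink(consumer);
    for (auto& w : workers) w.join();
    { std::lock_guard<std::mutex> lock(q_mtx); done = true; }
    q_cv.notify_all();
    sink.join();
}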

Code:
for (int i = 0; i < POINTS_BATCH_SIZE; i++) { // inserting all batch points into the bloomfilter
    BloomP.x = pointBatchX[i];
    BloomP.y = pointBatchY[i];
    std::lock_guard<std::mutex> lock(mtx); // lock taken per point, around the hashing + insert
    bf.insert(secp256k1->GetPublicKeyHex(BloomP));
}

Exactly. Locking the bf like this leads to nothing. The result is the same: sometimes there is no diff, sometimes there are multiple differences.
But the running instance yields the right result regardless; it has no impact after all. 69 bits. Tested.

The problem was only that the mutex for each bloom filter was not at global scope relative to them. Updated accordingly.
Code:
alexander@alexander-home:~/Documents/Test_Dir/Point_Search_GMP$ diff bloom1.bf bloom1B.bf
alexander@alexander-home:~/Documents/Test_Dir/Point_Search_GMP$ diff bloom2.bf bloom2B.bf
No output from diff this time: the regenerated filters match byte for byte.

Locking the whole batch, compared to locking just the bf.insert call, is faster time-wise.
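For reference, a minimal sketch of the arrangement described above, with the mutex at global scope and the lock taken once per batch; the names and batch size are placeholders rather than the actual Point_Search_GMP code:
Code:
// Illustrative only: the mutex lives at global scope, so every thread that
// runs process_chunk locks the same object, and the lock is taken once per
// batch instead of once per insert.
#include <mutex>
#include <string>
#include <vector>

constexpr int POINTS_BATCH_SIZE = 256;   // assumed batch size

std::vector<std::string> bloom1_keys;    // stand-in for the first bloom filter
std::mutex bloom1_mtx;                   // global, shared by all worker threads

void insert_batch(const std::vector<std::string>& batch_keys) {
    std::lock_guard<std::mutex> lock(bloom1_mtx);   // coarse lock around the whole batch
    for (int i = 0; i < POINTS_BATCH_SIZE && i < (int)batch_keys.size(); i++)
        bloom1_keys.push_back(batch_keys[i]);       // stand-in for bf.insert(GetPublicKeyHex(...))
}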

