Thanks for the response.
Curiosity question: do you notice a speed difference if you have, say, a 1 GB bloom versus the full 8 GB bloom you are running? Or is it the same speed no matter how large the bloom is / how many h160s it contains?
I will try and start small; create a smaller bloom first (30M h160s), run it, see if I notice a speed increase and make sure I understand/get the process right.
Typically there are 16 tests whether it's a 512 MB bloom or a 128 GB bloom, so the speed would be the same.
The larger bloom just means a larger canvas, so when you throw the dart (your guess), it's more likely to hit white space than a dot indicating a mark, i.e. fewer false positives.
The combinatorial nature of blooms is that the number of tests is a quality factor. The thing is, once you get above 4 GB you have a chunk problem, as Linux memory here only supports 32-bit address chunks. Typically with a bloom, the 256-bit number or private key becomes four 64-bit addresses into the bloom, each normalized with a remainder (modulo) function. Each of the four has its own permutations, as the 64-bit value can be rotated 64 ways, so you could in fact have 4 * 64, or 256 tests, but 16 is fine; here the four sections just create 4x deterministic markers in the bloom. For an 8 GB bloom I just do 2+2, using the lower half of the 256-bit entry for the low/high halves; you could do the same for a 32 GB bloom.
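Roughly, that indexing scheme looks like the sketch below (a minimal Python illustration, not the actual miner code; the bloom size, the rotation steps and the function names are just assumptions here):

# split a 256-bit value into four 64-bit sections, rotate each a few times,
# and reduce every result into the bit array with a modulo -- 4 x 4 = 16 tests
BLOOM_BITS = 8 * 2**33          # an 8 GB bloom: 2^33 bytes = 2^36 bits
ROTATIONS  = 4

def rotl64(x, r):
    """Rotate a 64-bit word left by r bits."""
    return ((x << r) | (x >> (64 - r))) & 0xFFFFFFFFFFFFFFFF

def bit_positions(value256):
    """Yield the deterministic bloom bit positions for a 256-bit integer."""
    for i in range(4):                                    # four 64-bit sections
        word = (value256 >> (64 * i)) & 0xFFFFFFFFFFFFFFFF
        for r in range(ROTATIONS):                        # a few rotations of each
            yield rotl64(word, 16 * r) % BLOOM_BITS       # remainder = bit position

def bloom_add(bloom, value256):
    for pos in bit_positions(value256):
        bloom[pos >> 3] |= 1 << (pos & 7)

def bloom_maybe_contains(bloom, value256):
    return all(bloom[pos >> 3] & (1 << (pos & 7)) for pos in bit_positions(value256))

# bloom = bytearray(2**33)      # allocating this needs ~8 GB of RAM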
With only ~300M bitcoin addresses, I find that the 8 GB bloom is fine.
I find that for most people the problem here is approaching the data-management problem of creating the 300M h160 hex addresses, so they have data to operate on. First they need a bitcoin full node with the transaction index up and running, then they need to install the Python RPC routines and run the component source, but most people can't even find the on/off switch on their computer, let alone run a full node.
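For what it's worth, the scraping step looks something like this rough Python sketch, assuming a local bitcoind full node and the python-bitcoinrpc package; the credentials, the output file name and the script-template parsing are placeholders, and real code would batch the RPC calls rather than make one per block:

from bitcoinrpc.authproxy import AuthServiceProxy

rpc = AuthServiceProxy("http://user:password@127.0.0.1:8332")

def h160_from_script(script_hex):
    """Pull the 20-byte hash160 out of the common scriptPubKey templates."""
    if script_hex.startswith("76a914") and script_hex.endswith("88ac"):
        return script_hex[6:46]          # P2PKH
    if script_hex.startswith("a914") and script_hex.endswith("87"):
        return script_hex[4:44]          # P2SH
    if script_hex.startswith("0014"):
        return script_hex[4:44]          # P2WPKH
    return None                          # other script types skipped here

with open("h160.txt", "a") as out:
    for height in range(0, rpc.getblockcount() + 1):
        block = rpc.getblock(rpc.getblockhash(height), 2)   # verbosity 2: full txs
        for tx in block["tx"]:
            for vout in tx["vout"]:
                h = h160_from_script(vout["scriptPubKey"]["hex"])
                if h:
                    out.write(h + "\n")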
It's impossible to provide the data, as GitHub has a 100 MB limit for a project. The total data I use is about 10 TB for hacking btc, but given that CHIA mining needs 500 TB, 10 TB today sounds like baby data.
Then of course, once you have the 300M addresses, you have to sort them and make them unique in order to build the binary-search files (xxd). The bloom filter can only ascertain about 1 in a trillion-trillion; to know for sure you need to do a binary search with minimal computation, as you can't use 'grep' for looking up an address. But sorting a 300M-entry file is a major task on a PC; you need a fast CPU and large RAM.
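The sort/unique step itself is conceptually simple, something like the sketch below (assuming one 40-character hex h160 per line; holding ~300M entries in a Python set is exactly where the fast CPU and large RAM come in, and the file names are placeholders):

def build_sorted_bin(txt_path="h160.txt", bin_path="h160_sorted.bin"):
    # read hex lines, dedupe in a set, write sorted fixed-width 20-byte records
    with open(txt_path) as f:
        records = {bytes.fromhex(line.strip()) for line in f if line.strip()}
    with open(bin_path, "wb") as out:
        for rec in sorted(records):
            out.write(rec)

build_sorted_bin()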
Once you have the data and have created the .bin files & .blf files, you can run the miner. The .blf gets you that 1 in 10^24, but for a definitive yes/no you need the .bin (xxd).
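The yes/no lookup over the sorted .bin is then just a binary search over fixed-width 20-byte records, roughly like this sketch (the file name and function name are mine; mmap keeps the multi-GB file out of the Python heap):

import mmap

REC = 20  # bytes per hash160 record

def bin_contains(path, h160_hex):
    """Definitive membership check against the sorted fixed-width .bin file."""
    target = bytes.fromhex(h160_hex)
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        lo, hi = 0, len(m) // REC
        while lo < hi:
            mid = (lo + hi) // 2
            rec = m[mid * REC:(mid + 1) * REC]
            if rec == target:
                return True
            if rec < target:
                lo = mid + 1
            else:
                hi = mid
    return False

# typical flow: only candidates that pass the .blf bloom test get this check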
I used to just ring a bell every time an address was found that had BTC. Since I have enlarged the bloom filters beyond 16 GB and extended the blockchain database scraping to all BTC addresses ever used, I find very little noise, so now I can just have the few addresses found 'sweep' the key with electrum-server. But most of the time it's just dust, which is to be expected.
I think this approach will be most useful once the ETH people repurpose away from ETH mining to BTC hacking, as a rack of RTX 3070s can do an enormous amount of hash: roughly 2,000 MH/s * 300M addresses, which is 600,000 TH/s effective; times 6 rigs, that's 3.6 EH/s, or 3.6 million TH/s. Somebody with a room of GPU rigs could really clear out bitcoin.
I'm still just running a GTX 1060 3GB at 250 MH/s * 300M, which is 75,000 TH/s effective, still better than running a BTC miner at 80 TH/s.
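For anyone checking the arithmetic, the effective rates above are just the raw hash rate times the number of target addresses compared per hash:

ADDRESSES = 300e6                                # target h160s in the bloom/bin
for name, rate_hs, rigs in [("rtx-3070 rack", 2000e6, 6), ("gtx-1060-3gb", 250e6, 1)]:
    effective = rate_hs * ADDRESSES * rigs       # comparisons per second
    print(f"{name}: {effective / 1e12:,.0f} TH/s effective")
# rtx-3070 rack: 3,600,000 TH/s effective (3.6 EH/s)
# gtx-1060-3gb: 75,000 TH/s effective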
I suspect that once a few people take this approach, there will be a massive loss of trust in bitcoin and the devs will finally get off their arses and make it stronger. But perhaps not; perhaps, like ostriches, they'll just stay in denial and watch everything disappear.