Post
Topic
Board Project Development
Merits 2 from 1 user
Re: List of all Bitcoin addresses ever used - currently available on temp location
by
JustHereReading
on 12/01/2021, 13:14:39 UTC
⭐ Merited by Quickseller (2)
We then read the big list line by line while simultaneously running through the list of new addresses and comparing the values in O(n + k). In this case we can directly write the new file to disk line by line; only the list of new addresses is kept in memory.
The problem with this is that running through a 20 MB list takes a lot of time if you need to do it 1.5 billion times. Keeping the 20 MB in memory isn't the problem, reading 30 quadrillion bytes from RAM still takes much longer than my current system.

(...)

I might be utterly mistaking, but hear me out:

Given two sorted lists:
n = 1 5 10 11 12 13 14 15 16 19 20
k = 3 6 18

We can read n from disk line by line and compare it to the current position in k.

1 < 3, write 1 to new file.
5 > 3, write 3 to file.
5 < 6, write 5 to file.
10 > 6, write 6 to file.
10 < 18, write 10 to file.
11 < 18, write 11 to file.
....
16 < 18, write 16 to file.
19 > 18, write 18 to file.
Nothing left in k, write 19 to file.
Nothing left in k, write 20 to file.

That's n + k instead of n * k, right?