Board Project Development
Re: List of all Bitcoin addresses ever used - currently available on temp location
by
LoyceV
on 01/02/2021, 17:44:59 UTC
Quoting myself:
Code:
Old code:
time cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2 | gzip > newchronological.txt.gz
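The ordering trick in the old code can be tried on a toy input. This is only a sketch of the idiom, assuming GNU coreutils; `dedupe_keep_first` is a made-up name for illustration and is not part of my scripts.

```shell
# dedupe_keep_first: hypothetical helper wrapping the core of the old pipeline.
# Reads lines on stdin, prints each distinct line once, in order of first appearance.
dedupe_keep_first() {
    nl |          # prefix every line with its line number and a tab
    sort -uk2 |   # sort by the address, keeping one line per address
    sort -nk1 |   # sort numerically on the line number to restore the order
    cut -f2       # drop the line number again
}

printf 'b\na\nb\nc\na\n' | dedupe_keep_first
```

This relies on `sort -u` keeping the first of a run of equal lines, which is how GNU sort documents it. Other sort implementations may behave differently.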

Code:
New code:
time cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) > new.alladdresses_chronological.txt
But it's wrong.
I tried again:
Code:
time cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2 > oldcode.txt
time cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) > newcode.txt
The files were too large to diff, so I split them into parts of 10 million lines each. There are a few differences:
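For anyone who wants to repeat the chunked comparison, here is a rough sketch, assuming GNU `split` (for `-d`). The function name `compare_in_chunks` and the temporary file layout are made up for this example and are not the exact commands I ran.

```shell
# compare_in_chunks: hypothetical helper, not the exact commands used above.
# Splits two line-oriented files into parts of N lines and diffs matching parts.
compare_in_chunks() {
    local a b lines tmp part suffix
    a=$1
    b=$2
    lines=${3:-10000000}
    tmp=$(mktemp -d) || return 1
    split -l "$lines" -d -a 4 "$a" "$tmp/a."
    split -l "$lines" -d -a 4 "$b" "$tmp/b."
    for part in "$tmp"/a.*; do
        suffix=${part##*.}
        # diff exits non-zero when the two parts differ (or one is missing)
        if ! diff "$part" "$tmp/b.$suffix" > "$tmp/out" 2>&1; then
            echo "== chunk $suffix =="
            cat "$tmp/out"
        fi
    done
    rm -rf "$tmp"
}

# Usage (file names from the commands above):
# compare_in_chunks oldcode.txt newcode.txt 10000000
```

Keep in mind that one extra or missing line shifts every later part by one line, so a difference reported on the first or last line of a part can be an artifact of the split rather than a real change. The sketch also only walks the parts of the first file, so extra parts of a longer second file go unreported.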
Code:
< 3BS6oQKHDwrzz4RC69iAbSV13xpbGZvXLj
---
> 3BS6oQKHDwrzz4SC69iAbSV13xpbGZvXLj

< 17Q7LN9nCmS6HdjkDj3C4MdhduFobGp4hv
---
> 17Q7LN9nCmS6HdjkDk3C4MdhduFobGp4hv

< 1rVH156qu1djPVFGoKaZ29Kw8zEpmh283
9863597a9863597
> 1rVH156qu1djPVFGoKaZ29Kw8zEpmh283

< 3Kw9pkLTLExTd9LZW2qbbNUdZRpUW3JTac
---
> 3Kw9pkLTLExTd9LZW2qbbNUdZSpUW3JTac

< 1Q7NSpgjxDHTTPUkGskTNDioCYw6MQazBG
---
> 1Q7NSpgjyDHTTPUkGskTNDioCYw6MQazBG

< 331xujHAg6AGvKzPwUKZ9AJxukaemCXeRw
8496039c8496038
< 3PLoa4ccMdxyGY6mAEStSu45xwqdRftd1b

> 3PLoa4ccMdxyGY6mAEStSu55xwqdRftd1b
10000000a10000000
> 3B92y4bFFPZvjviNhtLWeBKoYXmHVwr3CD

< 3B92y4bFFPZvjviNhtLWeBKoYXmHVwr3CD
7735278a7735278
> 331xujHAg6AGvKzPwUKZ9AJxukaemCXeRw

Let's highlight this one:
Quote
< 1Q7NSpgjxDHTTPUkGskTNDioCYw6MQazBG
---
> 1Q7NSpgjyDHTTPUkGskTNDioCYw6MQazBG
The first one (with x) is correct.

I have no idea what causes this. I'm now checking whether I can reproduce the exact same data change, or whether it's caused by a hardware failure.
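One cheap check on the hardware theory is to look at how the changed characters differ at the byte level. Below is a minimal sketch in plain POSIX shell. `xor_diff` is a hypothetical helper written for this post, and it assumes both strings are ASCII and equally long.

```shell
# xor_diff: hypothetical helper, written only for illustration.
# For each position where the two strings differ, prints:
#   index  char_in_first  char_in_second  XOR of the two ASCII codes
xor_diff() {
    a=$1
    b=$2
    i=0
    while [ -n "$a" ] && [ -n "$b" ]; do
        ca=${a%"${a#?}"}   # first remaining character of a
        cb=${b%"${b#?}"}   # first remaining character of b
        if [ "$ca" != "$cb" ]; then
            x=$(( $(printf '%d' "'$ca") ^ $(printf '%d' "'$cb") ))
            printf '%d %s %s 0x%02x\n' "$i" "$ca" "$cb" "$x"
        fi
        a=${a#?}
        b=${b#?}
        i=$((i + 1))
    done
}

xor_diff 1Q7NSpgjxDHTTPUkGskTNDioCYw6MQazBG 1Q7NSpgjyDHTTPUkGskTNDioCYw6MQazBG
# prints: 8 x y 0x01
```

In the pairs listed above the substitutions are R/S, j/k, x/y and 4/5, and each of those differs by exactly 1 in its ASCII code. That is what a single flipped bit would look like, although on its own it doesn't prove the hardware is at fault.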