Board Project Development
Re: List of all Bitcoin addresses ever used
by
LoyceV
on 20/08/2020, 09:08:20 UTC
Code:
cat unsorted.txt | sort -u -S 65% -T tmp > sorted.txt
I'm already using "sort", which uses /tmp by default.

I'll try "sort -u" though, it might need less temporary storage than "sort | uniq". The next update is scheduled for tomorrow, I'll see how it performs.
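
For reference, a small sketch of the two variants side by side, assuming GNU coreutils. The sample file and its contents are made up for illustration. With LC_ALL=C both produce the same output; the difference is that "sort -u" lets sort discard duplicates itself instead of piping every line to a second process.

```shell
#!/bin/sh
# Sample input, made up for illustration: three lines, one duplicate.
sample=$(mktemp)
printf 'addr_b\naddr_a\naddr_b\n' > "$sample"

# Variant 1: sort everything, then let uniq drop adjacent duplicates.
with_uniq=$(LC_ALL=C sort "$sample" | uniq)

# Variant 2: let sort drop the duplicates itself.
with_u=$(LC_ALL=C sort -u "$sample")

# Both variants should yield the same sorted, de-duplicated list.
[ "$with_uniq" = "$with_u" ] && echo "identical output"

rm -f "$sample"
```

Whether the second variant really needs less temporary storage on a large input is something only a real run will show.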

Quote
-S will tell your machine to use at most 65% CPU
I think you mean RAM, not CPU. This VM has only 256 MB, so I'll let "sort" figure it out on its own.

Quote
-T puts temporary files in a directory (here named "tmp") and not in RAM; if you have an SSD, the speed isn't too shabby
That's default behaviour :) It doesn't have an SSD though, and I'm using "cputool" to keep server load low. I'm okay without daily updates on this, I wouldn't want users to download this large file on a daily basis anyway.

Quote
I have sorted huge lists (>80 GB) on budget laptops using these two arguments. Worth a shot! If you want better hosting, PM me.
Since last year, I've been using an AWS server donated by suchmoon for loyce.club. However, since AWS charges $0.15/GB, I'm not comfortable hosting very large files on suchmoon's server.
When I tested sorting data on AWS, it started throttling disk IO after a while, which made it very slow. I've also tested a pay-by-the-hour VPS, and obviously it was a lot faster.

There's one thing on my wish list though: a method to show only unique addresses in order of appearance (without sorting them). It can be done with awk '!a[$0]++', but this requires a lot of memory and doesn't use temporary files.
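
One possible approach, as a sketch rather than a tested solution for a list this size: number the lines, sort by address so duplicates end up next to each other, keep the lowest-numbered line of each group, then sort by line number again to restore the original order. Both sorting steps use "sort", so they can spill to temporary files instead of holding everything in RAM. The function name unique_in_order is hypothetical, and the sketch assumes a POSIX shell with awk, sort and cut available.

```shell
#!/bin/sh
# Hypothetical helper: print the unique lines of file $1 in order of
# first appearance, without keeping all lines in memory at once.
unique_in_order() {
    tab=$(printf '\t')
    # 1. Prefix every line with its line number and a tab.
    awk '{ print NR "\t" $0 }' "$1" |
    # 2. Group identical lines together; within a group, the lowest
    #    line number (the first appearance) comes first.
    LC_ALL=C sort -t "$tab" -k2 -k1,1n |
    # 3. Keep only the first line of each group. awk only remembers
    #    the previous line's content, not the whole list.
    awk '{
        key = substr($0, index($0, "\t") + 1)
        if (NR == 1 || key != prev) print
        prev = key
    }' |
    # 4. Restore the original order by sorting on the line number.
    LC_ALL=C sort -t "$tab" -k1,1n |
    # 5. Strip the line number again.
    cut -f2-
}
```

Usage would be something like unique_in_order unsorted.txt > unique.txt. The price is two full sorts instead of one, plus temporary storage for the numbered copy of the data, so it trades disk IO for memory.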