Distributed Random Brute Force
I don't have the GPU power to make any progress with sequential brute force.
I also found, by experiment, that guessing random numbers can take much longer.
So, to maximize the fun, I am doing both.
I distribute the scan over the 20-3F keyspace, pick 3 random bytes, and brute force the rest.
My ranges look like this (where XXXXXX is 3 random bytes):
[20-3F][XXXXXX]00000000 - [20-3F][XXXXXX]FFFFFFFF
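For anyone who wants to play along, here is a rough sketch of how ranges in that shape could be generated. It is not the script I actually run; the function names, the use of Python's `secrets` module, and the default of 15 blocks per sub-range are just assumptions for illustration.

```python
import secrets

def random_ranges(blocks_per_prefix=15, first=0x20, last=0x3F):
    """Yield (start, end) pairs shaped like PP XXXXXX 00000000 - PP XXXXXX FFFFFFFF.

    Hypothetical helper: PP is the fixed leading byte (20-3F),
    XXXXXX is 3 random bytes, and the low 4 bytes are brute forced.
    """
    for prefix in range(first, last + 1):
        for _ in range(blocks_per_prefix):
            middle = secrets.randbits(24)            # the 3 random bytes
            start = (prefix << 56) | (middle << 32)  # low 4 bytes are all zero
            end = start | 0xFFFFFFFF                 # low 4 bytes are all FF
            yield start, end

def fmt(start, end):
    """Format a range as two 16-digit hex keys, the way it is written above."""
    return f"{start:016X}-{end:016X}"

if __name__ == "__main__":
    for start, end in random_ranges(blocks_per_prefix=1):
        print(fmt(start, end))
```

Each range printed this way could then be handed to whatever scanner you use as a start and end key.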
My old card can try fifteen random 3-byte values in each sub-range per scan, one scan every 13 hours, at 44 Mkey/s. Plus about 2 million really random randoms with the leftover starting points.
What does that get me?
15 random blocks of 4.3 billion keys in each of the 32 sub-ranges [20-3F] = about 2 trillion keys per scan. Two scans make 4T per "day". Pffft.
So, every 26-hour "day" (two 13-hour scans), I am scanning the following:
128 B keys in 20XXXXXX00000000-20XXXXXXFFFFFFFF
128 B keys in 21XXXXXX00000000-21XXXXXXFFFFFFFF
128 B keys in 22XXXXXX00000000-22XXXXXXFFFFFFFF
.
.
.
128 B keys in 3DXXXXXX00000000-3DXXXXXXFFFFFFFF
128 B keys in 3EXXXXXX00000000-3EXXXXXXFFFFFFFF
128 B keys in 3FXXXXXX00000000-3FXXXXXXFFFFFFFF
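The numbers above can be double-checked with a few lines of arithmetic. This is only a sanity check using the figures from this post (44 Mkey/s, 15 blocks per sub-range, 32 sub-ranges, two scans per "day"); the variable names are mine.

```python
RATE = 44_000_000                # keys per second (44 Mkey/s)
BLOCK = 2**32                    # keys per block: the 4 brute-forced bytes
PREFIXES = 0x3F - 0x20 + 1       # 32 sub-ranges, 20 through 3F
BLOCKS_PER_PREFIX = 15           # random 3-byte picks per sub-range per scan
SCANS_PER_DAY = 2                # one "day" here is two 13-hour scans

keys_per_scan = BLOCK * BLOCKS_PER_PREFIX * PREFIXES
hours_per_scan = keys_per_scan / RATE / 3600
keys_per_prefix_per_day = BLOCK * BLOCKS_PER_PREFIX * SCANS_PER_DAY

print(f"keys per scan:            {keys_per_scan:,}")
print(f"hours per scan:           {hours_per_scan:.1f}")
print(f"keys per sub-range daily: {keys_per_prefix_per_day:,}")
```

That works out to roughly 2.06 trillion keys and 13.0 hours per scan, and about 128.8 billion keys per sub-range per 26-hour day, which is where the "128 B" in the list comes from.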
I just need to get lucky with 3 bytes. How hard can that be?

Or get lucky with 2 bytes, but wait a week to find out. That shouldn't take much more than 65535 weeks.
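To put a number on "how hard", here is a back-of-the-envelope sketch. It assumes there is exactly one target key, equally likely to be anywhere in the 20-3F keyspace, and that random picks never repeat. Those are my simplifying assumptions, not measured results.

```python
from fractions import Fraction

RATE = 44_000_000                   # keys per second (44 Mkey/s)
PREFIXES = 32                       # sub-ranges 20 through 3F
KEYSPACE = PREFIXES * 2**56         # every 8-byte key starting with 20-3F

# 3 random bytes: 15 blocks x 32 sub-ranges x 2 scans per 26-hour "day"
keys_per_day = 2 * 15 * PREFIXES * 2**32
p_day = Fraction(keys_per_day, KEYSPACE)   # chance of a hit in one "day"
expected_days = float(1 / p_day)           # average wait, in 26-hour "days"

# 2 random bytes: one pick means brute forcing 5 bytes in every sub-range
two_byte_values = 2**16
two_byte_days = PREFIXES * 2**40 / RATE / 86400   # real 24-hour days per pick

print(f"chance per 26-hour day: {p_day} (about {float(p_day):.2e})")
print(f"expected 26-hour days:  {expected_days:,.0f}")
print(f"2-byte picks possible:  {two_byte_values:,}")
print(f"days per 2-byte pick:   {two_byte_days:.1f}")
```

Under those assumptions the daily chance is 15 in 2^23, so the average wait is over half a million 26-hour days. The 2-byte variant takes a bit more than nine days per pick at this speed, with 65,536 possible picks to get through.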