Also, the individual jumps aren't stored. All the kangaroos keep jumping, but only the points that land on a DP (leading 0s, with this program) are stored.
Those are the ones I was asking about.
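To illustrate what I mean by the DP test, here is my own minimal sketch (not the program's actual code; the function name and word size are made up for the example):

```
#include <cstdint>

// Sketch of the "leading 0s" DP criterion: a point only gets stored
// when the top dpBits bits of its X coordinate are zero.
// Illustrative only -- not taken from Hashmap.cpp.
bool isDistinguished(uint64_t xHigh64, int dpBits) {
    if (dpBits <= 0) return true;            // every point would count as a DP
    if (dpBits >= 64) return xHigh64 == 0;   // whole top word must be zero
    uint64_t mask = ~((1ULL << (64 - dpBits)) - 1); // mask of the top dpBits bits
    return (xHigh64 & mask) == 0;
}
```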
I think for 115, it took a total of 2^58.36 jumps and the workfile(s) contained 2^33.36 DPs (at DP 25); it took right around 13 days to solve. 130 at DP 32 would contain a little more than 115 at DP 25; but even if you doubled it, you are looking at 600+ GB; tripled, 1 TB; quadrupled, 1.4 TB; etc. It is still nowhere close to requiring 1 exabyte of storage.
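Rough arithmetic behind those numbers, as a back-of-the-envelope sketch (assuming roughly 32 bytes kept per DP and ignoring hash-table overhead, so it lands a bit above the ~300 GB actually observed):

```
#include <cmath>
#include <cstdio>

// Back-of-the-envelope check of the sizes above, assuming roughly
// 32 bytes stored per DP (16 for X, 16 for distance) and ignoring
// any hash-table overhead. Purely illustrative.
int main() {
    double dps   = std::pow(2.0, 33.36);   // DPs reported for #115 at DP 25
    double bytes = dps * 32.0;             // ~32 bytes per entry
    std::printf("#115 workfile: ~%.0f GB\n", bytes / 1e9);       // ~350 GB
    std::printf("doubled:       ~%.0f GB\n", 2.0 * bytes / 1e9); // ~700 GB
    std::printf("quadrupled:    ~%.1f TB\n", 4.0 * bytes / 1e12);// ~1.4 TB
    return 0;
}
```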
I took a look into Hashmap.cpp and its header.
It stores 16 bytes for X (out of the full 32) and 16 bytes for the distance (of which 3 bits are used for other things, as seen in the header, so 125 bits remain).
Under those conditions, yes, you can fit 2^33.36 points in 300 GB. But is there any fine print about the possibility of a hash collision on those missing 129 bits of the DP? You have a missing Y sign bit hint and 128 bits of lost information about the X coordinate. Are these factored into the probability? I see nothing about this in the README.
There is also the 125-bit distance issue. So taking "shortcuts" in what is stored affects some of the estimations, and in the #115 case it translates into exceeding the number of operations that were required (2^58.36). This is a trade-off between storage and time which I don't see documented.
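For reference, this is how I read the entry layout; the field names and exact bit split below are my interpretation, not the project's actual definitions, and the collision estimate is just one way to reason about it:

```
#include <cstdint>

// My reading of the stored entry: 128 bits of X (the upper 128 bits of
// the full coordinate are dropped) and a 128-bit field holding ~125 bits
// of distance plus ~3 flag bits (kangaroo type / parity hint).
// Field names are mine, not the header's.
struct DpEntry {
    uint64_t x[2];   // truncated X coordinate (128 of 256 bits kept)
    uint64_t d[2];   // ~125-bit distance + ~3 bits of flags
};

// One rough way to estimate the false-match risk from the truncated X:
// with N stored DPs, the expected number of spurious 128-bit matches is
// on the order of N*(N-1)/2 / 2^128 (a birthday-style bound). Whether and
// how the program accounts for such matches is exactly what I'm asking.
```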
Distances do not really increase in size within the same range/group. They are spread out randomly at program start, and then, for say 115, the average jump size is somewhere around 2^57-ish; so each jump is about 2^57 inside a 2^114 range, so for a distance to really grow, each kangaroo would have to make more than 2^57 jumps (which would take a lifetime).
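The 2^57 figure is just the usual square-root heuristic for kangaroo jump sizes; spelled out as illustrative arithmetic:

```
#include <cmath>
#include <cstdio>

// For an interval of width ~2^114 (#115), the mean jump size is chosen
// near sqrt(width) ~ 2^57, so a kangaroo's accumulated distance only
// outgrows the interval after roughly 2^114 / 2^57 = 2^57 jumps.
int main() {
    double rangeBits    = 114.0;                    // interval width ~ 2^114
    double meanJumpBits = rangeBits / 2.0;          // mean jump ~ 2^57
    double jumpsNeeded  = rangeBits - meanJumpBits; // 2^114 / 2^57 = 2^57 jumps
    std::printf("mean jump ~2^%.0f, ~2^%.0f jumps to cross the whole range\n",
                meanJumpBits, jumpsNeeded);
    return 0;
}
```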
Since the target range gets offset by the target start, we can just think about a distance from 0 to....? And here's the catch: it really depends on how many jumps, on average, a kangaroo needs to make in order to fill the DP table.
Yes, a single kangaroo would basically have to traverse a lot of the space to fill the DP table.
Using 2 kangaroos halves the distance; 256 kangaroos shaves 8 more bits off the distance; etc.
But yes, a larger range would have more hex characters stored for the distance (which takes up more storage space) compared to a smaller range, while the points (DPs) would all be the same size/length regardless of the range. So the 130 range compared to the 115 range would mean around 3-4 more hex characters per distance.
Yes, a distance of 0 should take up the same space as a distance of 2^max, so the individual entry size needs to increase to accommodate the larger range.
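Rough arithmetic behind the "3-4 more hex characters" remark (my own sketch, not measured from the workfiles):

```
#include <cstdio>

// The distance field has to hold values up to roughly the range width,
// so going from a ~2^114 range (#115) to a ~2^129 range (#130) adds
// about 15 bits, i.e. ~3-4 extra hex characters per stored distance,
// while the DP (X) part stays the same size.
int main() {
    int bits115 = 114, bits130 = 129;
    double extraHex = (bits130 - bits115) / 4.0;   // 15 bits -> ~3.75 hex chars
    std::printf("extra hex chars per distance: ~%.2f\n", extraHex);
    return 0;
}
```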
If you look into Hashmap.cpp (since you're a guru with this software), answer me these:
1. Why is the file loaded into memory? You almost state that one could just read/write the file directly for DP storage/lookup operations, so why is it read into RAM and operated on there?

2. Would you say the solution (collision) was found during a file merge of 2 different 150 GB files, or after a GPU loop? If the latter, that would mean the hashmap file was scanned and written to for each DP, which by all means sounds unfeasible, at least with the simple structure it uses.