why it is more efficient than bruteforcing.
Because you need to go through a maximum of 28453041475240576740 combinations to solve puzzle 68.
It's not more efficient on the CPU. But if you had to choose between 2,000 and 30 GPUs, which would you choose first?
I don't really understand what you mean by that.
First of all, aren't there 67 unknown bits in puzzle 68? Your number is comb(68, 34), when it should be something like comb(67, 34).
Next, let's assume you nailed it and the jackpot is flipping bits from one of the comb(67, 34) patterns.
This means you need to hash from a private key (I do not see any way to bypass this part, because you have some "randomly" 34 bits key that you xor/whatever with some static key).
Now, let's assume that creating a public key from a private key is 50 times slower than incrementing a public key by a known constant (like, maybe, G).
This "50 times slower", bear in mind, includes speedups like precomputed windows for fast multiplication of a scalar with the generator point.
Let's compare the expected gain in solving (flips versus full scan):
comb(67, 34) * 50 / 2**67
Result: 4.820132715824374 slower.
Did I miss anything obvious? I cannot see how would anyone be able to claim this as being faster, unless there's some way to go from private -> public much faster than doing an EC multiplication every time, for every flips combination.