Time-related words like "old", "new" and "long after" only make sense if you have an existing blockchain with which to tell time. So the circularity is still there.
You do have an existing blockchain: the Bitcoin one, up to now. And you can tell time in number of blocks. The genesis block was the first checkpoint. We could have the hashing power vote to checkpoint block 10,000, including the Patricia tree hash of the UTXO set up to that point. Then anything from before the checkpoint can be ignored, since the checkpoint can be considered part of the PoW consensus mechanism: if you trust the PoW generally to make ledger updates, then (conceivably) you can trust it to checkpoint.
because any transaction data that you "compress out" is transaction data that can't be validated by new nodes.
Exactly the point. Nodes already trust that blocks are valid because they have PoW on them. The checkpoint will have PoW too, and hence be trusted in the same way, relieving the new client from having to validate anything before the checkpoint. It's like a new genesis block, plus a UTXO set.
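To make the "new genesis block, plus a UTXO set" idea concrete, here is a rough sketch of what a bootstrapping client might do. Everything in it is an assumption for illustration: the `utxo_root` field, the function names, and especially the commitment itself, which is a plain SHA-256 over the sorted entries standing in for a real Patricia tree root.

```python
import hashlib
from typing import Dict


def utxo_commitment(utxos: Dict[str, int]) -> str:
    """Hypothetical stand-in for a Patricia tree root.

    Hashes the (outpoint, value) pairs in sorted order so that the same
    UTXO set always produces the same commitment.
    """
    h = hashlib.sha256()
    for outpoint in sorted(utxos):
        h.update(f"{outpoint}:{utxos[outpoint]};".encode())
    return h.hexdigest()


def bootstrap_from_checkpoint(checkpoint: Dict[str, str],
                              snapshot: Dict[str, int]) -> Dict[str, int]:
    """Accept a downloaded UTXO snapshot only if it matches the checkpoint.

    `checkpoint["utxo_root"]` is assumed to be the commitment that the
    hashing power buried under PoW. Nothing before the checkpoint is
    replayed; the client starts validating from the snapshot onward.
    """
    if utxo_commitment(snapshot) != checkpoint["utxo_root"]:
        raise ValueError("snapshot does not match checkpointed UTXO root")
    return dict(snapshot)
```

The client still has to check the PoW on the checkpoint block and everything after it; the sketch only covers the step where the history before the checkpoint is replaced by a committed snapshot.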
From what I understand, headers-first doesn't affect the new full node sync time at all. Please correct me if I'm wrong.
It does, for two reasons:
- By downloading the headers first, you can quickly (and with little bandwidth) eliminate stales, orphans and bad chains.
- Once you have the headers, you can download full blocks out of order from multiple peers (currently blocks are downloaded sequentially from a single peer, which can be extremely slow if you happen to get a bad one).
You're right that the time taken to validate the correct chain is unaffected.
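A toy sketch of those two steps, in case it helps. This is not how any real client implements it: the `Header` layout, the `TOY_TARGET` difficulty, and the round-robin `assign_downloads` plan are all made up for illustration. The point is only that the header chain can be checked cheaply before any full block is fetched, and that once it is known, block requests can be spread over several peers.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List

GENESIS = "00" * 32      # hypothetical genesis hash for the toy chain
TOY_TARGET = 1 << 248    # toy difficulty: roughly 1 in 256 hashes qualifies


@dataclass(frozen=True)
class Header:
    prev_hash: str
    payload: str   # stands in for the merkle root and other header fields
    nonce: int

    def hash(self) -> str:
        data = f"{self.prev_hash}|{self.payload}|{self.nonce}".encode()
        return hashlib.sha256(data).hexdigest()


def mine(prev_hash: str, payload: str) -> Header:
    """Brute-force a nonce so the toy header meets TOY_TARGET."""
    nonce = 0
    while True:
        header = Header(prev_hash, payload, nonce)
        if int(header.hash(), 16) < TOY_TARGET:
            return header
        nonce += 1


def header_chain_is_valid(headers: List[Header], genesis_hash: str) -> bool:
    """Step 1: check linkage and PoW using headers only, no block bodies."""
    prev = genesis_hash
    for header in headers:
        if header.prev_hash != prev or int(header.hash(), 16) >= TOY_TARGET:
            return False
        prev = header.hash()
    return True


def assign_downloads(headers: List[Header], peers: List[str]) -> Dict[str, List[str]]:
    """Step 2: spread block requests over peers (simple round-robin)."""
    plan: Dict[str, List[str]] = {peer: [] for peer in peers}
    for i, header in enumerate(headers):
        plan[peers[i % len(peers)]].append(header.hash())
    return plan
```

Full validation of each block still happens in chain order afterwards, which is why the validation time itself does not change.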
Good point. Parallel downloading is awesome. But the CPU still has to crunch ALL those EC verifies. ::sigh::