Okay, I think I get it.
It's also interesting that you could do the same kind of linking on any metric; for example the amount of coin age destroyed by a given block, if that were your measure of "best chain" instead of the number of hashes.
I don't agree with this part. The coin-age destroyed among a set of blocks mined by an adversary does not necessarily follow any particular distribution. However, you are right that there are many possible choices for determining "value" other than distance from 0. You could, for example, interpret the extra "value" (beyond meeting the difficulty target) as the number of *least-significant* zero bits instead: conditioned on a mining attempt beating the difficulty target, the distribution of the value in this case is still exactly what it should be.
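To make that concrete, here's a tiny sketch (my own illustration, not part of the scheme) of the two value metrics, treating a hash as an integer. The function names are placeholders:

```python
def leading_zero_value(h: int, bits: int = 256) -> int:
    """Extra value measured as leading zero bits (distance from 0)."""
    if h == 0:
        return bits
    return bits - h.bit_length()

def trailing_zero_value(h: int, bits: int = 256) -> int:
    """Extra value measured as least-significant zero bits.
    A difficulty target constrains only the high bits of a winning
    hash, so the low bits stay uniform, and this value keeps the
    geometric distribution it should have."""
    if h == 0:
        return bits
    return (h & -h).bit_length() - 1
```

The point is just that either count is geometrically distributed over mining attempts, but only the trailing-zero count stays correctly distributed once you condition on the hash being below the target.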
Anyway, since that's a really minor point, here's some more useful technical content. One surprising thing the attacker can do is create a really *large* proof. He does this by throwing away every block *except* the value=1 blocks, the ones that don't exceed the difficulty target at all. This only requires throwing away 50% of the work, but it results in a proof where the "up" links are always null, and you have to traverse every damn block to validate it.
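A quick back-of-the-envelope check of that 50% figure (a simulation of my own, not taken from the writeup): if "value" is 1 plus the number of least-significant zero bits, a value=1 block is just a winning hash whose low bit is 1, which is half of all winning hashes:

```python
import random

# Simulate winning hashes (the bits below the target are uniform,
# so 32 random bits stand in fine for this purpose).
random.seed(0)
N = 100_000
hashes = [random.getrandbits(32) for _ in range(N)]

# value=1 means no extra zero bits, i.e. the hash is odd.
kept = sum(1 for h in hashes if h & 1)
print(f"value=1 fraction: {kept / N:.2%}")  # close to 50%
```

So the attacker sacrifices only about half his hashrate to force a proof with no usable up links.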
This isn't a problem, for the following reason. You can make progress on validating the SPV proofs for multiple chains concurrently. So if one peer tries to feed you a really slow proof, you should request different proofs from different nodes and process them in parallel.
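Here's a rough sketch of what I mean by processing proofs in parallel, with the per-link check stubbed out (the real link verification is obviously more involved). Validation of each proof is broken into steps and the steps are interleaved round-robin, so one maliciously long proof can't stall the verifier:

```python
def validate_steps(proof):
    """Yield after each link check; the generator's return value
    is the overall verdict. The check here is a stand-in."""
    for link in proof:
        if not isinstance(link, int):  # placeholder per-link check
            return False
        yield
    return True

def race_proofs(proofs):
    """Interleave validation of several candidate proofs.
    Returns the index of the first proof to finish valid,
    or None if every proof fails."""
    active = {i: validate_steps(p) for i, p in enumerate(proofs)}
    while active:
        for i, gen in list(active.items()):
            try:
                next(gen)  # advance this proof by one step
            except StopIteration as done:
                del active[i]
                if done.value:
                    return i
    return None
```

With this strategy, a short honest proof from one peer finishes (and wins) long before an attacker's million-block proof is fully traversed.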
The availability of this strategy is what motivates the performance goal I mentioned as a caveat at the end of the writeup. The "honest prover" should, with high probability, produce a proof that is reasonably compact. It's okay if a dishonest prover produces a long proof. Regardless of length, the prover should be unable (except with low probability) to produce a convincing proof that the amount of work burying some transaction exceeds the actual amount of work done (the attacker's work plus any honest work) by more than some fraction.
*** edit ***
Actually, now that I've written this response out, I realize there's another attack that's modeled in my security definition but isn't adequately addressed by my scheme. The problem is that an attacker can, by rejecting 50% of his work, cause an honest proof to contain fewer useful back links. It's important to set the verification parameters loosely enough that even if the attacker does this, he can't cause honest chains to fail verification. But if the verification parameters are widened like that, it must also not become much easier to forge a proof. All of this relates to analysis I haven't finished in my writeup, so these are things I need to address. I don't think this should affect the choice of data structure, though (only the guidelines for choosing parameters).