There is no possibility for non-convergence. The most-work tree eventually wins out. Always.
There could be an extreme scenario where two fractions of the network work on different trees, and each fraction uses anti-DoS rule that rejects blocks from the tree of the other fraction unless they arrive in a batch that proves that the PoW of the competing tree wins, but because of propagation lag each fraction solves more PoW blocks in its own tree so by the time the batch arrives from the other tree, that batch is already losing and therefore rejected. No?
Iddo, a collapse has to occur. This is exactly the same reasoning as in Theorem 8.6 in the paper: block construction is stochastic, so the two forks are not synchronized perfectly even if they have exactly the same amount of hash power supporting them. The differences in block creation times drift apart until they grow larger than the time it takes to send the message about the total weight in the branch. Thus, a collapse eventually occurs.
Right, apologies, I agree that this will eventually re-converge. But there still could be an interesting question here regarding whether it takes a long time for the two fractions to re-converge, depending on the anti-DoS rules that they deploy.
With anti-Dos it's highly unlikely that you get into these sort of forks in the first place: you do accept all blocks created, say, in the past month or so (difficulty is high enough that they are not really a DoS attack). A fork that breaks DoS rules will only form if 1) an attacker manages to keep up with the network's rate for a whole month (has to be a 50% attack) 2) the network is disconnected for over a month. Even in these cases, the forks resolve rather quickly.
Besides, another heuristic can also help here: accept alternative branches into the tree even if they have 80% of the total weight of your current branch. This is still enough to stop a DoS, and certainly enough to make DoS forks collapse instantly.
Yes, those can be sensible anti-DoS rules. But if the node accepts all blocks in the past month, we may also need a rule to delete the old blocks in the losing subtrees and thereby reclaim disk space. As Andrew Miller said in #bitcoin-wizards, in terms of incentives there's a tradeoff here since keeping some orphans around will potentially save on future bandwidth, but at the cost of storage now.