If the attacker started building his double-spending subtangle a long time ago, then the initial tx's of this subtangle reference some rather old tx's, with not-so-big cumulative weight. While the attacker waits, the cumulative weight of the legit tangle continues to grow, so he won't be able to catch up.
Of course, this assumes that the attacker's maximum possible tx rate is much less than the "usual" tx rate of the rest of the network.
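
To make the rate argument concrete, here is a toy sketch of the race in Python. It is a simplified linear model, not the actual tangle algorithm: each tx is assumed to have own weight 1, each branch's cumulative weight grows by its tx rate per time step, and `head_start` is a made-up parameter for the weight the legit tangle gained while the attacker was waiting. All names here are hypothetical.

```python
def weight_race(legit_rate, attacker_rate, head_start, steps):
    """Return a list of (legit_weight, attacker_weight) per time step.

    Toy assumption: every new tx adds exactly 1 to the cumulative
    weight of the branch it is attached to.
    """
    legit, attacker = head_start, 0
    history = []
    for _ in range(steps):
        legit += legit_rate        # txs issued by the rest of the network
        attacker += attacker_rate  # txs issued by the attacker alone
        history.append((legit, attacker))
    return history


def attacker_catches_up(legit_rate, attacker_rate, head_start, steps):
    """True if the attacker's branch ever matches or exceeds the legit one."""
    return any(a >= l for l, a in
               weight_race(legit_rate, attacker_rate, head_start, steps))
```

In this model, an attacker who is slower than the network (say `attacker_catches_up(10, 2, 50, 1000)`) never catches up, since the gap only widens. An attacker who is faster (say `attacker_catches_up(2, 10, 50, 1000)`) eventually does, regardless of the head start.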
The first (legit) transaction references the same old transactions as the double-spend. The attacker doesn't need to compete with the rest of the network.