The 1MB limit will almost certainly be raised, but keeping some sanity cap is a good idea. Ideally it would be a floating cap that is deterministic and based on actual blockchain usage, but that may take some time to develop and test. In the interim, raising the limit to, say, 10MB gives the network breathing room while limiting the scope of an attack.
Would it be reasonable to recalculate the floating cap each time the difficulty is retargeted (every 2016 blocks)? You could set the max block size to, say, 4 x avg_block_size over the last 2016-block period.
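For concreteness, the retarget-time recalculation might look something like this sketch (the 4x multiplier and the block_sizes_kb input are assumptions for illustration, not anything in the reference client):

# A minimal sketch of the retarget-time floating cap described above.
RETARGET_INTERVAL = 2016   # blocks between difficulty retargets
CAP_MULTIPLIER = 4         # hypothetical headroom factor

def floating_max_block_size(block_sizes_kb):
    """Recompute the cap from the sizes (in kB) of the last 2016 blocks."""
    assert len(block_sizes_kb) == RETARGET_INTERVAL
    avg_block_size = sum(block_sizes_kb) / len(block_sizes_kb)
    return CAP_MULTIPLIER * avg_block_size

Since the cap only moves at retarget boundaries it stays deterministic for every node, and because the next period's average can at most equal the current cap, the cap can grow by at most a factor of four per 2016-block period rather than jumping without bound.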
Also, perhaps it would be possible to further reduce orphan costs (to the extent that miners are cooperative) by establishing informal "best practices" for filling each block. The orphan risk to a particular miner shrinks when block-size variance is minimized. Watching the blocks roll in, it seems miners are already working to minimize block-size variance to some extent, but perhaps they could take it one step further. It could be informally agreed, for example, to use a proportional feedback controller to determine the block_size for the current block you're working on:
block_size = avg_block_size
             + K x (unconfirmed_transactions_kB - target_unconfirmed_transactions_kB)
where
K is a gain parameter that would be loosely agreed upon. Miners who didn't follow these guidelines wouldn't be punished, but it would be clear to their hashpower providers what was going on.
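As a sketch, that controller might look like the following (K, the target backlog, and the kB inputs are loosely chosen for illustration, not agreed-upon values):

# A minimal sketch of the proportional controller described above.
K = 0.1                      # loosely agreed-upon proportional gain
TARGET_UNCONFIRMED_KB = 500  # hypothetical target backlog of unconfirmed tx (kB)

def next_block_size(avg_block_size_kb, unconfirmed_kb):
    """Size (kB) to target for the block currently being worked on."""
    error = unconfirmed_kb - TARGET_UNCONFIRMED_KB
    # The proportional term nudges the block above or below the recent average;
    # clamp at zero so a near-empty mempool never yields a negative size.
    return max(0.0, avg_block_size_kb + K * error)

With K = 0.1, a backlog 100 kB above target makes the next block 10 kB larger than average, and a backlog below target shrinks it, so persistent deviations get worked off gradually instead of in one oversized block.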