I really like Meni's elastic block cap proposal.
Obviously increasing the block size requires a hard fork, but the fee pool part could be accomplished purely with a soft fork.
Absolutely. It's worth clarifying that the actual mechanics of the pool fee (particularly how it is calculated) do not have to be on the critical path to resolving the 1MB problem.
All that is necessary today for Meni's proposal is to agree that a pool fee, of some make-up, can usefully exist. Then it is possible to look at a simple implementation plan which overcomes the urgency of dealing with the 1MB limit, does not raise the limit too much, and allows time for an elastic cap with rollover penalties to be fully worked out, modeled, developed, and tested. As mentioned before, the pool fee could incorporate a function of the block's UTXO delta and sigops.
Phase I: Hard Fork to increase the max block size to 2T, e.g. via block version 4.
2T might be in the region of 6 or 8MB, and would also scale either by a fixed percentage each year, say 20%, or as a fixed multiple (e.g. 4x) of the average size of the most recent 144 or 2016 blocks. However, the difference this time is that no miner will be able to mine a block larger than T without paying a pool fee, and paying one won't be possible yet, because it requires a supermajority of version 5 blocks to vary the pool fee from zero. In practice, blocks stay at or below T until Phase II.
Phase II: Soft fork to implement the full elastic cap, effective by supermajority, e.g. via block version 5, after which blocks between T and 2T can be mined.
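To make the two scaling options for 2T concrete, here is a rough sketch. The function names, the integer rounding, and the 1MB = 1,000,000 bytes convention are my own assumptions for illustration. Only the 20%, 4x, 144 and 2016 figures come from the post above.

```python
MB = 1_000_000  # assumption: 1MB counted as 1,000,000 bytes


def limit_fixed_growth(initial_limit: int, years: int, percent_per_year: int = 20) -> int:
    """Hard limit (2T) after compounding a fixed yearly percentage.

    Integer math is used so every node would arrive at the same value.
    """
    limit = initial_limit
    for _ in range(years):
        limit = limit * (100 + percent_per_year) // 100
    return limit


def limit_from_recent_average(block_sizes: list[int], window: int = 2016, multiple: int = 4) -> int:
    """Hard limit (2T) as a fixed multiple of the average size of the last `window` blocks."""
    recent = block_sizes[-window:]
    if not recent:
        raise ValueError("need at least one block size")
    return multiple * sum(recent) // len(recent)


if __name__ == "__main__":
    # 8MB growing at 20% per year for two years
    print(limit_fixed_growth(8 * MB, years=2))
    # 4x the average of 144 blocks of 400,000 bytes each
    print(limit_from_recent_average([400_000] * 144, window=144))
```

The first option is predictable but has to be guessed in advance. The second tracks real usage but lets the limit move with whatever miners actually produce.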
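As a rough sketch of how the size rule differs between the two phases: the names below (`CapRules`, `block_allowed`, `fee_required`) are hypothetical, and the required pool fee is just passed in as a number, since how it is calculated is exactly the part left open for later.

```python
from dataclasses import dataclass

MB = 1_000_000  # assumption: 1MB counted as 1,000,000 bytes


@dataclass
class CapRules:
    t: int                 # soft threshold T, in bytes
    elastic_active: bool   # True once the Phase II soft fork has activated

    @property
    def hard_limit(self) -> int:
        # The hard cap is 2T in both phases.
        return 2 * self.t


def block_allowed(size: int, rules: CapRules, fee_paid: int = 0, fee_required: int = 0) -> bool:
    """Size check only; fee_required stands in for the yet-to-be-designed pool fee."""
    if size > rules.hard_limit:
        return False               # never valid, in either phase
    if size <= rules.t:
        return True                # no pool fee needed at or below T
    if not rules.elastic_active:
        return False               # Phase I: T..2T exists on paper but cannot be used
    return fee_paid >= fee_required  # Phase II: allowed if the pool fee is covered


if __name__ == "__main__":
    phase1 = CapRules(t=4 * MB, elastic_active=False)
    phase2 = CapRules(t=4 * MB, elastic_active=True)
    print(block_allowed(5 * MB, phase1))                                  # False
    print(block_allowed(5 * MB, phase2, fee_paid=10, fee_required=10))    # True
```

Note that Phase II only tightens or leaves unchanged what a Phase I node accepts at the hard limit, which is why it can be done as a soft fork.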
Advantages:
Decoupling the hard limit itself from the design of a graceful decay in network performance as the limit is approached.
No urgency on how best to set the pool fee; lots of time for debate and modelling.
A yearly scaling percentage can be more approximate because it should be easier to schedule hard-fork revisions to this as and when changes in global computing technology dictate.
Since the blocksize issue is controversial and may take some time to settle, we are better off implementing this elastic cap right now with a softfork (your phase II) and skipping the hardfork part (phase I).
We could do that by choosing T=0.5MB (2T = 1MB = current maxblocksize).