I know this sounds as bad as the suggestion to slow down blocks, but an alternative would be to make larger blocks more costly for miners: an increasing haircut on transaction fees, tied to block size.
e.g. 1MB block: 0% haircut, 10MB block: 10% haircut, 1000MB block: 95% haircut. Some polynomial could determine the formula.
This would create a marginal cost of including more transactions in blocks. The problem with this might be that the extra marginal cost doesn't differ enough for different miners on the mining cost curve.
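To make the haircut concrete, here is a rough sketch of one possible curve. It assumes (my choice, not part of the proposal) a quadratic in log10 of the block size, fitted through the three example points above and clamped to the range 0..1. The names `haircut` and `miner_fee_income` are made up for illustration.

```python
import math

# Example points from the post: (block size in MB, haircut fraction).
ANCHORS = [(1.0, 0.00), (10.0, 0.10), (1000.0, 0.95)]


def haircut(block_size_mb: float) -> float:
    """Fraction of fees the miner forfeits for a block of this size.

    Assumption: Lagrange interpolation through ANCHORS in log10(size),
    which yields a quadratic polynomial; the result is clamped to [0, 1].
    """
    if block_size_mb <= ANCHORS[0][0]:
        return 0.0  # no haircut at or below 1MB
    x = math.log10(block_size_mb)
    xs = [math.log10(size) for size, _ in ANCHORS]
    ys = [cut for _, cut in ANCHORS]
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    # Clamp: the fitted curve dips slightly below 0 just above 1MB
    # and exceeds 1 for blocks well beyond 1000MB.
    return min(max(total, 0.0), 1.0)


def miner_fee_income(total_fees_btc: float, block_size_mb: float) -> float:
    """Fees the miner actually keeps after the haircut."""
    return total_fees_btc * (1.0 - haircut(block_size_mb))
```

With this curve a miner collecting 2 BTC of fees in a 10MB block would keep 1.8 BTC. Any other monotone curve through the same points would serve equally well; the point is only that each extra transaction shrinks the share of fees kept.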
to me this doesn't sound bad at all.
what i'd love to see is that the additional "lost" fees are somehow distributed to the nodes relaying the transaction (i don't have an idea how this could be accomplished without cheating though)
i know lost bitcoins are not a real problem, but i still have a bad "feeling" about it.
EDIT:
what about this:
- i craft a new transaction with a 1 BTC mining fee
- some node receives this transaction and signs it.
- it now relays the following data
# tx/txouts
# his own ip/port
# one of his own addresses
all signed by his key (not the key of the original txins)
- a miner receives multiple versions of this transaction signed by different "first" relay nodes
- he now contacts one of these nodes and asks for the original one
- some magic needs to be done to make sure the miner is forced to send some fee to the first relay if the block size exceeds 1MB.
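a rough sketch of what the relayed data and the miner's side could look like. everything here is hypothetical: the names `RelayAnnouncement`, `group_by_txid` and `pick_relay` are made up, the JSON serialization is just a placeholder, and the actual signature is left out (only the digest the relay would sign with its own key is computed). the part that forces the miner to pay the relay is not covered.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class RelayAnnouncement:
    """What a "first" relay node broadcasts (hypothetical layout)."""
    txid: str            # identifies the tx/txouts being announced
    relay_host: str      # the relay's own ip
    relay_port: int      # the relay's own port
    payout_address: str  # one of the relay's own addresses

    def signing_digest(self) -> bytes:
        """Double SHA-256 of the announcement; the relay would sign this
        with its own key, not with the key of the original txins."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(hashlib.sha256(payload).digest()).digest()


def group_by_txid(announcements):
    """Miner side: collect the competing announcements per transaction."""
    grouped = {}
    for ann in announcements:
        grouped.setdefault(ann.txid, []).append(ann)
    return grouped


def pick_relay(candidates):
    """Assumed policy: contact the relay whose announcement arrived first
    and ask it for the original transaction."""
    return candidates[0]
```

because the ip/port and payout address are part of the signed data, two relays announcing the same transaction produce different digests, so one relay cannot simply copy another's announcement and swap in its own address.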