Yes, there is a propagation delay for larger blocks.
There's a delay regardless of whether or not two different blocks are solved at the same time?
when two blocks are produced by different miners at roughly the same time, larger blocks are more likely to be orphaned.
You mean that when two different blocks are solved at the same time, the smaller block will propagate faster and therefore more miners will start building on it versus the larger block?
...increases the risk of an orphan by 20%
Is there a straightforward way to estimate the risk of an orphan?
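One rough way to get a feel for it, offered as a sketch rather than a definitive answer: treat block discovery as a Poisson process with Bitcoin's 600-second target interval, and ask how likely it is that a competing block is found while yours is still propagating. The function name and the sample delays below are made up for illustration, and the result is an upper bound, since a block does not automatically lose every race it ends up in.

```python
import math

BLOCK_INTERVAL_S = 600.0  # Bitcoin's target mean time between blocks


def orphan_risk(propagation_delay_s, interval_s=BLOCK_INTERVAL_S):
    """Chance that a competing block is found during the propagation delay.

    Assumes Poisson block arrivals; this overstates the true orphan rate
    because the slower block still wins some of the resulting races.
    """
    return 1.0 - math.exp(-propagation_delay_s / interval_s)


if __name__ == "__main__":
    # Hypothetical delays in seconds, chosen only to show the trend.
    for delay in (2, 10, 30):
        print(f"{delay:>3} s delay: {orphan_risk(delay):.2%} chance of a competing block")
```

For delays that are short compared to 600 seconds, the risk is close to delay / 600, so doubling the propagation time roughly doubles the exposure.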
As the subsidy becomes a smaller % of total miner compensation, the effect of the distortion will be less. There has been some brainstorming on methods to remove the "large block penalty". It would likely require a separate mining overlay.
Even with a separate overlay, two blocks solved at the same time is a problem. And I would imagine that adding a new overlay is an extreme solution to be considered as a last resort only.
...any attempt to find an "optimal" block size is likely doomed because it can be gamed and "optimal" is hard to quantify.
What are your thoughts on the last scheme I described?
...
2016 - 2MB block =~ 720K daily transactions (262M annually)
...
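As a sanity check on the quoted figure, the arithmetic can be reproduced under two assumptions that are not stated in the quote: a 2MB block means 2,000,000 bytes, and the average transaction is about 400 bytes. The function name is made up for illustration.

```python
BLOCKS_PER_DAY = 24 * 60 // 10  # 144 blocks per day at the 10-minute target


def daily_transactions(block_size_bytes, avg_tx_bytes):
    """Transactions per day if every block is full of average-sized transactions."""
    return BLOCKS_PER_DAY * (block_size_bytes // avg_tx_bytes)


if __name__ == "__main__":
    # Assumed inputs: 2,000,000-byte blocks and 400-byte transactions.
    daily = daily_transactions(2_000_000, 400)
    print(f"daily: {daily:,}  annual: {daily * 365:,}")
```

With those inputs the numbers line up with the 720K daily and roughly 262M annual transactions in the quote; a different average transaction size would shift both proportionally.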
Hmm... this seems problematic. If the transaction volume doesn't grow sufficiently, this could kill fees. But if the transaction volume grows too much, fees will become exorbitant.
IF we accept that max block size needs to change, I believe it should be done in a way that decreases scarcity in response to a rise in average transaction fees.
There would be some variety, surely. In the blocks they produce themselves, miners will seek to optimize the ratio (time to propagate / revenue in fees), while for blocks they receive from other miners, they would rather they be the smallest possible.
Sure, a miner might "rather" that received blocks be as small as possible, but since there's no way to refuse to receive a block from a peer, this point is moot. They could drop a block that is too big once they get it, but this doesn't help them very much other than not having to forward it to the remaining peers. And even this has little global effect, since those other peers will just receive it from someone else.
These parameters are not the same for different miners, particularly the "time to propagate" one, as it strongly depends on how many connections you can keep established and on your bandwidth/network lag.
Bandwidth will be the limiting factor in determining the number of connections that may be maintained. For the purpose of analysis, we should assume that miners choose their degree (number of peers) such that bandwidth is not fully saturated, because doing otherwise would prevent them from collecting the largest number of transactions possible for the amount of bandwidth available, limiting revenue.
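To make the size-versus-propagation trade-off concrete, here is a toy model, not a description of how any miner actually decides. It assumes propagation time is simply block size divided by an effective bandwidth, that a block is lost whenever a competing block appears during that time (Poisson arrivals, 600-second interval), and that fees scale linearly with size. All names and numbers are hypothetical.

```python
import math

BLOCK_INTERVAL_S = 600.0  # Bitcoin's target mean time between blocks


def expected_revenue(size_bytes, subsidy, fee_per_byte, bandwidth_bytes_per_s):
    """Reward times the chance that no competing block appears while propagating."""
    delay_s = size_bytes / bandwidth_bytes_per_s      # crude propagation time
    survive = math.exp(-delay_s / BLOCK_INTERVAL_S)   # Poisson assumption
    return (subsidy + fee_per_byte * size_bytes) * survive


def best_size(subsidy, fee_per_byte, bandwidth_bytes_per_s,
              max_size=2_000_000, step=10_000):
    """Grid search for the block size with the highest expected revenue."""
    sizes = range(0, max_size + 1, step)
    return max(sizes, key=lambda s: expected_revenue(
        s, subsidy, fee_per_byte, bandwidth_bytes_per_s))


if __name__ == "__main__":
    # Hypothetical miners: same 25 BTC subsidy and fee rate, different bandwidth.
    for bandwidth in (100_000, 1_000_000):
        print(bandwidth, best_size(25.0, 1e-7, bandwidth))
```

In this model the preferred size works out to roughly bandwidth × 600 − subsidy / fee_per_byte, clamped to the allowed range, so the same fee market can push a poorly connected miner toward small blocks and a well connected one toward the maximum. The pull toward small blocks also weakens as the subsidy shrinks relative to fees.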
Do people in mining pools even need to run a full node?