Yes, you missed it. I have already explained the math for why there is no relationship between an average orphan rate and a cost per byte (which is a necessary prerequisite for a supply curve, as explained by @Peter R's paper). In other words, if we model every miner as having the same orphan rate, then no block size is more or less favorable to any miner, because difficulty adjusts to the same level for all the miners. Put differently, the hashrate wasted on orphans is paid for by income from the block, because all miners have the same costs (@Peter R's model inherently presumes that hashrate and propagation delay are uniformly distributed).
It is only when we model a relativistic orphan rate (which @Peter R's paper doesn't even consider) that we will see any relative profit levels and be able to model orphan rate as a cost. @Peter R should have realized this, but he apparently didn't quite realize that his model was meaningless, although he did mention some of these issues that perplexed him in his concluding remarks.
I think that I now understand the issue. Let me put it into my own words and see if we are on the same page...
In equation (3), the miner's expected revenue is discounted by the factor (1 - p_orphan), the probability that his block is not orphaned. That is, rewards are only counted for the blocks that make it into the blockchain. So if a miner, for example, produces 10 blocks per day with an orphan rate of 0.2, he will be rewarded for a total of 8 blocks.
However, this perspective turns out to be incorrect, because the block production rate ultimately depends on the block difficulty, which is automatically adjusted according to the orphan rate and hash rate of all the miners.
What actually matters for a miner's block production is thus not how many of his (produced) blocks make it into the chain, but how many of the blocks in the chain are produced by him (his production share). This is not the same thing! The mining rewards (fees + block rewards) must be multiplied by the miner's production share, since the total block production (per time unit) remains constant due to automatic difficulty adjustment.
Let's make a simple example with only 3 miners A, B and C who all have the same hash power 1/3 and the same orphan rate 0.01. The miners don't engage in any form of selfish mining strategies. It's easy to see that under these conditions every miner will build 1/3 of the blocks in the chain, as everybody has the exact same chances.
Now, let's assume that A starts building bigger blocks so that his orphan rate increases to 0.2, while B and C retain their orphan rates of 0.01. To determine the fraction of the blocks (in the chain) built by the respective miners, we can calculate:
A: (0.8*1/3) / (0.8*1/3 + 0.99*1/3 + 0.99*1/3) = 0.288
B and C: (0.99*1/3) / (0.8*1/3 + 0.99*1/3 + 0.99*1/3) = 0.356
And we see that B and C each now build a larger share of the chain (0.356) than their relative hash rate (1/3) would suggest.
According to Peter R's equation (3), their success rate would be 0.99*1/3 = 0.33, which is incorrect.
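For anyone who wants to check the numbers, here is a minimal Python sketch of the calculation above. The function name chain_shares and the dictionary layout are my own choices for illustration and come from neither post nor paper; it assumes, as in the example, that a miner's surviving blocks are proportional to hash power times (1 - orphan rate) and that difficulty adjustment keeps total chain growth constant.

```python
def chain_shares(hash_power, orphan_rate):
    """Fraction of the blocks in the chain built by each miner.

    Assumption (taken from the example above): surviving blocks are
    proportional to hash_power * (1 - orphan_rate), and the shares are
    normalised over all miners because difficulty adjustment keeps the
    total number of chain blocks per time unit constant.
    """
    surviving = {m: hash_power[m] * (1.0 - orphan_rate[m]) for m in hash_power}
    total = sum(surviving.values())
    return {m: s / total for m, s in surviving.items()}


# The three-miner example: equal hash power, A builds bigger blocks.
hash_power = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
orphan_rate = {"A": 0.2, "B": 0.01, "C": 0.01}

shares = chain_shares(hash_power, orphan_rate)

# The unnormalised value hash_power * (1 - orphan_rate), for comparison.
unnormalised = {m: hash_power[m] * (1.0 - orphan_rate[m]) for m in hash_power}

for m in sorted(shares):
    print(f"{m}: chain share {shares[m]:.3f}, unnormalised {unnormalised[m]:.3f}")
```

The normalisation step is the whole point: the chain shares always sum to 1, whereas the unnormalised values do not, so they cannot be read as a split of a fixed stream of rewards.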
Mining is a relativistic game!