I heard from one of the Nicehash miners. He abandoned p2pool because the DOA rates he was getting from Nicehash were significantly eating into his revenue, and he can get more by solo mining. I can think of three ways of addressing this issue.
The first one is to submit DOA shares to the network and have them compete with non-DOA shares, and possibly change the preference algorithm on the receiver's side to give DOA shares a significant and fair chance of winning the subsequent orphan race. This approach would be likely to increase the possibility and severity of selfish mining attacks, so I'm a bit reluctant to pursue it. It would be pretty simple in terms of code changes, though. Some more detailed thoughts on this below.
The second one is to add an uncle mechanism or DAG structure to the share chain. This would likely be much fairer, but it would require a p2pool hard fork and a substantial amount of new code. This would be the best option if developer time weren't an issue.
The third one is to try to increase the average share interval by increasing the default or minimum share difficulty. Changes to the minimum share difficulty require a hard fork, but a simple one. Changes to the default share difficulty just require that some nodes (especially the large ones) run different code. If the average share took 4x as long, then both orphan and DOA rates would be 1/4th their current values. Drawbacks of this approach are (1) that small miners would find it more difficult to get shares into the share chain (unless share difficulty were a function of the number of shares in the chain), and (2) that it's potentially not incentive-compatible, as the benefit of making high-difficulty shares goes to everyone who mines afterwards, and not to the person who mines the high-diff share. A person with a lot of hashrate has a greater chance of winning an orphan race if he works on low-diff shares than if he works on high-diff shares, since p2pool essentially just uses (1) the share height and (2) the time of arrival when determining which share to base its work off of, and ignores difficulty.
But maybe that isn't the best way to choose the share to base work off of? Ultimately, the problem we're having with fairness comes from the fact that some shares have greater expected revenue in the share chain than other shares, due to differences in orphan risk caused by differences in node performance and propagation times. If we no longer take time-of-arrival into account (or weight it less heavily), then the incentive to run a large, high-performance p2pool node dissipates. That might sound like a bad thing, since it could result in a higher block orphan rate for p2pool, but it's not really that bad. The issue comes from the fact that Bitcoin block intervals are 600 seconds, but shares are 30 seconds, which means that we are punishing low-performance nodes by 20x more than is fair.
Another problem with the existing algorithm for choosing which share to base work off of is that all shares are treated the same, regardless of their difficulty, which makes low-diff shares have an advantage for selfish mining attacks. Large-diff shares contribute more work to p2pool, so they should have a greater chance of winning an orphan race, ceteris paribus. But you can't just pick the highest-diff share of all shares in an orphan race, since that would cause a race-to-the-top scenario where the optimal difficulty to use for your shares is the highest diff that a competitor uses plus one. That would also allow a selfish miner to reliably win orphan races against siblings: instead of working on a child share of the best known share, you could just orphan it by creating a sibling with +1 difficulty. So it needs to be probabilistic.
The algorithm I'm thinking of for choosing the winner of an orphan race is this:
1. Take all of the shares at the best chain height (using the current 5th-gen-parent work done heuristic)
2. Find the time that the first of those candidate shares arrived, and call that start_time.
3. For each share, calculate the difference between that share's arrival time and start_time, and call that share_delay.
4. Calculate share_nonorphan_chance = 1 - share_delay/600 (for Bitcoin) for each share. This gives a correction factor for how much less likely than the best share this share would have been to not be orphaned as a bitcoin block. That is, a share that took 6 seconds longer than the fastest share would get a 1% higher orphan risk, and would get a share_nonorphan_chance of 0.99.
5. Calculate share_corr_share_diff = share_diff * share_nonorphan_chance.
6. Pick a share randomly, giving a chance of share_corr_share_diff/sum(share_corr_share_diffs) for each share.
7. Mine using that share as the parent.
If my math/logic is correct, this algorithm would give pretty much everyone a fair chance of winning a share orphan race, regardless of their node performance or share difficulty.