If uncles was applied to bitcoin then special care would have to be taken to make sure a transaction in an uncle doesn't conflict with a transaction in the main chain.
Actually, you'd simply ignore all transactions in uncles except the coinbase transaction, which gets programmatically modified to reduce the amounts.
If uncles are added then I think the share interval could be reduced even more to (say) 15 seconds.
I'd really rather not, for CPU and RAM performance reasons in addition to orphan rate reasons. It takes a medium-slow CPU on Python2.7 up to 4 seconds to process a share and issue new work with 1 MB blocks. With 4 MB blocks, you would generally be unable to run p2pool on medium-slow CPUs with a 15 second share interval, as it would take 16 seconds to process each share. Share variance is currently insignificant compared to block variance on p2pool. Increasing the share interval to 60 seconds on average is more likely to be a good idea than decreasing it.
While there are a few major changes to p2pool's architecture that would drastically reduce p2pool's performance load, those changes are quite large and may never be done. It would certainly involve more code than just adding uncles.
There'd be no more orphans at all. This would reduce the difficulty (by a constant factor but still) which would mean if p2pool gets really big then small hashers can still solve shares fairly often.
There would be uncles instead. Uncles have a smaller reward for the person who mined them than normal shares, so there's still a potential fairness penalty to having high uncle rates. The game theory of uncles is complicated, and if you don't set the rules right, it can be in a miner's best interest to not mention other people's shares as uncles and orphan them instead. Giving the uncled shares a hit in revenue is necessary to making sure there's an incentive to include them.
Another way of solving this apart from being probabilistic...
A probably better solution is to just follow the share with the lowest hash, regardless of its difficulty. This inherently prefers high-diff shares, but is fair in that low-diff shares have a chance to win against a high-diff share if the low-diff share is particularly lucky in its hash. Basically, it is effectively equivalent to the algorithm I proposed in the post that you linked to, but where the source of randomness is the hashes themselves.