regarding custom compiled bitcoind for "optimal" p2pool operation
he can run his pool the way he wants to. right now, that's the best way to run a p2pool pool.
you can either a) fix p2pool so that having a bunch of transactions doesn't cause you to get double, triple, or quadruple as many orphans, b) implement a stop-gap like i proposed, or c) stfu
This is an interesting point: should p2pool be changed so that we take away a miner's freedom to choose what transactions to accept in order to help the bitcoind network?
Summary: by custom tweaking bitcoin code, you can create 0 transaction blocks with tiny p2pool latency. That is great for the individual miner that does it (higher efficiency) but horrible for the bitcoind network (and for other p2pool miners who are "honest"). Galvin Anderson did some calculations to show that the 0 tx strategy is not profitable compared to accepting transactions for solo (or regular pooled) bitcoind mining. I wonder if it's true for p2pool...
Whether it was written in C or python (or intercal), this would be a problem and is a fundamental question.
Simply fix bitcoind instead of focusing on p2pool. Giving a block template for p2pool to use should be nearly instantaneous. The problem seems to be that bitcoind rebuilds the whole block template from scratch on each request instead of preparing it in the background. Fix that and nobody will have to tune bitcoind anymore.