There is one big difference. If I get the list of pending transactions from a pool node's memory pool, I have no guarantee that the node is being truthful. It may show a tx paying me as being in the mempool, while behind the scenes it is hashing away on a different (double) spend of the same utxo. However, a sharechain provides an unforgeable proof that one or more of the participants in that chain was actually actively mining my transaction.
The sharechain doesn't contain the entire merkle tree, so you cannot verify which txs a share in the sharechain is working on. Each share has to be as small as possible (to improve efficiency). It contains nothing more than the Bitcoin block header, the coinbase tx, and the merkle branch which allows cryptographic verification of the coinbase tx (linking the coinbase tx to the merkle root). Actually this is a simplification. It contains subsets of those elements, because some portions can be independently computed by peers in the p2pool network and thus don't need to be relayed (e.g. the outputs of the coinbase tx). p2pool shares don't contain anything more than what is needed to prevent cheating by p2pool peers. For a share to prove its tx set, it would have to contain all the tx hashes of the Bitcoin block, which would make shares very heavy.
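To make the point about the merkle branch concrete, here is a minimal sketch in Python of how a peer could link a coinbase tx to a merkle root using only the branch. The function names are made up for illustration and this is not p2pool's actual code; it assumes the standard Bitcoin double-SHA256 merkle construction, with the coinbase always at index 0 so every sibling sits on the right.

```python
import hashlib
from typing import List, Tuple


def dsha256(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def build_coinbase_branch(txids: List[bytes]) -> Tuple[List[bytes], bytes]:
    """Miner side (hypothetical helper): return (branch, merkle_root).

    The branch holds one sibling hash per tree level for the leaf at
    index 0, which is where the coinbase tx always sits.
    """
    branch = []
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:            # odd count: duplicate the last hash
            level.append(level[-1])
        branch.append(level[1])       # sibling of the leftmost node
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return branch, level[0]


def root_from_branch(coinbase_txid: bytes, branch: List[bytes]) -> bytes:
    """Peer side: recompute the merkle root from the coinbase txid and branch."""
    h = coinbase_txid
    for sibling in branch:
        h = dsha256(h + sibling)      # coinbase path is always the left child
    return h


# Toy data: five fake txids, the first one standing in for the coinbase.
txids = [dsha256(bytes([i])) for i in range(5)]
branch, root = build_coinbase_branch(txids)
recomputed = root_from_branch(txids[0], branch)
```

Note what the peer actually sees: about log2(n) opaque hashes rather than n txids. Every sibling above the lowest level is a hash of hashes, so the branch proves the coinbase is in the block without revealing which other txs are being worked on.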
It may be possible to allow more efficient verification of the "working tx set", but it would require a radical change to the Bitcoin block structure as well as the block rules. If the merkle trees were generally similar or deterministic, it would be possible to provide a merkle tree as well as an instruction set on how to select the txs. Like I said, this is more useful for an altcoin, as it would be a radical departure from Bitcoin.
I'm pretty familiar with the workings of p2pool, although my knowledge may be a little dated. I hope I have clarified the points in my proposal that created confusion.
There is some merit beyond double-spend proofs for pools using p2pool as a backbone; however, there is also a tragedy of the commons. Say p2pool represented 30% of the network hashrate. This would mean any entity, regardless of its actual contribution, has the reduced variance of someone with 30% of the hashrate. Yes, all participants benefit, but not to the same degree. The larger the participant, the less it benefits, and "worse", its advantage over other participants is reduced. Established pools are able to charge a fee because they are trusted and they reduce variance. By joining p2pool they undermine those two factors, which are their sole reason for existence.
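A rough back-of-the-envelope sketch of the variance argument, in Python. It assumes block finding is a Poisson process, so the relative spread (standard deviation divided by mean) of income over a period is 1/sqrt(expected blocks), and it ignores payout-scheme details and fees. The 30% figure is the one from above; the 0.1% miner, the 10% pool, and the one-week window of 1008 blocks are hypothetical numbers picked for illustration.

```python
import math

BLOCKS_PER_WEEK = 144 * 7  # assumes the 10 minute block target


def relative_std(hashrate_share: float, blocks: int = BLOCKS_PER_WEEK) -> float:
    """Std dev / mean of income for an entity finding blocks at this share."""
    expected_blocks = hashrate_share * blocks
    return 1.0 / math.sqrt(expected_blocks)


def variance_gain(own_share: float, backbone_share: float) -> float:
    """Factor by which relative spread shrinks after joining the backbone."""
    return relative_std(own_share) / relative_std(backbone_share)


P2POOL = 0.30        # from the example above
SMALL_MINER = 0.001  # hypothetical 0.1% miner
BIG_POOL = 0.10      # hypothetical 10% pool

small_gain = variance_gain(SMALL_MINER, P2POOL)  # sqrt(0.30 / 0.001)
big_gain = variance_gain(BIG_POOL, P2POOL)       # sqrt(0.30 / 0.10)
print(f"small miner spread shrinks by {small_gain:.1f}x")
print(f"big pool spread shrinks by {big_gain:.1f}x")
```

Under these assumptions the gain is sqrt(30% / own share): roughly 17x for the small miner but under 2x for the 10% pool. Afterwards both have the same relative spread, which is exactly the variance advantage the big pool used to sell.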
To answer your question, there is no technical reason why pools couldn't use p2pool (either the existing network or a private variant) as a backbone. There never has been. However, there is also no "selfish" reason for them to do so. All the reasons relate to the "common good", which is a hard sell.