Yep, same issue. On my volumes it lives for no more than 5 minutes, then this happens. My software detects this within 2 seconds, cleans it up, and kills/restarts pushpool. ulimit -n is 32k. Not helping much...
The best I got with it is 5% stales, which is not acceptable. Back to the older ways it is then, for now at least. I'll see if I can find time to reimplement it in Erlang.
You're getting the stales because you're killing pushpoold. Pp keeps a log of the work it has issued, and when a miner submits work to a newly restarted pushpoold, it won't accept the share because that work is not in the new pp log.
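To illustrate the effect, here is a rough toy sketch in Python. It is not pushpoold's actual code: the `WorkLog` class, its methods, and the work id are all made up for the example, and it assumes the log lives only in the memory of the running process, so a restart begins with an empty log.

```python
class WorkLog:
    """Toy record of issued work (hypothetical, not pushpoold's implementation)."""

    def __init__(self):
        # Assumption: the log is held only in process memory.
        self.issued = set()

    def issue(self, work_id):
        # Remember that this work unit was handed out to a miner.
        self.issued.add(work_id)
        return work_id

    def submit(self, work_id):
        # Accept a share only if this process remembers issuing the work.
        return work_id in self.issued


pool = WorkLog()
work = pool.issue("work-1")
accepted_before = pool.submit(work)   # share matches the log

pool = WorkLog()                      # simulate kill/restart: the log starts empty
accepted_after = pool.submit(work)    # same share, but the new process never issued it

print(accepted_before, accepted_after)
```

Any work handed out just before the kill comes back as a rejected share afterwards, so the more often the daemon is restarted, the more stales you would expect to see.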