During an LP (long poll), the system is trying to hand out a ton of work all at once, so it can occasionally bottleneck there.
I'll instrument the code I'm running to check. One thing that might help is just increasing the number of front ends you're running, even on the same systems.
I haven't hacked on pushpool for a while, but when I last did, it had a lot of blocking IO, so a few slow sockets could really bog down its response time. I actually have no clue what you're running, but if it has a similar design you might get better throughput just by running more processes. (You could load balance with some OS-level round-robining of the sockets, or just by giving users more ports to connect to.)
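To make the "more ports" idea concrete, here's a rough sketch of how you might spread users across several front-end processes. This is not from pushpool or from whatever you're running. The port numbers and the function names `port_for_user` and `round_robin` are made up for illustration, and I'm assuming one front-end process listening per port.

```python
import itertools
import zlib

# Hypothetical ports, one per front-end process (not from any real pool config).
FRONTEND_PORTS = [8332, 8333, 8334, 8335]


def port_for_user(username, ports=FRONTEND_PORTS):
    """Map a username to one port, so the same user always hits the same front end."""
    # crc32 is stable across runs, unlike the built-in hash() on strings.
    return ports[zlib.crc32(username.encode("utf-8")) % len(ports)]


def round_robin(ports=FRONTEND_PORTS):
    """Return an endless iterator that hands out ports in rotation."""
    return itertools.cycle(ports)
```

The hashed version keeps each user pinned to one front end without storing anything, while the rotation spreads new signups evenly. Either way, a few slow sockets only stall the one process they're connected to, not everybody.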