In NoodleDoodle's performance commit he noted a benchmark of 2.5 ms/tx on an i7-2600. That's 400 tx/sec on a 2011 desktop. A reasonably priced current-gen server (say dual 10-core Xeon CPUs) is probably several times faster, so perhaps approaching 5K/sec, but I don't know the exact numbers. There is still more optimization available (for example, we aren't using Bernstein's most optimized elliptic curve asm library, just his sort-of-optimized C library).
With the move to RingCT, the numbers will probably be different (though some of the differences will offset each other, such as having fewer outputs/tx), and we will have to reevaluate.
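
For anyone who wants to redo the back-of-envelope numbers above, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and the 5K/sec target is the rough guess from the post, not a measurement.

```python
def tx_per_second(ms_per_tx: float) -> float:
    """Convert a per-transaction verification time (ms) into throughput (tx/sec)."""
    return 1000.0 / ms_per_tx


def implied_speedup(target_tps: float, ms_per_tx: float) -> float:
    """How many times faster than the benchmarked machine a server must be
    to reach target_tps. This is pure arithmetic; it makes no claim about
    whether real hardware achieves that speedup."""
    return target_tps / tx_per_second(ms_per_tx)


if __name__ == "__main__":
    base = tx_per_second(2.5)             # benchmark figure quoted in the post
    factor = implied_speedup(5000, 2.5)   # 5000 tx/sec is the post's rough guess
    print(f"baseline: {base:.0f} tx/sec")
    print(f"speedup needed for 5K/sec: {factor:.1f}x")
```

Plugging in a different ms/tx figure (for example, whatever RingCT ends up costing) gives the updated baseline without redoing the math by hand.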
Sweet!

When I tell people about this good news during my Monero evangelizing, how do I explain why our sig_ops are so much faster than Old Grandpa Bitcoin's?
As I understand it (not a cryptographer), there is some inherent performance benefit to Curve25519-based cryptography (or maybe to implementing it on real hardware), but I don't know the magnitude, nor how close either implementation is to being optimized enough for that to matter.
I think raw signature verification performance is currently one of the least important scalability concerns in practice, and both implementations are reasonably optimized once libsecp256k1 is integrated into Bitcoin Core (already done, but I think not released yet).