I'll take that as confirmation Monero can verify 5K tps on something like a reasonably priced server. What about 120k? ~25 Xeon cluster?
NoodleDoodle's performance commit noted a benchmark of 2.5 ms/tx on an i7-2600. That's 400 tx/sec on a 2011 desktop. A reasonably priced current-gen server (say, dual 10-core Xeons) is probably several times faster, which gets close to 5K/sec, but I don't know the exact numbers. There is still more optimization available (for example, we aren't using the most optimized elliptic curve asm library available from Bernstein, just his sort-of-optimized C library).
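For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch. The function and variable names are made up for illustration, and it assumes verification scales linearly across cores and machines, which real hardware won't quite manage, so treat the results as rough guides rather than measurements.

```python
import math

def tx_per_second(ms_per_tx: float) -> float:
    """Convert a per-transaction verification time (ms) into tx/sec."""
    return 1000.0 / ms_per_tx

# Benchmark figure quoted above: 2.5 ms/tx on an i7-2600.
baseline_tps = tx_per_second(2.5)

# Speedup over that desktop needed to reach 5K tx/sec on one server.
required_speedup = 5_000 / baseline_tps

# Servers needed for 120k tx/sec if each really sustains 5K tx/sec
# (assumption: perfect linear scaling, no networking or I/O overhead).
servers_for_120k = math.ceil(120_000 / 5_000)

print(f"baseline: {baseline_tps:.0f} tx/sec")
print(f"speedup needed for 5K tx/sec: {required_speedup:.1f}x")
print(f"servers for 120k tx/sec: {servers_for_120k}")
```

So a single server has to be about 12.5x the 2011 desktop to hit 5K/sec, and at that rate 120k works out to 24 machines, in line with the ~25 guess.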
With the move to RingCT the numbers will probably be different (though some of the differences will offset each other, such as having fewer outputs/tx), and we will have to reevaluate.