I agree with you that they should be able to do it with a reasonably powerful computer -- one that could be purchased for under $20K. This is why I think their system is just inefficient. As I said in the above post, you can easily write an inefficient algorithm that processes orders really slowly and gets much slower as the order book grows.
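To make that concrete, here is a rough Python sketch of the kind of thing I mean. It is purely illustrative: the class names `NaiveBook` and `HeapBook` are made up, it only handles the sell side, and I have no idea what their actual code looks like. The naive version scans every resting order to find the best price, so each match costs more as the book grows. The heap version does the same job in logarithmic time per order.

```python
import heapq
import itertools


class NaiveBook:
    """Hypothetical slow book: unsorted list, full scan to find the best ask."""

    def __init__(self):
        self._asks = []
        self._seq = itertools.count()  # arrival order breaks price ties

    def add_ask(self, price, qty):
        self._asks.append((price, next(self._seq), qty))

    def take_best_ask(self):
        best = min(self._asks)    # O(n): looks at every resting order
        self._asks.remove(best)   # O(n) again: search, then shift the list
        price, _, qty = best
        return price, qty


class HeapBook:
    """Same interface, but a binary heap keeps the best ask at the front."""

    def __init__(self):
        self._asks = []
        self._seq = itertools.count()

    def add_ask(self, price, qty):
        # O(log n) insert; ties on price fall back to arrival order
        heapq.heappush(self._asks, (price, next(self._seq), qty))

    def take_best_ask(self):
        price, _, qty = heapq.heappop(self._asks)  # O(log n) removal
        return price, qty
```

Both classes hand back asks in the same price-then-time order, so they are interchangeable from the caller's side. The only difference is how the cost per order scales with the number of resting orders. Neither handles an empty book, cancellations, or partial fills; a real engine would need all of that.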
I don't know for sure that this was their problem, though; I'm just speculating. If you are interested in how to implement a high-performance one, look here. These guys are talking about creating an open source, high-performance engine that should be able to process thousands to millions of orders per second:
http://www.reddit.com/r/Bitcoin/comments/1c7v6z/buttercoin_open_source_highperformance_bitcoin/

Yeah, inefficient code is the obvious conclusion.