When the getblocktemplate latency started to appear, my efficiency was still between 110-115%. My getblocktemplate latency was about 30 seconds at that time.
This is the worst latency I've seen reported so far (by nearly 3x; the worst I can remember was ~12s). With versions before 0.8.2rc3, getblocktemplate latency depends heavily on your CPU speed and the number of transactions in your memory pool. As you seem to have an adequate CPU, that probably doesn't explain the difference. Do you use any non-default values in your bitcoin.conf?
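If you want to put numbers on it, here is a rough sketch for timing getblocktemplate over JSON-RPC. It assumes bitcoind's RPC server is reachable over plain HTTP with basic auth; the URL, user and password are placeholders for whatever is in your bitcoin.conf, and the helper names are my own, not part of bitcoind. Newer releases may require a template request argument (for example a rules list), so that is left as an optional parameter.

```python
import base64
import json
import time
import urllib.request


def build_rpc_request(url, user, password, method, params=None):
    """Build a JSON-RPC POST request with HTTP basic auth (hypothetical helper)."""
    payload = json.dumps({
        "jsonrpc": "1.0",
        "id": "latency-check",
        "method": method,
        "params": params or [],
    }).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Basic " + token,
    }
    # Passing data= makes urllib send this as a POST.
    return urllib.request.Request(url, data=payload, headers=headers)


def time_call(fn, repeats=5):
    """Call fn() `repeats` times and return each wall-clock duration in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def getblocktemplate_latency(url, user, password, repeats=5, timeout=120,
                             template_request=None):
    """Time repeated getblocktemplate calls against a running node.

    template_request is an assumption: leave it as None for old versions,
    or pass a dict if your version insists on one.
    """
    params = [template_request] if template_request is not None else []

    def call():
        req = build_rpc_request(url, user, password, "getblocktemplate", params)
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            resp.read()  # include the time to receive the full template

    return time_call(call, repeats)
```

Calling getblocktemplate_latency("http://127.0.0.1:8332/", "rpcuser", "rpcpassword") returns a list of per-call times in seconds; comparing those with the size of your memory pool at the same moment should show whether the two move together.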
You have relatively high bandwidth, but what is your link latency? If you use traceroute/mtr or similar tools, what are the per-hop times to your ISP's main routers and to addresses in North America/Europe/China (where most nodes, and probably most miners, are according to http://blockchain.info/fr/nodes-globe)?
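If traceroute/mtr are not handy, a TCP connect time gives a crude stand-in for round-trip latency to a given host. This is only a sketch: it measures connection setup time, not per-hop times, and the host you test against is your choice (any peer address listening on the default P2P port 8333 would do).

```python
import socket
import time


def tcp_connect_time(host, port, timeout=5.0):
    """Return the seconds taken to open a TCP connection to host:port."""
    start = time.perf_counter()
    # The connection is closed as soon as the with-block exits.
    with socket.create_connection((host, port), timeout=timeout):
        return time.perf_counter() - start
```

Run it several times per host and look at the spread; one sample can be thrown off by a single delayed packet.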