We've solved this issue at BitPenny (website) by providing an open-source client and setting the difficulty to 8. This keeps the number of submitted shares manageable while allowing users to get latency-free work locally as often as they wish, even with an array of fast GPUs.
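To get a rough feel for why a share difficulty of 8 keeps submissions manageable, here is a minimal sketch. It relies on the standard Bitcoin convention that a difficulty-1 share takes about 2^32 hashes on average. The 1 GH/s hash rate and the function name `shares_per_minute` are assumptions made up for illustration; they are not taken from the BitPenny client.

```python
# Average number of hashes needed to find one difficulty-1 share (Bitcoin convention).
HASHES_PER_DIFF1_SHARE = 2**32


def shares_per_minute(hashrate_hps: float, share_difficulty: float) -> float:
    """Expected shares submitted per minute at a given hash rate and share difficulty."""
    hashes_per_share = share_difficulty * HASHES_PER_DIFF1_SHARE
    return hashrate_hps * 60 / hashes_per_share


if __name__ == "__main__":
    rig_hashrate = 1e9  # hypothetical multi-GPU rig at 1 GH/s (assumption)
    for difficulty in (1, 8):
        rate = shares_per_minute(rig_hashrate, difficulty)
        print(f"difficulty {difficulty}: {rate:.2f} shares/minute")
```

Under these assumptions the rig would submit roughly 14 shares per minute at difficulty 1 and under 2 per minute at difficulty 8, an eightfold reduction in submissions for the same amount of work.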