[...] lose a share [...]
How are you supposed to 'lose' a share if it doesn't get rejected? Did it get stuck somewhere in the depth of the transistors?
Because you COULD have submitted 3x 128 diff shares already. If a 512 diff share should take 1m on average, a 128 diff share should take 15s. So if it takes you 63s to find a 512 diff share (note: it would have been stale/rejected, since the block changed in the meantime), you could have submitted 3x 128 diff shares that were accepted in that time.
And you would almost be guaranteed to get at least one, still better than zero.
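To put rough numbers on "almost guaranteed", here is a small sketch. It assumes share finding behaves like a Poisson process (each hash is an independent try) and that the block changes after a hypothetical 60s window. The function names and the window length are made up for illustration, not taken from any pool software.

```python
import math

def expected_share_time(base_time_s, base_diff, diff):
    """Average seconds per share; scales linearly with difficulty."""
    return base_time_s * diff / base_diff

def p_at_least_one(window_s, mean_share_time_s):
    """Chance of finding one or more shares within the window,
    assuming share finding is a Poisson process (an assumption)."""
    return 1.0 - math.exp(-window_s / mean_share_time_s)

# From the post: a 512 diff share takes 1m on average.
t512 = 60.0
t128 = expected_share_time(t512, 512, 128)

# Hypothetical: the block changes 60s after the work started.
window = 60.0
p512 = p_at_least_one(window, t512)
p128 = p_at_least_one(window, t128)

print(f"128 diff: one share every {t128:.0f}s on average")
print(f"chance of at least one 512 diff share in {window:.0f}s: {p512:.1%}")
print(f"chance of at least one 128 diff share in {window:.0f}s: {p128:.1%}")
```

Under those assumptions the 128 diff miner walks away empty-handed from a 60s window far less often than the 512 diff miner does.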
Here's something to remember:
A 2 GH/s miner at 512 diff = a 1 GH/s miner at 256 diff
in terms of the amount, or intensity, of this "wasting" effect.
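The reason those two setups match: the average time between shares is proportional to difficulty divided by hashrate, so doubling both leaves it unchanged. A quick sketch of that ratio, in relative units only (the function name is hypothetical, and no real hashes-per-share constant is assumed since that depends on the coin):

```python
def relative_share_interval(diff, hashrate):
    """Average time between shares, in arbitrary relative units.
    Proportional to difficulty / hashrate."""
    return diff / hashrate

fast_miner = relative_share_interval(512, 2.0)  # 2 units of hashrate at 512 diff
slow_miner = relative_share_interval(256, 1.0)  # 1 unit of hashrate at 256 diff

print(fast_miner, slow_miner)
print("same share interval:", fast_miner == slow_miner)

# Same diff, more hashrate: shorter gap between shares.
print(relative_share_interval(512, 4.0) < relative_share_interval(512, 2.0))
```

So at a fixed diff, every doubling of hashrate halves the gap between shares, which is the "smaller and smaller chunks" idea.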
That is why the faster miners have an edge. Their hashrate minimizes the effect. They lose smaller and smaller chunks of their work on block changes/stales/coin changes.