I'm not quite sure I understand the problem, but let me suggest something for the variation problem. I'm not a statistician, but 2 standard deviations from the mean seems like a good trigger. How about you take the last 30 samples and compute their mean and standard deviation? If the new sample is more than 2 standard deviations from the mean, force a retarget. In a normally distributed random process (as I remember it), about 95% of samples fall within 2 standard deviations of the mean. That means that while hashing power is steady, fewer than 5% of samples would trigger a retarget purely by chance, so false retargets from random noise alone should be rare.
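Here's a rough Python sketch of the check I have in mind. The names (`should_retarget`, `WINDOW`, `THRESHOLD`) are made up for illustration, and I'm assuming each sample is just a number such as a block interval in seconds. It's only meant to show the idea, not to be a finished retargeting rule.

```python
from statistics import mean, stdev

WINDOW = 30      # number of recent samples to look at (assumed, per the suggestion above)
THRESHOLD = 2.0  # how many standard deviations count as "too far"


def should_retarget(history, new_sample, window=WINDOW, threshold=THRESHOLD):
    """Return True if new_sample is more than `threshold` standard
    deviations away from the mean of the last `window` samples."""
    recent = history[-window:]
    if len(recent) < window:
        # Not enough data yet to judge variation; don't force anything.
        return False
    mu = mean(recent)
    sigma = stdev(recent)  # sample standard deviation
    if sigma == 0:
        # All recent samples identical: any different value is an outlier.
        return new_sample != mu
    return abs(new_sample - mu) > threshold * sigma
```

You would call `should_retarget(history, new_sample)` each time a new sample arrives, then append the sample to `history`. The 2-sigma figure assumes roughly normal samples; if the real distribution is skewed, the false-trigger rate will differ from 5%.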
^This!