And yes, that is more like a death spiral, too damn right! I'd have expected flashing x's all over plus an increase in HW errors by this point, but then again the new binaries support that --bitmain-hwerror option (or some such) that I have never gotten my head around! I think it's time to try the 0800 setting. I am convinced your rigs are hitting a heat problem, given how consistent the drop-off is, so reducing the voltage may help. It may need a power cycle; in fact I'd say do one, even though I run all my initial tests without one.
OK, I'm going to call it a death spiral now. 98 minutes into the run, the 15m hashrate at the pool is 375 and the miner's reported average has dropped to 495. All chips still show "o", temps are 41/41, and there are 5 HW errors in total. But I would also expect the symptoms you describe with an overheat, and I'm not seeing them: no increasing temps, no "x"s on the chips, HW errors still negligible. Fan speeds are dropping, now 1800-1900, down from 2200-2300, so the unit is getting cooler as it slows down. I saw the same behavior on all six units, so we are missing something here.
On the one test unit, I will now try 250/0750, with a hardware reset and then a software reset. Will see what happens! One thing I am wondering is whether we are triggering an internal chip overheat of some kind that throttles the chip back but doesn't report a bunch of failures. So we try to push them harder, and they actually go slower. But I'm just guessing at this point.
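For anyone who wants to log this rather than eyeball it, here is a rough Python sketch of the pattern described above: hashrate falling while temps, fan speed and HW errors all stay quiet. Everything in it is an assumption for illustration. The `Sample` fields, the `looks_like_silent_throttle` name and the thresholds are made up, and it does not read anything from the miner; you would have to feed it readings you noted down yourself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One manual reading from the miner status page (hypothetical layout)."""
    minutes: float    # minutes since the run started
    hashrate: float   # miner-reported average, in whatever unit the page shows
    temp_c: float     # hottest reported temperature
    fan_rpm: float    # fan speed
    hw_errors: int    # cumulative HW error count


def looks_like_silent_throttle(
    samples: List[Sample],
    drop_frac: float = 0.10,       # assumed: a 10% hashrate drop counts as a slowdown
    temp_rise_c: float = 2.0,      # assumed: less than 2 C rise counts as "flat"
    max_new_hw_errors: int = 10,   # assumed: this many new errors is "negligible"
) -> bool:
    """Return True if hashrate fell but none of the usual overheat signs showed up.

    Only the first and last samples are compared, so this is a coarse check.
    """
    if len(samples) < 2:
        return False
    first, last = samples[0], samples[-1]
    hashrate_dropped = last.hashrate <= first.hashrate * (1 - drop_frac)
    temp_flat = (last.temp_c - first.temp_c) < temp_rise_c
    fans_not_rising = last.fan_rpm <= first.fan_rpm
    few_errors = (last.hw_errors - first.hw_errors) <= max_new_hw_errors
    return hashrate_dropped and temp_flat and fans_not_rising and few_errors


if __name__ == "__main__":
    # Made-up round numbers purely to show the call, not measurements.
    run = [
        Sample(minutes=0, hashrate=100.0, temp_c=41, fan_rpm=2200, hw_errors=0),
        Sample(minutes=90, hashrate=80.0, temp_c=41, fan_rpm=1800, hw_errors=5),
    ]
    print(looks_like_silent_throttle(run))
```

If this comes back True on a run, it only says the slowdown does not match a classic overheat (rising temps, rising fans, piles of HW errors). It does not prove the chips are throttling internally; that part is still a guess.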