If diff > 3, then every block can be an anomaly. Therefore, to ensure every block is an anomaly, all an attacker needs to do is raise the difficulty above 3 while holding 51% of the hashing power (setting the difficulty as desired would be fairly easy, again by abusing KGW and nTime).
Aren't you glossing over the "all an attacker needs to do is raise the difficulty above 3 while holding 51% of the hashing power" part a bit? Exploiting this aspect of the flaw is out of reach for the vast majority of miners, and even then, not many attackers could use it simultaneously. In practice, wouldn't this just result in an arms race among attackers vying for the 51%, making the whole thing unprofitable?