The difficulty can be trivially lowered just by leaving more than 20 minutes between regtest blocks, no matter how high the network difficulty is. Another interesting thing: if the difficulty is bumped, for example from 0x207fffff to 0x1f00be2e, then guess what: even if you leave hours between blocks, it won't drop back to 0x207fffff. Do you know why? And why does it bounce back and forth, instead of constantly decreasing, even when the time between blocks is set to 15 minutes?
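For anyone who wants to compare those two values, here is a minimal sketch that decodes Bitcoin's compact "bits" encoding into a full target. The function name `compact_to_target` is made up for illustration, and the sketch ignores the sign bit and overflow checks that a real node performs.

```python
def compact_to_target(bits: int) -> int:
    """Decode a compact 'bits' value into an integer target.

    The top byte is a base-256 exponent and the low 23 bits are the mantissa.
    Simplification (assumption): the sign bit 0x00800000 is ignored.
    """
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


easy = compact_to_target(0x207FFFFF)  # the regtest minimum quoted above
hard = compact_to_target(0x1F00BE2E)  # the bumped value quoted above

print(hex(easy))
print(hex(hard))
# A smaller target means a harder block; this ratio is how much harder.
print(easy / hard)
```

A lower target is a higher difficulty, so the ratio printed at the end shows how far the second value sits from the minimum.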
It retargets at 288, even if it theoretically takes 1-2 months on your same settings after you hit block 288. (Stick with the punishment and see how it corrects.)
It should aggressively retarget, possibly even lower than your target.
It looks like your algo's doing what you told it to. Your deterministic variables are elapsed time; ± is overshoot or undershoot. You just way undershot it, and I think you got the book thrown at you.
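To make the overshoot/undershoot point concrete, here is a rough sketch of a Bitcoin-style retarget step. It assumes the algorithm in question follows Bitcoin's rule, which may not match the settings discussed here: the elapsed time is clamped to between a quarter and four times the expected timespan, and the result is capped at the minimum-difficulty target. All names (`next_target`, `pow_limit`, and so on) are illustrative.

```python
def next_target(old_target: int, actual_timespan: int,
                target_timespan: int, pow_limit: int) -> int:
    """One retarget step, assuming Bitcoin-style clamping (an assumption).

    actual_timespan is the elapsed time over the last retarget interval;
    target_timespan is what that interval was supposed to take.
    """
    # Clamp the elapsed time so one period moves the target by at most 4x.
    lo = target_timespan // 4
    hi = target_timespan * 4
    clamped = max(lo, min(actual_timespan, hi))

    # Blocks came too fast (undershoot) -> smaller target, harder blocks.
    # Blocks came too slow (overshoot)  -> larger target, easier blocks.
    new_target = old_target * clamped // target_timespan

    # The target can never be easier than the minimum-difficulty limit.
    return min(new_target, pow_limit)


# Toy numbers only, chosen to keep the arithmetic readable.
print(next_target(1000, 100, 100, 10_000))   # on schedule: unchanged
print(next_target(1000, 1, 100, 10_000))     # way undershot: clamped to /4
print(next_target(1000, 1000, 100, 10_000))  # way overshot: clamped to x4
```

Under these assumptions a single period can only tighten or loosen the target by a factor of four, so recovering from a big undershoot takes several retarget periods.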