I think this is getting more complicated than it needs to be. If we're looking at a weighted average, like I suggested, take the 6 newest blocks individually and then count the next 18 as one block. I don't see how additional weight on the latest blocks would allow for an exploit. Care to elaborate?
I do agree though that the changes need to happen faster, much like I originally pointed out here:
https://bitcointalk.org/index.php?topic=554412.msg8992812;topicseen#msg8992812. The POT chart there is a solid representation of what we should be trying to achieve.
-Fuse
We're basically on the same path: the more weight you give to the newer blocks, the quicker the difficulty reacts to a hash rate change.
Neither your idea nor mine is a big deal to implement; it's only a few lines of code in the loop that calculates the actual timespan and the average.
As I understand it, you weight the 6 newest blocks individually and treat the 18 older ones as a single averaged block.
My approach is to increase the weight in steps the closer you get to the newest block, giving 40% or 53% of the total weight to the most recent set of blocks.
Which one is better? We can speculate, but it would be best to see some numbers from test cases or network tests.
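Since both are just tweaks to that averaging loop, here is a minimal standalone sketch to make the two weightings concrete. Nothing in it is taken from the actual source; the function names, the 24-block window, and the 3/2/1 step weights are placeholders of my own, not the 40%/53% split mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// solveTimes holds the solve times (seconds) of the last 24 blocks, newest first.

// "6 + 18": the 6 newest blocks count individually, the 18 older ones are
// collapsed into one averaged sample, i.e. 7 equally weighted samples.
double WeightedAvgSixPlusEighteen(const std::vector<int64_t>& solveTimes)
{
    double sumNewest = 0.0;
    for (std::size_t i = 0; i < 6; ++i)
        sumNewest += solveTimes[i];

    double sumOlder = 0.0;
    for (std::size_t i = 6; i < 24; ++i)
        sumOlder += solveTimes[i];

    double olderAsOneBlock = sumOlder / 18.0;    // 18 old blocks -> 1 sample
    return (sumNewest + olderAsOneBlock) / 7.0;  // 6 new samples + 1 old sample
}

// Stepped weights: the weight rises in steps toward the newest block.
double WeightedAvgStepped(const std::vector<int64_t>& solveTimes)
{
    double weightedSum = 0.0, weightSum = 0.0;
    for (std::size_t i = 0; i < 24; ++i) {
        // newest 8 blocks: weight 3, middle 8: weight 2, oldest 8: weight 1
        double w = (i < 8) ? 3.0 : (i < 16) ? 2.0 : 1.0;
        weightedSum += w * solveTimes[i];
        weightSum   += w;
    }
    return weightedSum / weightSum;
}

int main()
{
    // 24 blocks at the 60 s target, except the two newest came in fast
    std::vector<int64_t> times(24, 60);
    times[0] = 20;
    times[1] = 25;
    std::printf("plain avg : %.1f s\n", 1365.0 / 24.0);                     // ~56.9
    std::printf("6 + 18    : %.1f s\n", WeightedAvgSixPlusEighteen(times)); // ~49.3
    std::printf("stepped   : %.1f s\n", WeightedAvgStepped(times));         // ~55.3
    return 0;
}
```

Read this way, the 6+18 scheme hands the newest 6 blocks 6/7 (about 86%) of the total weight, while the stepped example puts roughly half on the newest third of the window; which reaction curve behaves better is exactly what the test runs would show.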
The danger of putting too much weight on the latest block:
Let's take it to an extreme and use only the last 1 or 2 blocks. A pool can easily cherry-pick those blocks, leave for 2 blocks, come back... and repeat. The diff would jump up and down like crazy.
So it is important to fine-tune the settings and find a good balance between instant reaction and smooth diff changes.
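To put the cherry-picking scenario in numbers, here is a toy, deterministic simulation (my own construction, not any coin's actual retarget code): each block the difficulty is retargeted from the average of the last N solve times, solve time is modelled as difficulty / hashrate * 60 s, and a pool triples the hash rate whenever the difficulty is at or below 1.0.

```cpp
#include <cstdio>
#include <deque>
#include <numeric>

// Toy model: retarget every block with newDiff = oldDiff * 60 / avg(last N
// solve times); the history starts filled with on-target 60 s blocks.
static void Simulate(int window, const char* label)
{
    double diff = 1.0;
    std::deque<double> solveTimes(window, 60.0);      // on-target history
    std::printf("%s:", label);
    for (int block = 0; block < 12; ++block) {
        double hashrate = (diff <= 1.0) ? 3.0 : 1.0;  // pool hops on easy diff
        double solveTime = diff / hashrate * 60.0;
        solveTimes.push_back(solveTime);
        solveTimes.pop_front();                       // keep the last N blocks
        double avg = std::accumulate(solveTimes.begin(), solveTimes.end(), 0.0)
                     / solveTimes.size();
        diff *= 60.0 / avg;                           // per-block retarget
        std::printf(" %.2f", diff);
    }
    std::printf("\n");
}

int main()
{
    Simulate(1,  "window =  1 block ");   // flips between 3.00 and 1.00 every block
    Simulate(24, "window = 24 blocks");   // creeps up to ~1.14 and eases back
    return 0;
}
```

It is only a caricature (real solve times are random), but the pattern is the point: the narrower the window, or the heavier the newest block's weight, the more the diff whipsaws around the hopper instead of around the real hash rate.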
(I had a quick look at NiteGravityWell and, without digging too deep into it, it looks like a slightly modded KGW.)