Regarding the bad flips:
Maybe the team can implement a pre-validation flip check, so that bad flips don't land in the short session. The long session can remain the same.
For example, some humans could apply to become super-verifiers. They would review part of the already submitted flips (let's say 48h before validation starts) and tag them "good" or "bad", building a pool from which the flips for the short session are chosen.
Of course the super-verifiers will be rewarded extra for their effort, like master-nodes, if you wish.
Let's kill decentralization to boost the good-flip ratio from 98.5% to 99%. Is that what you are saying? How many "super-verifiers" do you need to solve the 5700 flips that are created each epoch? How much do you want the network to pay each "super-verifier" for solving ~120 flips (assuming 50 super-verifiers)? You know that it will be deducted from the total pool that is designated for all nodes? So most likely you will earn less with 95% score (with bad flips in network) than with 97% score (without bad flips in network). Will it increase network stability? No, it will decrease network safety, because each super-verifier will know ~120 flips that will be part of the short session. Will it decrease decentralization? Yes. Does it have any advantage for the network other than fewer people complaining about bad luck (probability ~1:50)? No.
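To make the numbers in the reply above easier to check, here is a rough sketch of the arithmetic. It only uses the figures quoted in the post (5700 flips per epoch, 50 super-verifiers, 98.5% vs 99% good flips). The function names and the `reviews_per_flip` parameter are made up for illustration and are not part of any real client or protocol.

```python
import math

FLIPS_PER_EPOCH = 5700   # figure quoted in the post
SUPER_VERIFIERS = 50     # figure quoted in the post


def flips_per_verifier(total_flips, verifiers, reviews_per_flip=1):
    """Flips each verifier has to solve if the work is split evenly.

    reviews_per_flip is a hypothetical knob: how many independent
    verifiers look at every flip.
    """
    return math.ceil(total_flips * reviews_per_flip / verifiers)


def bad_flips(total_flips, good_ratio):
    """Expected number of bad flips at a given good-flip ratio."""
    return round(total_flips * (1 - good_ratio), 1)


print(flips_per_verifier(FLIPS_PER_EPOCH, SUPER_VERIFIERS))  # 114
print(bad_flips(FLIPS_PER_EPOCH, 0.985))  # bad flips at 98.5% good
print(bad_flips(FLIPS_PER_EPOCH, 0.99))   # bad flips at 99% good
```

With one review per flip this gives 114 flips per super-verifier, which is in line with the ~120 mentioned above. If every flip had to be checked by more than one verifier, the load per person would grow proportionally.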
OK, so you are saying that only 1.5% are bad flips. I doubt that. Half of them are shit flips. I report more than half in the long session: flips with no relation to the words, flips with words or numbers on the images (even in Chinese, Japanese, or Korean), flips with the same image from start to end.
Maybe my "super-verifiers" idea was not a good one, at least for you, but we need to try something about it.
P.S. I don't care about network growth and decentralization if it is achieved by bots and/or idiots (the ones who make bad flips).