Did you try to optimize the code to speed things up during mining? I suspect it goes somewhat "out of sync" at times.
The ethproxy protocol needs the full hash sent to the pool, and this is calculated on the CPU. I see that other mining software has moved the CPU verification and submit code to its own thread. This might help a bit. TBMiner does not work well on low-difficulty pools because of this.
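To illustrate what I mean by a separate thread, here is a rough Python sketch. It is not TBMiner's code or any other miner's. The names `ShareSubmitter`, `verify_share` and the `submit` callback are made up for the example, and SHA-256 only stands in for the real Ethash full-hash calculation. The point is just that the mining loop hands a candidate nonce to a queue and carries on, while a worker thread does the CPU check and the submit.

```python
import hashlib
import queue
import threading


def verify_share(header: bytes, nonce: int, target: int) -> bool:
    """Placeholder CPU check: SHA-256 stands in for the real Ethash full hash."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") <= target


class ShareSubmitter:
    """Verifies and submits shares on a worker thread (hypothetical helper)."""

    def __init__(self, submit):
        self._submit = submit              # callback that sends the share to the pool
        self._queue = queue.Queue()        # thread-safe hand-off from the mining loop
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def enqueue(self, header: bytes, nonce: int, target: int) -> None:
        # Called from the mining loop; returns immediately.
        self._queue.put((header, nonce, target))

    def close(self) -> None:
        # Send a stop marker and wait until queued shares are processed.
        self._queue.put(None)
        self._thread.join()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:
                break
            header, nonce, target = item
            # The expensive check runs here, off the mining loop.
            if verify_share(header, nonce, target):
                self._submit(header, nonce)
```

On a low-difficulty pool the GPU finds candidate nonces very often, so with a design like this the checks pile up in the queue instead of stalling the mining loop.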
My experience, on Hive OS with 12x 3080 (non-LHR): the fastest core is 1, and the 495.46 (CUDA 11.5) drivers give less performance than the CUDA 11.4 ones. You should do detailed tests on Hive OS and improve stability; some rigs keep getting errors while generating the DAG file. Changing the kernel and pl settings does not make much difference. I am attaching a screenshot of the 5-hour test.
https://ibb.co/b1d27xF