The only workaround for the Pascal TLB bug is to run at a higher clock speed, but that means more power usage, as you discovered. Unfortunately I can't do this on my P100 16GB cards, since all the clock speeds are locked because it's a 'datacenter product' according to NGreedia.
You should be able to change the clock speed on a datacenter product with nvidia-smi. At least I have been able to in the past, unless they locked it in later drivers.
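For anyone who wants to try it, here is a rough sketch of the nvidia-smi calls I mean, wrapped in Python so the commands are easy to inspect before running. It assumes the driver still exposes application clocks on your card (`-ac` to set, `-rac` to reset). The helper names are made up for this post, and the 715/1328 MHz pair is only a placeholder: use a pair that actually appears in your own SUPPORTED_CLOCKS output.

```python
import shutil
import subprocess


def supported_clocks_cmd(gpu_index=0):
    """Build the command that lists the memory/graphics clock pairs a GPU accepts."""
    return ["nvidia-smi", "-i", str(gpu_index), "-q", "-d", "SUPPORTED_CLOCKS"]


def set_app_clocks_cmd(mem_mhz, graphics_mhz, gpu_index=0):
    """Build the command that sets application clocks as <memory,graphics> in MHz."""
    return ["nvidia-smi", "-i", str(gpu_index), "-ac", f"{mem_mhz},{graphics_mhz}"]


def reset_app_clocks_cmd(gpu_index=0):
    """Build the command that resets application clocks to the driver default."""
    return ["nvidia-smi", "-i", str(gpu_index), "-rac"]


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
    else:
        # Read-only query: shows which clock pairs this GPU reports as valid.
        subprocess.run(supported_clocks_cmd(0), check=False)
        # Placeholder values (assumption): only print the set command,
        # so nothing changes until you pick a supported pair yourself.
        print(" ".join(set_app_clocks_cmd(715, 1328)))
```

Setting clocks normally needs admin rights, and if the driver has locked them on your card the `-ac` call will simply report that the setting is not supported, which at least tells you where you stand.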