Well, it looks like I lost a hash board this morning. I got warnings from the pool, then checked the miner and saw that SM0 was registering a frequency of 0.
This event is visible from the mining status page:
E101 | Slot0 Chip Reset Error | Warning and retry | 3 | Fri Mar 29 10:59:42 2019 | cgminer
I've rebooted it and it seems to be in a mode where CGMiner keeps restarting. The elapsed time never goes above 1 second.
So now SM0 shows zero effective chips while SM1 and SM2 show 105 chips. But none of the boards are hashing.
Next, I power cycled it via the PDU (I'm not onsite). It's still in the same state: SM0 seems dead and all mining has stopped.
Any ideas what to do next?