Hi all:
Well, it has been one hell of a move, but I am finally back online and mining away. Well, almost. Reading the last few pages, I see I have had the same freezing and rebooting problems as others. It has been a long road to find the cause. I tried lower overclocks, but that really didn't help much. I finally found what was doing it: the temp control program trying to adjust the fan speeds. Somewhere in the mix it loses the connection with one or more cards, but that by itself doesn't trigger a reboot. Only when utilization falls below the set point does the watchdog print a message saying so and reboot the rig.
So by not running the temp control and using a set fan speed high enough to keep things cool, I am running now with no reboots at all.
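For anyone who wants to try the same thing by hand, here is a rough sketch of pinning all fans to one fixed speed with nvidia-settings. It is not how the temp control script does it. FIXED_FAN_SPEED, GPU_COUNT and DRY_RUN are made-up names for this example, and it assumes one fan per card with the fan index matching the GPU index, which may not hold on every rig. By default it only prints the commands.

```bash
#!/usr/bin/env bash
# Sketch only: set every GPU fan to one fixed speed instead of running temp control.
# FIXED_FAN_SPEED, GPU_COUNT and DRY_RUN are hypothetical names for this example.
# Assumes one fan per GPU and fan index == GPU index.

FIXED_FAN_SPEED=${FIXED_FAN_SPEED:-75}   # percent, pick what keeps your cards cool
GPU_COUNT=${GPU_COUNT:-6}                # number of cards in the rig
DRY_RUN=${DRY_RUN:-1}                    # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

set_fixed_fans() {
    gpu=0
    while [ "$gpu" -lt "$GPU_COUNT" ]; do
        # Take manual control of the fan, then set the target speed.
        run nvidia-settings \
            -a "[gpu:${gpu}]/GPUFanControlState=1" \
            -a "[fan:${gpu}]/GPUTargetFanSpeed=${FIXED_FAN_SPEED}"
        gpu=$((gpu + 1))
    done
}

set_fixed_fans
```

Run it once with DRY_RUN=1 to check the printed commands, then with DRY_RUN=0 to apply them.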
This all really started now that the hot weather is here. While it was cool, all was good.
Hope this helps others. Thay
I noticed a while back that when the fans go too high (over 90%), some of my rigs get unstable and reboot or freeze too.
So I tried to keep them cooler with a lower power limit, and also changed the max fan speed in temp control from 100 to 90.
From:
if [ $NEW_FAN_SPEED -gt 100 ]; then
    NEW_FAN_SPEED=100
fi
To:
if [ $NEW_FAN_SPEED -gt 90 ]; then
    NEW_FAN_SPEED=90
fi
In 1bash, set:
RESTORE_POWER_LIMIT=80
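If you don't want to hunt for the hard-coded number every time, the fan cap from the edit above could be pulled into one variable. This is just a sketch: MAX_FAN_SPEED and clamp_fan_speed are names I made up, they are not in the temp control script.

```bash
#!/usr/bin/env bash
# Sketch only: cap the requested fan speed at a single configurable maximum.
# MAX_FAN_SPEED and clamp_fan_speed are hypothetical names, not part of temp control.

MAX_FAN_SPEED=90   # percent; rigs here got unstable above this

clamp_fan_speed() {
    # $1: requested fan speed in percent
    if [ "$1" -gt "$MAX_FAN_SPEED" ]; then
        echo "$MAX_FAN_SPEED"
    else
        echo "$1"
    fi
}

# Example: a request for 97% gets cut back to the cap.
NEW_FAN_SPEED=$(clamp_fan_speed 97)
echo "$NEW_FAN_SPEED"
```

Speeds at or under the cap pass through unchanged, so only the top end is affected.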