My MSI Z270-A PRO with 7 GPUs ran fine for almost a year. Suddenly, yesterday I had ~50 restarts, so I tried to figure out what is going on.
I didn't find any suspicious change notes, so I started reviewing my hardware.
First: it mines well for about 4-10 minutes, then it shows the error "gpu fault detected 147". After this error all GPUs keep mining for about 10 seconds, then the rig instantly reboots with the message "sysrq resetting", which isn't any of Claymore's watchdogs.
What I have tried already: reduced OC, increased voltage, a different PSU, brand new risers, and checked all cables for burn marks or fast heating (all were fine, healthy and cold). Nothing changed.
Guys, what has happened? Do you have any suggestions? My script, which worked fine for some time, looks like this:
-wd 1 -r 1 -epool eu1.ethermine.org:4444 -ewal wallet.$rigName -esm 0 -epsw x -allpools 1 -dcri 14
Strange, I have had the exact same issue since Tuesday. A rig that had been working perfectly for months suddenly reboots every few minutes with this kind of error. I have removed one of the cards, and now it's more stable. So maybe it is indeed a riser issue. No time to check further for now.
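If pulling cards one by one is too slow, a rough way to see which card is actually throwing the fault is to count the fault lines per PCI address in a saved kernel log. This is only a sketch: it assumes the amdgpu driver on Linux and log lines shaped like "amdgpu 0000:03:00.0: GPU fault detected: 147 0x...", and the function name count_faults is made up for this example. Your log format may differ.

```python
import re
from collections import Counter

# Assumed line shape (not confirmed for every driver version):
#   "... amdgpu 0000:03:00.0: GPU fault detected: 147 0x0a080801"
# Group 1 is the PCI address of the card, group 2 is the fault number.
FAULT_RE = re.compile(
    r"([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]).*?gpu fault detected:?\s*(\d+)",
    re.IGNORECASE,
)

def count_faults(lines):
    """Count 'GPU fault detected' lines per PCI address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts
```

For example, count_faults(open("kern.log")) on a saved log gives a counter you can sort with .most_common(). If one address dominates, that card (or its riser) is the first one I would swap. If the faults are spread across all cards, it points more towards power, the driver or the motherboard.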