Post
Topic
Board Mining (Altcoins)
Re: [How To] Puwaha's Poor Man's Networked PDU: Using Smart Plugs and Awesome Miner
by
akhouston
on 04/02/2018, 19:24:29 UTC
Nice and detailed write up, thanks for that.

I did something similar with SmartThings and smart plugs. Basically, when AM notices any "service degradation", i.e. the service is offline, a device crapped out (the number of devices is lower than expected), or a "device is sick" (I have no clue what that means; I've never seen it happen to a GPU), it triggers an action in webCore, which starts a 1-minute countdown timer. I get a notification on my phone with options to power cycle the rig right away or cancel. If no response is received within a minute, the rig is power cycled (if you are doing that to a GPU rig, you need to ensure that "recover after power loss" is turned on in the BIOS).
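For anyone who wants the countdown logic spelled out, here is a minimal Python sketch of the "notify, wait a minute, then power cycle unless cancelled" decision. It is not webCore code; the function name `decide_power_cycle`, the `"now"`/`"cancel"` strings and the use of a queue to carry the phone response are all assumptions made for illustration.

```python
import queue


def decide_power_cycle(responses: "queue.Queue[str]", timeout_seconds: float = 60.0) -> bool:
    """Return True if the rig should be power cycled.

    `responses` is a hypothetical channel carrying the phone reply:
    "now" (power cycle right away) or "cancel". If nothing arrives
    before the timeout, the rig is power cycled anyway.
    """
    try:
        choice = responses.get(timeout=timeout_seconds)
    except queue.Empty:
        return True  # no response within the countdown: power cycle
    return choice != "cancel"  # "now" proceeds, "cancel" aborts
```

The important design choice is the default: silence means "go ahead", so the rig still recovers when nobody is looking at the phone.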

One thing that didn't work for me is waits on the AM side: it kept sending reboot requests, so I basically created a latching switch that allows a single reboot within a 5-minute interval, which leaves plenty of time for a rig to recover.
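The latching switch idea can be sketched in a few lines of Python. This is only an illustration of the logic, not what runs in SmartThings; the class name `RebootLatch`, the per-rig dictionary and the injectable clock are assumptions.

```python
import time


class RebootLatch:
    """Allow at most one reboot per rig within a lockout window."""

    def __init__(self, lockout_seconds: float = 300.0, clock=time.monotonic):
        self.lockout_seconds = lockout_seconds
        self._clock = clock          # injectable so the latch can be tested
        self._last_reboot = {}       # rig name -> time of last accepted reboot

    def try_reboot(self, rig: str) -> bool:
        """Return True if the reboot request is accepted, False if it is ignored."""
        now = self._clock()
        last = self._last_reboot.get(rig)
        if last is not None and now - last < self.lockout_seconds:
            return False             # still latched: drop the repeated request
        self._last_reboot[rig] = now
        return True
```

With this in front of the smart plug, it doesn't matter how many times AM calls the URL: only the first request in each 5-minute window actually cycles the power.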

That's why I added the 5-6 minutes of waits in the action script from AM... that gives the rig enough time to boot up and start responding to pings.  I've actually added another trigger condition in my AM script, as my cheapo Celeron processor sometimes gets overworked.  So I added a Detect Offline trigger of 60 seconds... using the Remote Agent offline tickbox.  I made both triggers a "Match All" which is an AND operator.  This way if the rig is too busy to respond to pings for 9 seconds (the max that the Ping AM trigger allows) it won't reboot the rig unless the rig is detected as "offline" for 60 seconds as well.
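Expressed as plain logic, the "Match All" combination above looks roughly like this. The function and parameter names are hypothetical; AM evaluates its triggers internally and this is only a sketch of the AND behaviour described.

```python
def should_reboot(ping_failed: bool, offline_seconds: float,
                  offline_threshold: float = 60.0) -> bool:
    # "Match All" is an AND: the ping trigger alone is not enough,
    # the agent must also have been offline for the full threshold.
    agent_offline = offline_seconds >= offline_threshold
    return ping_failed and agent_offline
```

So a rig that misses pings for a few seconds because the CPU is busy, but whose agent has been offline for less than 60 seconds, is left alone.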


Right, but what I'm saying is that I initially thought of adding the same waits (I was surprised we were limited to a 99-second wait), but it didn't work for me for some reason: it just kept calling the URL almost continuously. From the AM examples, I understand that the same "wait" can be achieved by using a timer and setting "match all conditions", e.g. "detect offline + timer every 5 minutes". The downside is that in the worst case you'll detect the downtime 5 minutes after it occurred, BUT on the positive side the action will not fire again for another 5 minutes, effectively doing the same thing as those wait commands.
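To put numbers on that trade-off, here is a small Python sketch. It assumes the timer fires at fixed multiples of the interval and that the rig is still offline when it fires; how AM actually schedules its timer is an assumption here, and `next_action_time` is a made-up helper.

```python
import math


def next_action_time(failure_time: float, interval: float = 300.0) -> float:
    """First timer tick at or after the failure, with ticks at 0, interval, 2*interval, ..."""
    return math.ceil(failure_time / interval) * interval


def detection_delay(failure_time: float, interval: float = 300.0) -> float:
    """How long the rig stays down before the timer-gated action fires."""
    return next_action_time(failure_time, interval) - failure_time
```

A rig that dies 1 second after a tick waits 299 seconds for the action, while one that dies right on a tick is handled immediately. Either way, the next action cannot fire until a full interval later.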