@papampi:
I found a solution for starting watchdog from rc.local. Just call it like this:
(sleep 90 && systemctl restart watchdog.service)&
This sleeps for a while before restarting watchdog, but it does so in the background, which allows rc.local to terminate first.
I was looking to implement the iTCO_wdt a while back, but I was too preoccupied with other things...
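To make the pattern a bit more explicit, here is a minimal sketch of the same (...)& idea wrapped in a function. The name run_later is made up for illustration and is not part of any package; the 90-second delay is just the value from the line above.

```shell
#!/bin/sh
# run_later DELAY COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND after DELAY seconds without blocking
# the calling script, the same way the (sleep 90 && ...)& one-liner does.
run_later() {
    delay="$1"
    shift
    # The subshell goes to the background, so the caller returns at once.
    # Output is discarded so the job does not hold the caller's stdout open.
    ( sleep "$delay" && "$@" ) >/dev/null 2>&1 &
}

# In rc.local this could be called as:
#   run_later 90 systemctl restart watchdog.service
```

Because the function returns immediately, rc.local can reach its final exit 0 while the delayed restart is still waiting.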
Check this for testing iTCO:
http://www.madore.org/~david/linux/iTCO-wdt-test.html
@Leenoox
I've seen that, and it works when started manually, but it does not start automatically.
@bytiges
Your 90-second sleep does not work for me. I think my problem is with the restart command itself: whatever I do, restart fails.
I added three modprobe lines to rc.local, and now the watchdog device shows up in /dev at startup:
modprobe i2c-i801
modprobe i2c-smbus
modprobe iTCO_wdt
I set watchdog.conf to ping my router.
Now if I start watchdog manually and unplug the network cable, watchdog reboots the rig after the configured time (one step ahead).
I tried adding a sleep between stop and start in the restart command in /etc/init.d/watchdog, but it still fails to start.
For now, the only thing that works for me is to stop it and then start it again.
---------------------------------------------------------------------------------
I'm sorry it did not work for you. I'll try to give you my rationale in the hope that it helps you find the issue.
So, one thing I noticed is that I cannot "start" or "restart" the watchdog right after a reboot. If I log in over ssh and issue the commands, they hang. However, if I wait long enough after the reboot, I can (re)start it and it works normally. Bear in mind I'm only testing on a rig with a single GTX1070 in it. I noticed that the rigs with 13 GTX1060 that I have in production can take much longer to init and start mining. I have not tested watchdog there yet.
I also noticed that rc.local must have finished before watchdog is able to start. I think this is because /etc/init.d/watchdog has a dependency on $all. However, I changed my dependencies to only $local_fs and $network, so perhaps that is the difference. Here's my /etc/init.d/watchdog header:
#!/bin/sh
#/etc/init.d/watchdog: start watchdog daemon.
### BEGIN INIT INFO
# Provides: watchdog
# Short-Description: Start software watchdog daemon
# Required-Start: $local_fs $network
# Required-Stop: $all
# Should-Start:
# Should-Stop:
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
### END INIT INFO
....
So my fix was to edit rc.local to add a fixed delay followed by a restart of the watchdog, and to send both to the background with the (...)& construct. This way the rc.local script has a chance to terminate. Perhaps you need to increase the delay (just check how long after a reboot you are able to start the watchdog manually).
Regarding the insertion of the modules, you don't need to do it by hand. You only need the single "iTCO_wdt" module in /etc/default/watchdog. It will be inserted with modprobe, which pulls in all dependent modules. Here's my /etc/default/watchdog file:
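If guessing a single fixed delay turns out to be fragile, one alternative is to retry a few times. The sketch below is only that: retry_cmd is a made-up helper, the attempt count and delays are placeholders, and it assumes the restart returns with an error instead of hanging (if it hangs on your rig, the retry never gets a chance to run).

```shell
#!/bin/sh
# retry_cmd ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between failed attempts.
retry_cmd() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while :; do
        if "$@"; then
            return 0            # command succeeded
        fi
        if [ "$n" -ge "$attempts" ]; then
            return 1            # out of attempts
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Possible rc.local usage (values are guesses, tune them for your rig):
#   ( sleep 30 && retry_cmd 10 30 systemctl restart watchdog.service ) &
```

The whole thing still goes in the background with (...)& so that rc.local can finish first.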
# Start watchdog at boot time? 0 or 1
run_watchdog=1
# Start wd_keepalive after stopping watchdog? 0 or 1
run_wd_keepalive=0
# Load module before starting watchdog
watchdog_module="iTCO_wdt"
# Specify additional watchdog options here (see manpage).
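To check that the module load actually produced a device node, something like the following can help. It is a rough sketch: watchdog_ready is a made-up name, and /dev/watchdog is assumed to be the device node on your system.

```shell
#!/bin/sh
# watchdog_ready [DEVICE]
# Hypothetical helper: succeeds if DEVICE exists and is a character
# device. DEVICE defaults to /dev/watchdog (assumed path).
watchdog_ready() {
    dev="${1:-/dev/watchdog}"
    [ -c "$dev" ]
}

# Example check after boot:
#   lsmod | grep iTCO_wdt
#   watchdog_ready && echo "watchdog device present"
```

If the check fails after boot, the module was probably not loaded, and there is no point in debugging the restart until that is sorted out.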
And for reference, here's the fixed watchdog.service:
$ cat /lib/systemd/system/watchdog.service
[Unit]
Description=watchdog daemon
Conflicts=wd_keepalive.service
After=multi-user.target
OnFailure=wd_keepalive.service
[Service]
Type=forking
EnvironmentFile=/etc/default/watchdog
ExecStartPre=/bin/sh -c '[ -z "${watchdog_module}" ] || [ "${watchdog_module}" = "none" ] || /sbin/modprobe $watchdog_module'
ExecStart=/bin/sh -c '[ $run_watchdog != 1 ] || exec /usr/sbin/watchdog $watchdog_options'
ExecStopPost=/bin/sh -c '[ $run_wd_keepalive != 1 ] || false'
[Install]
I also set the nowayout option for the watchdog module, so that there's no way to stop the watchdog once activated.
m1@m1-desktop:~$ cat /etc/modprobe.d/nowayout.conf
options iTCO_wdt nowayout=1
Hope that helps.