I have a rig that is crashing every 3 to 4 hours and I cannot pinpoint the reason why.
The crash somehow affects the Ethernet interface (or maybe the TCP/IP stack): the rig becomes unreachable over SSH and doesn't respond to pings.
It basically needs to be reset manually. This is quite annoying as I live several hours away from where I have the rigs.
Is there a way to investigate the crash and find a solution?
We are working on adding an option to the watchdog that resets the network manager, or reboots the rig, if the rig can't reach the router.
Hopefully it will be added to v0019-2.0 soon.
v0019-2.0 is almost done, just some last edits to watchdog, then we will announce it.
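
In the meantime, here is a rough sketch of what such a check could look like. It is not the actual watchdog code, just an illustration of the idea: ping the router, restart the network manager after a few missed pings, and reboot if that doesn't help. The function names, the thresholds (3 and 6 misses), the 30 second interval and the `NetworkManager` service name are all assumptions, so adjust them for your setup. It assumes a Linux rig with systemd and needs to run as root.

```python
import subprocess
import time


def next_action(failures, restart_after=3, reboot_after=6):
    """Decide what to do after `failures` consecutive missed pings.

    Thresholds are illustrative assumptions, not values from the real watchdog.
    """
    if failures >= reboot_after:
        return "reboot"
    if failures == restart_after:
        return "restart_network"
    return "none"


def router_reachable(router_ip):
    """Send a single ping (Linux iputils syntax) and report whether it was answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", router_ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(router_ip, interval_s=30):
    """Loop forever, escalating from a network restart to a full reboot."""
    failures = 0
    while True:
        # Any answered ping resets the counter.
        failures = 0 if router_reachable(router_ip) else failures + 1
        action = next_action(failures)
        if action == "restart_network":
            # Service name is an assumption; some distros call it differently.
            subprocess.run(["systemctl", "restart", "NetworkManager"])
        elif action == "reboot":
            subprocess.run(["systemctl", "reboot"])
        time.sleep(interval_s)
```

To try it, call `watch("192.168.1.1")` with your own router address (the one here is only a placeholder). Note this only helps if the rig itself is still alive and just the network is down; if the whole system is frozen, a script like this cannot run either.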