I'm suggesting 2 killer features that will make this incredible miner complete:
1. Include the uptime of a crashed GPU in the watchdog/logfile output (e.g. GPU 4 crashed, uptime: 5 hours, 12 minutes)
2. A simple fan curve in addition to the temperature watchdog (e.g. --temp_max=83,100 -> max temp of 83, at which the fan must be at 100%, and --temp_ramp=75,70 -> when the temp reaches 75, spin the fan to 70% and increase linearly until the defined max temp is reached)
The 18.12 Adrenalin drivers have a fan curve. But since you are on Linux, that won't help you.

Yeah, on Linux a fan control mechanism that interpolates between defined points on a fan/temp curve could be a nice addition. It's not too much work either: it's simply scaling the fan pct and writing it through sysfs. For Windows it's getting messier now; I haven't looked into how ADL interacts with the 18.12.2 fan curve at all yet.
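To make the interpolation idea concrete, here is a minimal Python sketch of the two-point curve from the request (75C -> 70%, 83C -> 100%). It is not the miner's code. The function names, the idle_pct floor used below the ramp point, and the 0-255 PWM range are all assumptions for illustration; on amdgpu the hwmon pwm1 file typically uses that range, but check pwm1_min/pwm1_max on your own rig.

```python
def fan_pct(temp_c, ramp=(75, 70), max_point=(83, 100), idle_pct=30):
    """Map a GPU temperature (deg C) to a fan percentage.

    ramp and max_point are (temperature, fan pct) pairs mirroring the
    proposed --temp_ramp=75,70 and --temp_max=83,100 examples.
    idle_pct is a hypothetical floor used below the ramp temperature;
    the original request does not say what should happen there.
    """
    ramp_temp, ramp_pct = ramp
    max_temp, max_pct = max_point
    if temp_c < ramp_temp:
        return idle_pct
    if temp_c >= max_temp:
        return max_pct
    # Linear interpolation between the two defined points.
    frac = (temp_c - ramp_temp) / (max_temp - ramp_temp)
    return round(ramp_pct + frac * (max_pct - ramp_pct))


def pct_to_pwm(pct, pwm_min=0, pwm_max=255):
    """Scale a fan percentage to a raw PWM value (assumed 0-255 range)."""
    return round(pwm_min + (pwm_max - pwm_min) * pct / 100)


if __name__ == "__main__":
    for t in (60, 75, 79, 83, 90):
        print(t, fan_pct(t), pct_to_pwm(fan_pct(t)))
```

A real controller would read the temperature, compute the value as above and write it to the card's PWM control file in a loop. More curve points would just mean interpolating between whichever two neighbours bracket the current temperature.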
For the first question, the uptime is the same for all GPUs, so it's logged every time the hashrates are logged (every 30 secs, or whatever you set the log interval to). We can add it to the watchdog's log output when a GPU is detected dead though; it's trivial really. Personally, I don't see the big value add though.
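For illustration only, the watchdog line could be built roughly like this in Python. The function names are hypothetical and the message text just copies the example from the request; it is not the miner's actual log format.

```python
def format_uptime(seconds):
    """Render an uptime in seconds as 'H hours, M minutes'."""
    hours, rem = divmod(int(seconds), 3600)
    minutes = rem // 60
    return f"{hours} hours, {minutes} minutes"


def dead_gpu_message(gpu_index, uptime_seconds):
    """Build the log line emitted when a GPU is detected dead."""
    return f"GPU {gpu_index} crashed, uptime: {format_uptime(uptime_seconds)}"


if __name__ == "__main__":
    print(dead_gpu_message(4, 5 * 3600 + 12 * 60))
```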
