This line means there is a problem with the BIOS (ROM) on one of the GPUs:
WARNING: infoROM is corrupted at gpu 0000:07:00.0
I would return this GPU or RMA it.
You could try reflashing its ROM with NVFlash, but if that doesn't work it will most likely void your warranty, so if the GPUs are new I would return or RMA the card instead.
For fan speed, try setting this in oneBash:
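Before pulling a card, it may help to confirm which one sits at the address in the warning. A rough sketch, assuming nvidia-smi still lists the affected GPU on your rig: `gpu_at` is a made-up helper name that filters CSV rows of the form "index, pci.bus_id, name" by the tail of the bus id, so it matches whether the driver prints a 4-digit or an 8-digit PCI domain.

```shell
# Print the CSV row(s) whose PCI bus id ends with the given suffix.
# Expects rows like "index, pci.bus_id, name" on stdin.
# gpu_at is a hypothetical helper name, not part of nvidia-smi or nvOC.
gpu_at() {
    # Lower-case the suffix so 0A and 0a both match.
    want=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    awk -F', *' -v want="$want" '
        {
            id = tolower($2)
            # Compare only the tail of the bus id against the wanted suffix.
            if (length(id) >= length(want) &&
                substr(id, length(id) - length(want) + 1) == want) print
        }'
}
```

Something like `nvidia-smi --query-gpu=index,pci.bus_id,name --format=csv,noheader | gpu_at 07:00.0` should then show the index and model of the card that reported the corrupted infoROM.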
SLOW_USB_KEY_MODE="YES"
Let me know if that works.
Also what kind of USB / SSD are you using?
Heya, thanks for the reply.
I'm about to return the GPU; it's brand new, bought a couple of days ago. I'm not going to reflash it, so as not to void the warranty. Thanks for the tip.
About the fan speed:
m1@rig1:~$ export DISPLAY=
m1@rig1:~$ echo $DISPLAY
m1@rig1:~$ nvidia-settings -a [fan:0]/GPUTargetFanSpeed=75
Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Unable to init server: Could not connect: Connection refused
ERROR: The control display is undefined; please run `nvidia-settings --help` for usage information.
m1@rig1:~$ cat Desktop/oneBash | grep 'SLOW_USB_KEY_MODE='
SLOW_USB_KEY_MODE="YES" # YES NO
m1@rig1:~$ export DISPLAY=:0.0
m1@rig1:~$ xrandr
xrandr: Failed to get size of gamma for output default
Screen 0: minimum 1024 x 768, current 1024 x 768, maximum 1024 x 768
default connected 1024x768+0+0 0mm x 0mm
1024x768 0.00*
m1@rig1:~$ echo $DISPLAY
:0.0
m1@rig1:~$ nvidia-settings -a [fan:0]/GPUTargetFanSpeed=75
** (nvidia-settings:5815): WARNING **: Couldn't register with accessibility bus: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
ERROR: Error querying enabled displays on GPU 0 (Missing Extension).
ERROR: Error querying connected displays on GPU 0 (Missing Extension).
ERROR: Error resolving target specification 'fan:0' (No targets match target specification), specified in assignment '[fan:0]/GPUTargetFanSpeed=75'.
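A guess from the output above: the "Missing Extension" errors, together with xrandr reporting only a single `default` output, suggest the X server on :0 may not be running the nvidia driver, in which case nvidia-settings has no NV-CONTROL extension to talk to. A minimal sketch for checking that, assuming `xdpyinfo` is installed; `has_nv_control` is just an illustrative helper name.

```shell
# Succeeds (exit 0) if the xdpyinfo output on stdin lists the NV-CONTROL extension.
# has_nv_control is a hypothetical helper, not an existing tool.
has_nv_control() {
    grep -q 'NV-CONTROL'
}

# Possible usage against the display from the session above:
#   if xdpyinfo -display :0 | has_nv_control; then
#       echo "NV-CONTROL present"
#   else
#       echo "NV-CONTROL missing"
#   fi
```

If the extension is missing, the X server log is probably the next place to look for why the nvidia driver did not load.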
xorg.conf:
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 378.13 (buildmeister@swio-display-x86-rhel47-05) Tue Feb 7 19:37:00 PST 2017
Section "ServerLayout"
Identifier "layout"
Screen 0 "nvidia" 0 0
Inactive "intel"
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
EndSection
Section "InputDevice"
# generated from default
Identifier "Keyboard0"
Driver "keyboard"
EndSection
Section "InputDevice"
# generated from default
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "auto"
Option "Device" "/dev/psaux"
Option "Emulate3Buttons" "no"
Option "ZAxisMapping" "4 5"
EndSection
Section "Monitor"
Identifier "Monitor0"
VendorName "Unknown"
ModelName "Unknown"
HorizSync 28.0 - 33.0
VertRefresh 43.0 - 72.0
Option "DPMS"
EndSection
Section "Device"
Identifier "intel"
Driver "modesetting"
Option "AccelMethod" "None"
BusID "PCI:0@0:2:0"
EndSection
Section "Device"
Identifier "nvidia"
Driver "nvidia"
BusID "PCI:1@0:0:0"
EndSection
Section "Device"
Identifier "nvidia"
Driver "nvidia"
Option "ConstrainCursor" "off"
BusID "PCI:4@0:0:0"
EndSection
Section "Device"
Identifier "nvidia"
Driver "nvidia"
Option "ConstrainCursor" "off"
BusID "PCI:7@0:0:0"
EndSection
Section "Device"
Identifier "nvidia"
Driver "nvidia"
Option "ConstrainCursor" "off"
BusID "PCI:8@0:0:0"
EndSection
Section "Device"
Identifier "nvidia"
Driver "nvidia"
Option "ConstrainCursor" "off"
BusID "PCI:10@0:0:0"
EndSection
Section "Screen"
Identifier "intel"
Device "intel"
Monitor "Monitor0"
EndSection
Section "Screen"
Identifier "nvidia"
Device "nvidia"
Monitor "Monitor0"
DefaultDepth 24
Option "AllowEmptyInitialConfiguration" "on"
Option "IgnoreDisplayDevices" "CRT"
Option "ConstrainCursor" "off"
Option "Coolbits" "24"
SubSection "Display"
Depth 24
Modes "nvidia-auto-select"
EndSubSection
EndSection
Section "Screen"
Identifier "nvidia"
Device "nvidia"
Monitor "Monitor0"
Option "AllowEmptyInitialConfiguration" "on"
Option "IgnoreDisplayDevices" "CRT"
EndSection
Section "Screen"
Identifier "nvidia"
Device "nvidia"
Monitor "Monitor0"
Option "AllowEmptyInitialConfiguration" "on"
Option "IgnoreDisplayDevices" "CRT"
EndSection
Section "Screen"
Identifier "nvidia"
Device "nvidia"
Monitor "Monitor0"
Option "AllowEmptyInitialConfiguration" "on"
Option "IgnoreDisplayDevices" "CRT"
EndSection
Section "Screen"
Identifier "nvidia"
Device "nvidia"
Monitor "Monitor0"
Option "AllowEmptyInitialConfiguration" "on"
Option "IgnoreDisplayDevices" "CRT"
EndSection
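One thing that stands out in the config above: every NVIDIA Device and Screen section uses the same Identifier, "nvidia". The xorg.conf man page describes Identifier as a unique name, so it may be ambiguous which section `Screen 0 "nvidia"` in ServerLayout refers to. A small sketch for spotting repeats; `dup_identifiers` is a made-up name, and it only does simple line matching rather than real xorg.conf parsing.

```shell
# List "SectionType Identifier" pairs that occur more than once in an
# xorg.conf read from stdin. dup_identifiers is a hypothetical helper.
dup_identifiers() {
    awk '
        $1 == "Section" {
            section = $2
            gsub(/"/, "", section)        # strip the surrounding quotes
        }
        $1 == "Identifier" {
            name = $2
            gsub(/"/, "", name)
            key = section " " name
            count[key]++
            if (count[key] == 2) print key   # report each duplicate once
        }'
}
```

Run as `dup_identifiers < /etc/X11/xorg.conf` (the usual location, adjust if yours differs). For the config pasted above it should print `Device nvidia` and `Screen nvidia`.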
SanDisk 120 GB SSD; I used dd to write the img to disk. The rigs are only accessible via SSH: no TV, no RDP (maybe VGA/HDMI if required).
When you have a few rigs, it's easier to identify them by hostname like this than by IP (at least in my case).
# hostname rig1
# echo "rig1" > /etc/hostname
# sed -i 's/m1-desktop/rig1/g' /etc/hosts
Then in oneBash:
XXX_WORKER="$HOSTNAME"
Thanks for the help, I'll keep trying to fix the fan speed issue.
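On the fan speed side, one more thing that may be worth checking is the Coolbits value. As far as the NVIDIA driver README describes it, Coolbits is a bitmask where 4 enables manual fan control, 8 enables clock offsets and 16 enables overvoltage. The pasted xorg.conf has "24" (8 + 16), and only on the first Screen section, so the fan bit would not be set there. A quick sketch of the bit test; `coolbits_allows_fan` is an illustrative name only.

```shell
# Succeeds if the given Coolbits value has the manual fan control bit (value 4) set.
# coolbits_allows_fan is a hypothetical helper for illustration.
coolbits_allows_fan() {
    [ $(( $1 & 4 )) -ne 0 ]
}

# Compare the value from the pasted config (24) with one that adds the fan bit (28).
for value in 24 28; do
    if coolbits_allows_fan "$value"; then
        echo "Coolbits $value: fan control bit set"
    else
        echo "Coolbits $value: fan control bit NOT set"
    fi
done
```

Even with the fan bit enabled, nvidia-settings normally also needs manual control switched on per GPU before a target speed is accepted, for example `nvidia-settings -a [gpu:0]/GPUFanControlState=1 -a [fan:0]/GPUTargetFanSpeed=75`. Whether oneBash rewrites xorg.conf with its own Coolbits value is something I have not verified.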