Re: [OS] nvOC easy-to-use Linux Nvidia Mining v0019
Board: Mining (Altcoins)
by Doftorul on 11/09/2017, 21:50:32 UTC
Hi guys, can someone give me some hints or explain what is wrong with my rigs? Only gpu0 accepts the fan speed that gets set. Below is the error I see every time the fan of any GPU other than gpu0 is set:

ERROR: Error assigning value 50 to attribute
       'GPUTargetFanSpeed' (m1-desktop:0[fan:1])
       as specified in assignment
       '[fan:1]/GPUTargetFanSpeed=50' (Unknown
       Error).

I am currently running rigs with 3 x 1080 Ti and 3 x 1070s, with the first release of v0019 installed on an SSD.
I tried running these manually:

nvidia-xconfig --enable-all-gpus
nvidia-xconfig --cool-bits=4
nvidia-settings -a [gpu:1]/GPUFanControlState=1
nvidia-settings -a [fan:1]/GPUTargetFanSpeed=50

but the error is the same.
Other than being unable to set the fan speeds, the rigs are chugging along. However, the fans on the cards other than gpu0 run at the factory default speed, because the minimum fan speed set in 1bash is not applied, with or without the automated fan control feature.

Edit: in the NVIDIA X Server Settings I can see the Enable fan control option for gpu0, but it is missing for the other GPUs.

Does the error also happen if you set the speed to higher values like 60-65?
I think I had the same problem at low values.

Thank you for your answer!

Yes, the problem persists even if I set higher fan values, or 100%...

In the Thermal Settings of the NVIDIA X Server Settings utility only gpu0 has the fan control option enabled; the other GPUs are missing this feature.
Then maybe something is wrong with the image/OS.
Sometimes starting from scratch is easier than trying to solve the problem.

P.S. Is GPU PowerMizer Mode set in 1bash?

Code:
GPUPowerMizerMode_Adjust="YES"

The PowerMizer setting makes no difference.
However, I think there is something odd here. When I manually run lspci I get:
m1@m1-desktop:~$ lspci |grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b81 (rev a1)
20:00.0 VGA compatible controller: NVIDIA Corporation Device 1b81 (rev a1)
30:00.0 VGA compatible controller: NVIDIA Corporation Device 1b81 (rev a1)

Then, in the xorg.conf generated in an attempt to debug the issue, I have:

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 1070"
    BusID          "PCI:1:0:0"
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 1070"
    BusID          "PCI:32:0:0"
EndSection

Section "Device"
    Identifier     "Device2"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 1070"
    BusID          "PCI:48:0:0"
EndSection


Shouldn't there be some sort of relation between what the lspci command lists and the device entries in xorg.conf?
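
For what it's worth, the two listings do line up: lspci prints the bus number in hexadecimal, while the BusID in xorg.conf is decimal, so 20:00.0 corresponds to PCI:32:0:0 and 30:00.0 to PCI:48:0:0. Below is a minimal sketch of that conversion. It assumes lspci addresses without a PCI domain prefix, and the function name lspci_to_busid is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert an lspci-style address (hex "bus:device.function")
# into the decimal "PCI:bus:device:function" form used by BusID in xorg.conf.
lspci_to_busid() {
    local addr="$1"
    local bus="${addr%%:*}"    # text before the first ':'
    local rest="${addr#*:}"    # remaining "device.function"
    local dev="${rest%%.*}"    # text before the '.'
    local fn="${rest#*.}"      # text after the '.'
    # printf %d accepts 0x-prefixed input, which does the hex-to-decimal step.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

# The three addresses from the lspci output above.
for addr in 01:00.0 20:00.0 30:00.0; do
    lspci_to_busid "$addr"
done
```

Run against the three cards above, this should print the same BusID values that appear in the generated xorg.conf.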

I think it's better not to waste your time looking for a solution to a problem that should not exist in the first place.
Start from scratch.

Set up your BIOS, connect one GPU, boot, copy your 1bash (there is a bug that prevents it from being copied from the temp partition), reboot, and check that everything is OK.
Shut down, connect the rest of the GPUs while the first one is still connected, and restart; it may reboot with an xorg error.
After that restart everything should be OK.

Well... I can build a complete rig for less than USD 50 using older HP Compaq business PCs like the dc7800 or dc7900. Everything works OK with the dc7900; this issue seems to be related to the dc7800.
The dc7800 has 3 PCIe slots, like its bigger sibling the dc7900. The only difference is the chipset: the dc7900 has a Q45 chipset while the dc7800 has a Q35 chipset.
I'll try tomorrow with a dc7900 using the same GPU cards to see if it works.

These older motherboards might have glitches with the latest Ubuntu...

EDIT: SOLUTION:

In /etc/X11/xorg.conf, in the Device sections, there should be a
Screen         0
added right before the EndSection. That fixes it. Apparently, if the BusID of the VGA device is >16, nvOC doesn't attach a screen to the card, and hence neither power nor fan control is enabled for it.
Tested the fix with old motherboards using Q35 and Q45 chipsets.
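
In case it helps anyone applying the fix to several cards, here is a rough sketch that adds the Screen 0 line with awk instead of by hand. It reads a config on stdin and writes the patched copy to stdout, so nothing is modified in place. The function name add_screen_to_devices is made up for illustration, and the sample section is a shortened copy of one from the post above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads an xorg.conf on stdin and writes a copy to stdout
# with a "Screen 0" line added before the EndSection of every Device section
# that does not already contain a Screen line.
add_screen_to_devices() {
    awk '
        /^[ \t]*Section[ \t]+"Device"/ { in_device = 1; has_screen = 0 }
        in_device && /^[ \t]*Screen[ \t]/ { has_screen = 1 }
        in_device && /^[ \t]*EndSection/ {
            if (!has_screen) print "    Screen         0"
            in_device = 0
        }
        { print }
    '
}

# Demo on a shortened sample Device section.
add_screen_to_devices <<'EOF'
Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    BusID          "PCI:32:0:0"
EndSection
EOF
```

To try it on a real file, something like `add_screen_to_devices < /etc/X11/xorg.conf > /tmp/xorg.conf.new` followed by a manual review keeps the original untouched until the result has been checked.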