Board: Mining (Altcoins)
Topic: Re: [OS] nvOC easy-to-use Linux Nvidia Mining
by lbrasi on 23/06/2017, 01:57:21 UTC
Hi All,

I have decided to give nvOC a go. I am using an older Gigabyte GA-EX58-UD4P (Socket LGA1366) board and an i7 920 CPU, with two EVGA GTX 1080 Ti cards on risers in both PCI-E x16 slots.
The rig boots fine but it is not stable: it runs for about 15 hours, then GPU0 drops to about 150-170 W and the Sol/s rate suffers. A reboot gets it working again, but the problem comes back.

I have no idea if any BIOS changes need to be made to get this running stable or not, does anyone have any idea what I can try?

Also, if I kill the mining process and make changes to the oneBash config, it doesn't seem to accept the changes, more specifically the fan speed.  When I launch the miner again I get the error below, but the process does start:


Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Unable to init server: Could not connect: Connection refused

ERROR: The control display is undefined; please run `nvidia-settings --help`
       for usage information.


When the mining process starts do you see OC messages like this?

https://ip.bitcointalk.org/?u=https%3A%2F%2Fs13.postimg.org%2Fq08huqnyv%2FIMG_0270.jpg%26t%3D577%26c%3Dyq0szP4ICxv47w&t=577&c=q_tbQ_FaeljjYQ

If you don't:

At any point did you boot with the monitor connected to the motherboard?

Did you at any time boot with only one GPU attached?

If either of these is the case: ensure the monitor is attached to the primary GPU (the one connected to the 16x slot closest to the CPU)

then follow this process:

https://bitcointalk.org/index.php?topic=1854250.msg19449945#msg19449945

I only booted with a monitor connected the first time; now it is completely headless.
Yes, I might have booted with only one GPU attached at one point.
Thank you, I will follow that process.  If I re-image the USB key and boot completely headless, should I be seeing the OC messages via SSH as well?

With v0015 you would have to enable openssh server before you could SSH in.  You would also have to know the rig's IP, which can be found in several ways.

With v0016 you can enable openssh server in oneBash; I would recommend trying it and seeing if it solves the problem.

I have now also tested with v0016. The mining process starts fine with all OC settings when viewed directly on the rig.  However, I am still getting the error when trying to start it via SSH.

Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Unable to init server: Could not connect: Connection refused

ERROR: The control display is undefined; please run `nvidia-settings --help`
       for usage information.

What type of OS is the client computer; and what are the IPs of the client and rig?

Windows 10 running putty, Rig: 192.168.1.19 Client: 192.168.1.6

So when you enter

m1@192.168.1.19   

using port 22

with SSH selected

and click open it gives you the error above?


Sorry, let me try to clear some things up.  I am able to SSH into the rig just fine; the error appears when I execute the miner via SSH. As a result, clock settings, fan speed and power limits are not being set, but the mining process still starts.

I also really appreciate your support and rapid response.

Are you killing the existing mining process before launching another via SSH?

Also; when you are launching the mining process via SSH are you using the cmd:

Code:
bash '/media/m1/1263-A96E/oneBash'

I am SSHing and executing the following commands.

ps aux | grep gnome-terminal    (to find the gnome-server PID)
kill PID
screen -S rig1
bash '/media/m1/1263-A96E/oneBash'

Just before the miner starts I see the above error in place of the "attribute" commands but the mining process still starts.
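
For anyone repeating the PID lookup above, here is a minimal sketch of pulling the PID column out of `ps aux` output instead of reading it by eye. The helper name `pids_matching` is made up for illustration, and it assumes the standard `ps aux` column layout (PID in column 2, command starting in column 11).

```shell
# Hypothetical helper (not part of nvOC): print the PIDs of processes whose
# `ps aux` line matches a pattern, skipping the grep/awk helper processes.
# Assumes the usual `ps aux` layout: PID in column 2, command from column 11.
pids_matching() {
    awk -v pat="$1" '$0 ~ pat && $11 != "grep" && $11 != "awk" { print $2 }'
}

# Usage on the rig:
#   ps aux | pids_matching gnome-terminal
#   kill <PID printed above>
```

Review the printed PID before passing it to kill, since the pattern may match more than one process.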


after you have SSHed in enter the cmd:

Code:
echo $DISPLAY

and tell me what it outputs

That does not output anything at all.

Ok, tomorrow I will try to replicate this error; and see if I can figure out what is happening.  My guess is X11 is having a problem trying to output graphically.
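
An empty $DISPLAY fits the "control display is undefined" message: nvidia-settings has no X display to talk to in a plain SSH session. A minimal sketch of one possible workaround follows. It assumes the rig's local X server is display :0, which is a common default but is not confirmed anywhere in this thread, and the helper name `control_display` is made up for illustration.

```shell
# Hypothetical helper: choose which X display nvidia-settings should control.
# ASSUMPTION: the rig's own X server runs as display :0.
control_display() {
    case "${DISPLAY:-}" in
        ""|localhost:*)
            # Unset (plain SSH) or forwarded to the client (ssh -X):
            # fall back to the rig's local display.
            printf '%s\n' ":0"
            ;;
        *)
            # Already pointing at a local display: keep it.
            printf '%s\n' "$DISPLAY"
            ;;
    esac
}

# Usage over SSH, on a machine with the NVIDIA driver installed:
#   DISPLAY="$(control_display)" nvidia-settings -q gpus
```

If the X server rejects the connection, the SSH session may also need XAUTHORITY pointing at the logged-in user's X authority file; where that file lives depends on the display manager, so it is left out here.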

If you have a linux computer other than your rig and you SSH into the rig with it; does it have the same error?

I tested from a linux VM I have and got the same result :(


I tested this today and found there is a new problem resulting (most likely) from my adding support for up to 14 GPUs. 

I found that if I waited the screen would still connect to the mining process after failing to connect 3 or 4 times.

If you are using linux you can resolve this error by adding the following argument when SSHing into the rig:

Code:
-X

so that from a terminal I would enter:

Code:
ssh m1@rigipaddress -X

so for a rig with an ip of 192.168.1.22. I would use:

Code:
ssh m1@192.168.1.22 -X

If you are using putty with windows the -X will not work; but after showing the error 3 or 4 times the screen should still connect to the mining process.


I also tested using the -dmS argument when calling screen.

if you call screen with:
Code:
screen -dmS rig1

screen will start as a background process (so you can disconnect your ssh session and the miner will continue to mine on its own)

after starting screen you will need to connect to the screen (as it is running in the background)

you do this by entering:

Code:
screen -r

you can close the ssh session whenever and then reSSH in and enter:

Code:
screen -r

to return to the mining process whenever desired.
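
The screen steps above can also be collapsed into one call, since screen accepts a command to run after `-dmS <name>`. This is a sketch of that variant, not the exact sequence described above. The function name `start_miner_detached` and the `SCREEN_BIN` override are made up for illustration; the oneBash path is the one quoted earlier in the thread.

```shell
# SCREEN_BIN is a hypothetical override so the function can be dry-run
# with a stand-in command; by default it is the real screen binary.
SCREEN_BIN="${SCREEN_BIN:-screen}"

# Start a script inside a detached, named screen session.
#   $1 = session name, $2 = script to run with bash
start_miner_detached() {
    session="$1"
    script="$2"
    "$SCREEN_BIN" -dmS "$session" bash "$script"
}

# Usage on the rig:
#   start_miner_detached rig1 '/media/m1/1263-A96E/oneBash'
#   screen -r rig1      # attach to watch the miner
```

Naming the session on attach (`screen -r rig1`) avoids ambiguity if more than one screen session is running.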



Tested with -X on the SSH command from my linux VM and now I am getting the below:

ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:0' (No targets match target
       specification), specified in assignment
       '[gpu:0]/GPUGraphicsClockOffset[3]=0'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:0' (No targets match target
       specification), specified in assignment
       '[gpu:0]/GPUMemoryTransferRateOffset[3]=0'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:1' (No targets match target
       specification), specified in assignment
       '[gpu:1]/GPUGraphicsClockOffset[3]=50'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:1' (No targets match target
       specification), specified in assignment
       '[gpu:1]/GPUMemoryTransferRateOffset[3]=50'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:0' (No targets match target
       specification), specified in assignment '[gpu:0]/GPUFanControlState=1'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'fan:0' (No targets match target
       specification), specified in assignment '[fan:0]/GPUTargetFanSpeed=65'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:1' (No targets match target
       specification), specified in assignment '[gpu:1]/GPUFanControlState=1'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'fan:1' (No targets match target
       specification), specified in assignment '[fan:1]/GPUTargetFanSpeed=65'.


When you enter:

Code:
lspci | grep VGA

what is the output?


01:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)


try this

on the rig open a guake terminal and enter:

gksu gedit '/etc/X11/xorg.conf'

then select all and delete:

replace with this:

Code:
Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    Screen      1  "Screen1" 1920 0
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
    Option         "Xinerama" "0"
EndSection

Section "Files"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "CLB  fit Headless"
    HorizSync       30.0 - 83.0
    VertRefresh     56.0 - 76.0
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    ModelName      "CLB  fit Headless"
    HorizSync       30.0 - 83.0
    VertRefresh     56.0 - 76.0
EndSection


Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 1080 Ti"
    BusID          "PCI:01:00:0"
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 1080 Ti"
    BusID          "PCI:02:00:0"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    Option         "Coolbits" "28"
    Option         "Stereo" "0"
    Option         "nvidiaXineramaInfoOrder" "DFP-1"
    Option         "metamodes" "nvidia-auto-select +0+0"
    Option         "SLI" "Off"
    Option         "MultiGPU" "Off"
    Option         "BaseMosaic" "off"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    Option         "Coolbits" "28"
    Option         "Stereo" "0"
    Option         "nvidiaXineramaInfoOrder" "DFP-1"
    Option         "metamodes" "nvidia-auto-select +0+0"
    Option         "SLI" "Off"
    Option         "MultiGPU" "Off"
    Option         "BaseMosaic" "off"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

save

then logout

login

see if this solved the problem.

Not sure if this will work, but it's worth a try.
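
The BusID lines in the xorg.conf above line up with the lspci addresses posted earlier (01:00.0 and 02:00.0). lspci prints these in hexadecimal while xorg.conf expects decimal, which only starts to matter from bus 0a upward. Here is a minimal sketch of the conversion. The helper name `lspci_to_busid` is made up for illustration, and it assumes the short `bus:device.function` form without a PCI domain prefix.

```shell
# Hypothetical helper: convert an lspci address (hex "bus:device.function")
# into an xorg.conf BusID string (decimal "PCI:bus:device:function").
# ASSUMPTION: input has no domain prefix, e.g. "01:00.0", not "0000:01:00.0".
lspci_to_busid() {
    addr="$1"
    bus="${addr%%:*}"      # text before the colon
    dev="${addr#*:}"       # text after the colon ...
    dev="${dev%%.*}"       # ... up to the dot
    fn="${addr##*.}"       # text after the dot
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

# Usage:
#   lspci_to_busid 01:00.0
#   lspci_to_busid 0a:00.0
```

For the two cards in this rig the hex and decimal values are the same, so the BusID entries above already match the lspci output.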

Replaced the xorg.conf with what you provided.

ssh without -X
Code:
Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Unable to init server: Could not connect: Connection refused

ERROR: The control display is undefined; please run `nvidia-settings --help`
       for usage information.

ssh with -X

Code:
ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:0' (No targets match target
       specification), specified in assignment
       '[gpu:0]/GPUGraphicsClockOffset[3]=0'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:0' (No targets match target
       specification), specified in assignment
       '[gpu:0]/GPUMemoryTransferRateOffset[3]=0'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:1' (No targets match target
       specification), specified in assignment
       '[gpu:1]/GPUGraphicsClockOffset[3]=50'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:1' (No targets match target
       specification), specified in assignment
       '[gpu:1]/GPUMemoryTransferRateOffset[3]=50'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:0' (No targets match target
       specification), specified in assignment '[gpu:0]/GPUFanControlState=1'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'fan:0' (No targets match target
       specification), specified in assignment '[fan:0]/GPUTargetFanSpeed=65'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'gpu:1' (No targets match target
       specification), specified in assignment '[gpu:1]/GPUFanControlState=1'.


ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).



ERROR: Error resolving target specification 'fan:1' (No targets match target
       specification), specified in assignment '[fan:1]/GPUTargetFanSpeed=65'.