Board: Mining (Altcoins)
Topic: Re: Claymore's Dual Ethereum AMD+NVIDIA GPU Miner v11.7 (Windows/Linux)
Posted by iSuX on 23/05/2018, 07:59:24 UTC
Built a new rig and my hashrate is down...

Old rig:
MB: ASROCK H81 Pro BTC 2.0
CPU: Intel Pentium G3260 (3.3GHz)
RAM: 8GB - 1333MHz
6GPU

New rig:
MB: Asus B250 Mining Expert
CPU: Intel Celeron G3930 (2.9GHz)
RAM: 8GB - 2400MHz
updated Bios + Chipset to latest version
9GPU

Dual mining ETH+XVG with Claymore 11.7
Both rigs: Radeon drivers 18.3.1, all cards in compute mode.
4 of my cards that used to mine ETH at 30Mh/s now do only 28.3 in my new rig... Those cards are on the same PSU I used in the old rig.

OC/UV used to be 1130 / 2035 / 890 / 890 (core clock / memory clock / core voltage / memory voltage)
Tried pushing them to 1130 / 2100 / 880 / 880, but I get minimal gain and seem to have hit the ceiling, as I start getting invalid shares and have to go back to safer levels.

This might seem silly, but is it possible I have to flash the BIOS of the GPUs again as the configuration is different? I'm on the latest Windows 10, so I have problems getting ATIFlash to check... :-(

Thanks for any help


Hey there
"Latest version" is not very informative; such things have numbers for good reason.
That said, there isn't enough data here for a differential analysis, nor any explanation of why you suspect OC or BIOS as candidates.

Assuming you didn't change either of those (OC/BIOS) when you migrated your GPUs from the old rig to the new one, (did you?): you mention 6 GPUs in the old rig and 9 in the new one, so were 6 of those GPUs moved from old to new, and are those the ones that now seem to be hashing slower?

What make/model GPU?

Also, what is far more relevant here is the new motherboard, new RAM, new CPU, new OS, new install of everything. Assuming you didn't change the BIOS or Claymore config, I'd strongly suggest investigating all of the former before turning the spotlight on the latter.

Some things to consider.

What OS version on both rigs?
IME Win10 Home outperforms Pro in almost every respect, gaming or mining. For one, it's loading a LOT less MS-crap that most people don't need. (If you're not joining a domain, you probably don't need Pro.) The number of services is nearly double on Pro, and again, you probably don't need any of the Pro stuff for mining.
Mining and stability go hand in hand, so keep the OS as lean as possible: updates disabled, (set all NICs to metered connection), and manually control/manage how and when you update, (if at all).

FYI, I also have one rig on the B250ME, and have no issues with different hash rates given the same GPU/driver/claymore ver. (I also moved GPUs to that rig, and from it, as I was able to consolidate same make/model cards on it. Mixed mem types still, and no issues.)
This has been so since day 1 with the B250.
Where the B250 did become a pig was getting it stable with the 12th GPU, and worse still with the 13th.
That said, spot check: it is clocking through 75 hours, (claymore 11.7, Win10-64Bit-Radeon-Software-Adrenalin-Edition-18.3.4-March23, Win10Pro 10.0.14393), which is not a record, (I had to reboot last Sat), but the B250 has been running well these last 4~5 weeks.
(13x RX580) Sapphire, mix of Hynix/Samsung/Micron2, all on custom BIOS, all hashing >32.

But as you asked, I HAVE seen some issues with ATIWinflash.exe, I've certainly taken some shortcuts, and regretted it. If nothing else, shortcuts add an element of doubt when analysing issues later on, and in some situations, you're unable to quantify those, (doubt remains).

When working with networked systems, (by "networked" I mean multiple computers, or GPUs, in a single entity), it's paramount to be systematic; being anal always helps :-)

Some tips.
Note the serial numbers of your GPUs, backup the BIOS before you do anything, (use the serial number in the bios file names).
Start off with stock BIOS, (for some days), so you establish a reasonable baseline of data.
GATHER data, write it down: what is "normal" for each card, etc.
Make sure you know which GPU number, (in Claymore, in GPU-z, in Radeon-settings), is which. That way you know what you are changing, (and can revert), and most important of all, you can DIFFERENTIATE when you see issues.

I've seen RX580s on stock BIOS hash as low as 18MH/s, and typically no higher than 29MH/s.

If you mod the BIOS, work on only ONE GPU at a time, and version your BIOS edits, (CardMakeModel_SerialNumber_Ver) or something to that effect.
Key point is, make SURE it's clearly identifiable.
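
A naming scheme like that is easy to keep consistent if the names are generated rather than typed by hand. A minimal sketch, assuming a `.rom` extension and a two-digit version counter (both are my assumptions, as is the function name; the serial number shown is made up):

```python
import re

def bios_filename(make_model: str, serial: str, version: int, ext: str = "rom") -> str:
    """Build a name following CardMakeModel_SerialNumber_Ver."""
    def clean(part: str) -> str:
        # Drop spaces, slashes and anything else awkward in a file name.
        return re.sub(r"[^A-Za-z0-9-]+", "", part)

    if version < 0:
        raise ValueError("version must be >= 0")
    return f"{clean(make_model)}_{clean(serial)}_v{version:02d}.{ext}"

# v00 is the untouched stock backup, later versions are your edits.
print(bios_filename("Sapphire RX580", "A1B2C3", 0))
```

Keeping v00 reserved for the stock backup means you can always tell at a glance which file is safe to flash back.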

Get yourself a METHOD, you are going to end up managing and maintaining a complex system, there are a LOT of variables, so with a standard method, you can at least hope to rule out all the dumb human elements.

Easier said than done, and one should consider "method" a work in progress, refine that as you go along, and maintaining rigs DOES get easier with experience, and systematic analysis.

One thing about fine tuning BIOS: initially I was struggling to break 30MH/s and made some mistakes, identified dead ends, and had some cards that were real pigs, which on more than one occasion I considered returning, but which actually turned out far better than I expected.

Try to use all the same make/model/mem-type of card if you plan on a small operation, (<10 GPUs).
I don't say this to cherry pick, or exclude any vendor, but simply because with a small number of cards, it's very hard to know what "normal" is for them. If you have only 1x Asus RX580, how do you know if it's good/bad/ugly? If you have 2 or more, you can at least hope to make some comparisons.
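
To make that comparison less of a gut feeling, something like the sketch below can flag a card that sits well under its peers. It assumes you have written down a hash rate per card of the same make/model/memory; the card names, the numbers and the 3% tolerance are all made up for illustration.

```python
from statistics import median

def flag_slow_cards(rates: dict[str, float], tolerance: float = 0.03) -> list[str]:
    """Return cards hashing more than `tolerance` (as a fraction) below the group median."""
    if len(rates) < 2:
        # With a single card there is nothing to compare against.
        raise ValueError("need at least 2 cards of the same model to compare")
    mid = median(rates.values())
    return [name for name, rate in rates.items() if rate < mid * (1 - tolerance)]

# Hypothetical MH/s readings for three identical cards.
same_model = {"card_A": 32.1, "card_B": 32.3, "card_C": 29.0}
print(flag_slow_cards(same_model))
```

The median is used rather than the mean so that one bad card does not drag down the reference it is being compared to.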

Also, don't believe everything you read online, (and feel free to ignore me :-) but MAN I read a lot of utter RUBBISH online. In fact, finding the accurate data is the toughest challenge.

One example, that still puzzles me, (note: puzzles me, not saying it's rubbish), is how many people rave about Samsung memory. In my experience, Samsung has been the best out-of-the-box GDDR5, but also the toughest to fine tune. In fact, I'd suggest it's not even worth trying memory strap edits with it.
Drop the power consumption & core clock, up the memory to 2250, and you should be seeing 32.5, and stable. I wasted a lot of time trying to better that, only because I didn't see a big jump, and due to hype, assumed Samsung was the fastest.
By the time I got my hands on my only 2 Samsung GPUs, I had Micron running super stable at 32.6, up from 29~30 stock, and (mistakenly) figured I could see similar gains with Samsung.

My present conclusion is, Samsung mem is already pretty tightly dialled in, so as an out of the box card, it's certainly the best I have seen, but it certainly is not the top performer in my rigs.

Some notes on BIOS editing.
Make sure you have a versioned stock bios file saved.
Use THAT file for editing. (Do not use BIOS from the net).
Depending on the card and editor, it's highly likely there are other bios elements that are not displayed or editable in the bios editor. Who knows what you change if you flash some random BIOS from the net.
Make sure you have only a single card connected when you flash.
(For SURE, I have done an en masse deploy to 3 GPUs on multiple occasions, and in recent weeks identified that as a 100% confirmed issue.) On 2 different rigs, 1~2 cards started to show "got incorrect share" after some hours.
What was weird was, those were all cards that were mass-flashed.
What solved that?

Several things.

Shut down the rig, bleed down power, remove all but 1 GPU, (unplug the riser USB cables), boot up, flash that one card, (I used the same BIOS ver for that same card as last flash), do NOT click ok to the "you must reboot" message, but actually shut down, and bleed down the power.
I have a suspicion the dual BIOS cards are capable of holding the BIOS in DRAM and not actually booting to the newly flashed BIOS.
Sure, that will depend on the architecture of the card, but BIOS is typically stored in EEPROM, (I'll write "eeram" from here on), or some similar flash type chip.
The card is NOT going to use this (new program), unless several things happen.
A flag/trip is set, (by the flash app), to reload BIOS, (this seems to fail sometimes with Winflash), but for sure, what will force the card to load the new BIOS on next boot is removal of power, (bleed-down), as this clears the DRAM, and leaves the card no choice but to load from eeram.

Did you notice how slow it is to program BIOS? Those are usually 256 or 512 KB chips. That is KB, KB, nothing these days, yet flashing takes nearly 1 minute for 512 KB chips. Sure the write speed is slower, but even the verify is slow, and this is because the eeram is not intended to be fast, or written to much, but is depended on to safely store data without any powered backup.
What typically happens is, as soon as power comes up, the dram content is crc checked against eeram; if ok, the card boots, if not, it loads from eeram. As this happens at power on, before os init, it causes no delay in booting. What winflash does is cause a change in crc, and that SHOULD trigger the card to load the bios from eeram on init, but this has certainly failed for me.

I saw no change in card performance after flashing. Stopped claymore, opened winflash, saved BIOS, and sure enough it WAS new, so flash was ok. I shut down, bled-down power, rebooted, and THEN I saw the improvement I was expecting.
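
The "save the BIOS back out and check it IS the new one" step can be done by comparing checksums instead of eyeballing the file in an editor. A minimal sketch, assuming both the file you flashed and the dump you saved afterwards are plain ROM images of the same length (the function names and file names are hypothetical):

```python
import hashlib
from pathlib import Path

def sha256_of(path: str) -> str:
    """Hex SHA-256 digest of a file's full contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def same_rom(flashed_file: str, dump_after_flash: str) -> bool:
    """True if the dump read back from the card is byte-identical to the file that was flashed."""
    return sha256_of(flashed_file) == sha256_of(dump_after_flash)

# Hypothetical usage:
# same_rom("SapphireRX580_A1B2C3_v02.rom", "dump_after_flash.rom")
```

Note this only tells you what is stored on the chip. As described above, it says nothing about whether the card is actually running that BIOS yet, so the full power-down still applies.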

My point is, that winflash is only able to read the bios from the eeram, NOT from dram.

Basically you can think of it much like your computer, eeram=hdd and dram=system memory.
This architecture is not so by accident.

The purists will also say you should not use winflash, but rather boot to a very light OS, (DOS), and flash from there. This is actually very good advice, but I decided I'd use that as the fall back, and at least start out with WinFlash, as it IS vendor provided after all.


But for sure, it has its use case, and multiple flashing is a time-saver I've decided is not worth it.

But at the same time, I've yet to find those issues insurmountable.

OK, last word on OC.
OC is pushing your luck, if you see issues, remember, you are pushing your luck.
As for the silicon lottery, at least with Sapphire cards, having built rigs for a few people now, I've only had 1 card out of 30 something that was clearly a dud.
BUT the only way I can say that with certainty is because I documented all the 30+ other cards, (same make/model/memory), and no matter what I did to that one card, it was not stable with any oc at all.
Is that the silicon lottery? I don't think so, I think it was simply a QC issue.
I could be wrong, maybe someone with 100+ cards, data, and experience can chime in here.



OK, last thing.

Did you try setting the dcri?

New rig has 9 GPUs, cmiiw.

DCRI will be different for sure.
Disable dual mining for a start, (pointless, no profit unless you have free power, and even then dubious imho).
Try the "z" key during solo mining, and let claymore find the optimum.
If those numbers are very different from your dcri value, (>2) then I'd suggest at least picking the mean for your config file.
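
That rule of thumb, (only change the config if "z" lands more than 2 away from what you have, then take the mean), fits in a few lines. This is just a sketch of the rule as I read it; the function name and the per-GPU values are hypothetical, not real output from any rig.

```python
from statistics import mean

def suggest_dcri(per_gpu: list[int], current: int, threshold: int = 2) -> int:
    """Return the rounded mean of the auto-tuned values if any of them differs
    from `current` by more than `threshold`; otherwise keep `current`."""
    if not per_gpu:
        raise ValueError("no per-GPU values given")
    if any(abs(value - current) > threshold for value in per_gpu):
        return round(mean(per_gpu))
    return current

# Hypothetical values found by "z" on 8 GPUs, with 30 currently in the config.
print(suggest_dcri([6, 7, 9, 8, 7, 6, 8, 7], current=30))
```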

What happened to your hash rate after "z"?

Side note, and no disrespect to Mr Claymore, but usually I find a better dcri value by using the -+ keys, and setting the same value for all GPUs.
But for sure, the z key is a nice feature and great tool to get you 95% optimised and quickly.

Also, set -y 1, check your console during init, and make sure you see both lines below.
 
All AMD cards use Compute Mode already
CrossFire is disabled already

 
Otherwise, open an admin console, start your batch file as usual, and wait for the message saying all cards now use compute mode and asking you to reboot.
Reboot, and run claymore as normal, (no need for admin).

THIS IS THE BEST FEATURE from Claymore of late if you have more than 5 GPUs; otherwise it's hours of lame AMD-settings and reboots to configure each GPU.

Finally, if still no luck, post your batch file, and some specifics on hardware/versions etc.

Good luck.


Thank you for so much good info.
(In the meantime my rig is down to 8 GPUs, sold one together with the old MB)
I tried working on the dcri value, hash got a little bit better, but I am stuck at the following values:

ETH: GPU0 29.938 Mh/s, GPU1 30.101 Mh/s, GPU2 30.273 Mh/s, GPU3 30.097 Mh/s, GPU4 30.433 Mh/s, GPU5 28.517 Mh/s, GPU6 28.524 Mh/s, GPU7 28.521 Mh/s

The last three GPUs reached 30 without too much overclocking on the old rig. I push the last 3 GPUs until instability (as you can see in my batfile), but it doesn't get any higher than 28.5
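
For anyone wanting to track numbers like these over time, the status line above is easy to pull apart. A minimal sketch, assuming the line keeps exactly the `GPUn xx.xxx Mh/s` layout shown here (the function name is my own):

```python
import re

LINE = ("ETH: GPU0 29.938 Mh/s, GPU1 30.101 Mh/s, GPU2 30.273 Mh/s, "
        "GPU3 30.097 Mh/s, GPU4 30.433 Mh/s, GPU5 28.517 Mh/s, "
        "GPU6 28.524 Mh/s, GPU7 28.521 Mh/s")

def parse_rates(line: str) -> dict[int, float]:
    """Map GPU index to its reported hash rate in Mh/s."""
    pairs = re.findall(r"GPU(\d+)\s+([\d.]+)\s+Mh/s", line)
    return {int(index): float(rate) for index, rate in pairs}

rates = parse_rates(LINE)
total = sum(rates.values())
worst = min(rates, key=rates.get)   # index of the slowest card
best = max(rates, key=rates.get)    # index of the fastest card
print(f"total {total:.3f} Mh/s")
print(f"worst GPU{worst}: {rates[worst]} Mh/s, best GPU{best}: {rates[best]} Mh/s")
```

Logging one such line per day makes it obvious whether a change to clocks or dcri actually moved anything.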

This is my current batfile:
EthDcrMiner64.exe -tt 69 -fanmin 55 -y 1 -cclock 1130,1130,1130,1130,1130,1030,1030,1030 -mclock 2065,2035,2045,2035,2050,2100,2100,2100 -cvddc 886,890,889,890,885,888,888,888 -mvddc 885,890,889,890,885,888,888,888 -epool stratum+ssl://eu1.ethermine.org:5555 -ewal WALLET -epsw x -estale 0
pause

GPU0,1,2,5,6,7 are MSI Radeon RX580 Armor OC 8GB

GPU0,1,2,3 are on circuit A of the MB (PSU: Corsair HX1000)
GPU4,5,6,7 are on circuit B, tried C as well (PSU: LC-Power Platinum 1000W). I tried them first with the same BeQuiet 1000W PSU I used on my previous rig, where they reached 30Mh/s, but even with that PSU it was 28.5 on the new rig/MB.
BIOS of each card was flashed separately, using the one-click patch in Polaris BIOS Editor

Motherboard Asus B250 ME - Bios 1010 (29-3-2018)

Windows 10 Pro - 1709 build 16299.431

Any more tips?
Thanks

No worries.
OK, what immediately jumps out at me here is how highly tuned your config file is.
In comparison, I run all my rigs with identical settings for all GPUs, no card specific clock or voltage settings, except for 1 single card, an ASUS Dual. (A BAD choice I have come to realise: really REALLY shitty fans, 1 failed after 2 weeks, the replacement card failed 2 days after the amazon 30 day return window. Anyway, I won't bore you with that story, but I down clocked it to keep it cooler.)

What I wonder here is why, with 6 identical GPUs, you have them dialled in so tightly?
For sure that accounts for the differences you see between those 6; do they have different memory?

FYI, I've found Micron to be the best, (MT51J256M3 (MICRON)), also the one you'll probably read a lot of people moaning about, and I confess it WAS a bit of a challenge, but also a learning curve for me.
Samsung is very good, imho the best I've seen out-of-the-box for mining.
Hynix and Hynix2 seem to be a bit petulant, and I probably need to do more work on refining those.
But my point is, regardless of memory, I have no RX580 8GB cards running below 32MH/s, and all have identical clocks and voltages, regardless of vendor.

What I would say is, with your current config file, it's impossible to make any comparisons, and therefore any differential analysis is pretty pointless as it is.


Also, your batch file is kind of a shambles :-) I don't say this as a criticism, but it makes it easy to make mistakes or miss seeing them, and I'd suggest you reorder that file, as per claymore examples.

Also, I see some odd choices in there, why are you stratum mining SSL at Ethermine?

Try this config. It assumes you have populated your epools.txt file. You SHOULD get min speed 256, (8 GPUs x 32MH/s), but I set this to 200 for now, otherwise you will get restarts after 5 min below that level, etc.

Make sure you have virtual memory set to 48000 MB, (should be enough, I have that on a 13GPU rig, with 16GB ram, no issues)


setx GPU_FORCE_64BIT_PTR 1
setx GPU_MAX_HEAP_SIZE 100
setx GPU_USE_SYNC_OBJECTS 1
setx GPU_MAX_ALLOC_PERCENT 100
setx GPU_SINGLE_ALLOC_PERCENT 100
EthDcrMiner64.exe -epool ssl://eu1.ethermine.org:5555 -ewal WALLET.RIG -epsw x -checkcert 1 -epoolsfile epools.txt -minspeed 200 -gser 1 -esm 0 -etha 0 -ethi 16 -eres 2 -erate 1 -estale 1 -asm 1 -platform 1 -y 1 -dcri 9 -wd 1 -ftime 5 -r 14400 -cclock 1200 -cvddc 900 -mclock 2250 -mvddc 850 -tstop 83 -tstart 50 -tt 69 -fanmin 55 -fanmax 100 -ttdcr 80 -ttli 80 -mode 1 -dbg 0 -altnum 3 -mport -3333 -mpsw whatever -logfile logs\

Now, with your cards, claymore might not init with -mclock 2250, so if that happens, change that to 2100, (stock for RX580), and see what happens.
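
If a one-line batch file that long is hard to keep straight, one option is to keep the options in a small script and generate the line from it. This is only a sketch: the helper names are made up, and it covers just a handful of the options from the command above, using the same values. It also shows where the 256 figure for -minspeed comes from, (8 GPUs at 32MH/s each).

```python
def min_speed(gpu_count: int, per_gpu_mhs: float) -> int:
    """Total expected ETH hash rate in MH/s, rounded down."""
    return int(gpu_count * per_gpu_mhs)

def build_command(exe: str, options: dict[str, object]) -> str:
    """Join the executable and 'flag value' pairs into one command line."""
    parts = [exe]
    for flag, value in options.items():
        parts.append(f"{flag} {value}")
    return " ".join(parts)

# One option per line is much easier to review than a single long line.
options = {
    "-epool": "ssl://eu1.ethermine.org:5555",
    "-ewal": "WALLET.RIG",
    "-epsw": "x",
    "-minspeed": 200,   # conservative for now; the full target is min_speed(8, 32)
    "-cclock": 1200,
    "-cvddc": 900,
    "-mclock": 2250,
    "-mvddc": 850,
}

command = build_command("EthDcrMiner64.exe", options)
print(min_speed(8, 32))
print(command)
```

A single value per clock/voltage option applies the same setting to every card, which is the whole point of the exercise: identical settings make the cards comparable.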

If you need more help, please list each GPU number, make, model, memory, and some logfile showing hash rate. It might be possible to spot something when every card is running the same settings, (or even stock settings for all is a good idea in this situation, as the data should be solid for analysis).

Also, you could try running with the -di 0 parameter, (GPU0 only), and tuning that up, (probably dcri 4~6 will give better rates with a single GPU, but you will have to experiment on your own rig).

To be clear, if I were approaching this situation, my working logic would be to break this down: it has 8 GPUs, so pick the worst performer, remove the others, and get that one GPU to 32MH/s or so. By the time you get there, you should have the experience/know-how to apply to the others.
If you have only a single rig, you can also run another instance of claymore with the parameter -di 1234567, which will allow you to at least mine with the other cards while you focus on pumping up GPU0 to 32MH/s.
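
The two -di values for that split, (one instance for the card being tuned, one for the rest), can be worked out for whichever card you pick. A minimal sketch with a made-up helper name; it deliberately refuses rigs with more than 10 GPUs, because indexes above 9 need a different notation that I have not tried to reproduce here.

```python
def di_values(gpu_count: int, focus: int) -> tuple[str, str]:
    """Return (-di value for the focus card, -di value for all the other cards)."""
    if gpu_count > 10:
        raise ValueError("this sketch only handles single-digit GPU indexes (0-9)")
    if not 0 <= focus < gpu_count:
        raise ValueError("focus must be a valid GPU index")
    others = "".join(str(i) for i in range(gpu_count) if i != focus)
    return str(focus), others

tuning, mining = di_values(8, 0)
print(f"-di {tuning}")   # instance you experiment with
print(f"-di {mining}")   # instance that keeps mining meanwhile
```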

I'd also be interested to see a screen shot of gpu-z and polarisBios editor for GPU0, or GPU7.

But for sure, something is very wrong there, (or those MSI cards are rubbish, and I don't think I recall hearing they are; in fact a lot of miners use them, but I have none myself). What I can say is, while I tried to stick with Sapphire, during late 2017 and Q1 2018, with shortages as they were, I ended up with some ASUS DUAL, (bad choice), Gigabyte Gaming, Aorus, (a bit tricky to BIOS edit initially), but regardless all are stable at 32MH/s, so your 28s look really painful man.

Still, on the plus side, you have a nice gain of roughly 20MH/s to aim for right now, (about 236 total now vs 256 at 32 each), and assuming those MSI cards are up for it, that should be achievable.
Good luck man.

Footnote: Thinking a bit more here, I wonder if there is a correlation between the 28.5-ish rates and BIOS. There might be no connection of course, but it's interesting that 28-ish is usually what I see with out of the box BIOS. Then again, your clocks are all over the place, so it's hard to know. Worth following up on, if only to rule it out.