I'm in the middle of reorganizing things. The master branch is now up to date with Pavel Bylica's work and David Li's speedup. Releases are still there under the 117 branch. Apologies for the mess, didn't know people still used the binaries.
So you mean ethminer is not compatible with eth-proxy stratum?
Is there anything I can do, or do we have to wait for Genoil to fix it?
I don't think he wants to fix it, because it was reported a few months ago. You can try to fix it yourself.
they must have changed something at some point because it used to work fine.
my inbox gets flooded these days with people trying to build or use my miner, which is cool but i thought everybody had kind of moved on. then i checked ETH price and it all started to make sense
for some reason this nanopool doesn't send my client work, so it doesn't even start building DAG, even though it gets authorized...
Claymore, do you have any ideas on how to overcome the Polaris hashrate drop for epoch >129? Genoil said that the DAG now takes up 100% of the first memory bank (2GB) and spills into the second memory bank, so accessing two banks instead of one leads to a significant hashrate drop. Have you done any research in this direction, or is it hopeless? What do you think, any comments?
I did not do any research yet, but I will do it very soon.
disclaimer: i just gave this as a possibly oversimplified answer to the question. like you explain to kids that thunder comes from clouds bumping into each other.
This has nothing to do with the diff bomb. This has to do with the hardware architecture of the cards. The reason it started now is because DAG size is over 2GB. In a simplified manner, you could see it as part of DAG now physically ending up on a different IC. This causes the memory controller to switch the IC it reads from more often. The larger the DAG, the bigger the part of the DAG that ends up on a different IC, the more switching it has to do.
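For anyone who wants to check when the DAG crosses the 2GB mark, here is a minimal Python sketch. The constants and the prime-rounding rule are taken from the published Ethash specification, not from this thread, and "2GB" is assumed to mean 2 GiB (2**31 bytes). The helper names (`dag_size`, `first_epoch_over`) are made up for illustration.

```python
# Constants as published in the Ethash specification (not stated in this thread).
DATASET_BYTES_INIT = 2**30    # nominal DAG size at epoch 0, in bytes
DATASET_BYTES_GROWTH = 2**23  # nominal growth per epoch, in bytes
MIX_BYTES = 128               # the DAG size is always a multiple of this
EPOCH_LENGTH = 30000          # blocks per epoch


def is_prime(n):
    """Trial division; fast enough here because n stays in the tens of millions."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def dag_size(epoch):
    """DAG size in bytes: the largest size below the nominal one whose
    number of 128-byte rows is prime."""
    size = DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch - MIX_BYTES
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES
    return size


def first_epoch_over(limit_bytes):
    """First epoch whose DAG is strictly larger than limit_bytes."""
    epoch = 0
    while dag_size(epoch) <= limit_bytes:
        epoch += 1
    return epoch


if __name__ == "__main__":
    for epoch in (127, 128, 129, 130):
        print(epoch, dag_size(epoch), dag_size(epoch) / 2**30)
    print("first epoch over 2 GiB:", first_epoch_over(2**31))
```

This only shows where the size threshold sits. It says nothing about how a particular card maps the DAG onto its memory chips.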
Hashrate drops with every new DAG on my RX480 8G. This started to take effect about 3 DAGs ago.
What can I do?
This is most likely a hardware-related issue caused by the DAG being over 2GB now. This causes something like the DAG now being spread over multiple banks of memory, slowing down access. The bigger the DAG gets, the more often the memory controller has to switch banks, and the slower it responds. In reality it is probably a little bit different/nuanced, but I guess it comes down to the same thing. The most important question is: how much are the 290/390s affected? Because if they have it too, the majority is in the same sinking ship, so it's not that big of a deal. Good news for Pascal owners though, they should be fine, as a similar issue was fixed when that hit the market last year.
I repeat: I use an RX 480 8G.
you can't do anything. just calculate your losses or sell the damn thing right away. they go like hot cakes these days.
it is true bensam. i tried on my old wonky miner and it shows the same drop. dunno if it's exactly the same issue, afaik AMD arch doesn't even have a TLB. now let's see how that 1070 holds up
Hard to say as I am quite ambitious with Ethash. We got permission from Genoil to use his kernel, and it's already running faster by 15% after a few optimizations on RX 470. We will see.
You mean faster than Genoil, not Claymore, right?
Well, I would say you do the math.
15% is a stellar increase for ethash! awesome job!
@Genoil, if I understand correctly I can not generate DAG file locally (saved on my drive) with 1.1.x version of ethminer? I noticed that if I use latest 1.0.x windows build, I can only generate DAGs up to block 3840000 (-D,--create-dag ) ... if I specify any larger block, ethminer crashes. Is this an issue with Ethereum generally, or an issue with ethminer?
I don't know. I'm not actively developing my ethminer fork any longer.
While my initial analysis was focused on the external GDDR5 bandwidth limits, current ZEC GPU mining software seems to be limited by the memory controller/core bus. On AMD GCN, each memory controller can xfer 64 bytes (1 cache line) per clock. In SA5, the ht_store function, in addition to adding to row counters, does 4 separate memory writes for most rounds (3 writes for the last couple rounds). All of these writes are either 4 or 8 bytes, so much less than 64 bytes per clock are being transferred to the L2 cache. A single thread (1 SIMD element) can transfer at most 16 bytes (dwordX4) in a single instruction. This means a modified ht_store thread could update a row slot in 2 clocks. If the update operation is split between 2 (or 4 or more) threads, one slot can be updated in one clock, since 2 threads can simultaneously write to different parts of the same 64-byte block. This would mean each row update operation could be done in 2 GPU core clock cycles; one for the counter update, and one for updating the row slot.
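The write-width arithmetic above can be sketched in a few lines of Python. The 64 bytes per clock and 16 bytes per thread come from the post. The 32-byte slot size is an assumption, inferred from "2 clocks" at 16 bytes per store, and the model also assumes every write occupies one controller clock regardless of its size.

```python
import math

CACHE_LINE_BYTES = 64      # bytes one memory controller moves per clock (from the post)
MAX_BYTES_PER_THREAD = 16  # one dwordX4 store per thread per instruction (from the post)
SLOT_BYTES = 32            # ASSUMPTION: inferred from "2 clocks" at 16 bytes per store


def line_utilisation(write_bytes):
    """Fraction of a 64-byte line used when one write occupies one clock."""
    return write_bytes / CACHE_LINE_BYTES


def clocks_per_slot(threads):
    """Clocks to write one slot when `threads` threads each store 16 bytes per clock."""
    bytes_per_clock = min(threads * MAX_BYTES_PER_THREAD, CACHE_LINE_BYTES)
    return math.ceil(SLOT_BYTES / bytes_per_clock)


if __name__ == "__main__":
    for width in (4, 8, 16):
        print(f"{width}-byte write uses {line_utilisation(width):.2%} of a line")
    for threads in (1, 2, 4):
        print(f"{threads} thread(s): {clocks_per_slot(threads)} clock(s) per slot")
```

Under these assumptions a single thread needs 2 clocks per slot and two cooperating threads need 1, which matches the figures in the post.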
Even with those changes, my calculations indicate that a ZEC miner would be limited by the core clock, according to a core:memory ratio of approximately 5:6. In other words, when an RX 470 has a memory clock of 1750 MHz, the core would need to be clocked at 1750 * 5/6 = 1458 MHz in order to achieve maximum performance.
If the row counters can be kept in LDS or GDS, the core:memory ratio required would be 1:2, thereby allowing full use of the external memory bandwidth. There is 64KB of LDS per CU, and the AMD GCN architecture docs indicate the LDS can be globally addressed; i.e. one CU can access the LDS of another CU. However the syntax of OpenCL does not permit the local memory of one work-group to be accessed by a different work-group. There is only 64KB of GDS shared by all CUs, and even if the row counters could be stored in such a small amount of memory, OpenCL does not have any concept of GDS.
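A quick way to check the two core:memory ratios quoted above, as a small Python sketch. The function name `required_core_mhz` is made up for illustration, and the ratios are simply the ones stated in the post.

```python
from fractions import Fraction


def required_core_mhz(memory_mhz, core_to_memory):
    """Core clock needed so the core keeps up with the given memory clock."""
    return memory_mhz * core_to_memory


if __name__ == "__main__":
    memory_mhz = 1750  # RX 470 memory clock used in the post
    # 5:6 with row counters in global memory, 1:2 with row counters in LDS/GDS
    for ratio in (Fraction(5, 6), Fraction(1, 2)):
        core = required_core_mhz(memory_mhz, ratio)
        print(f"ratio {ratio}: core needs about {float(core):.0f} MHz")
```

Using `Fraction` keeps the ratio exact until the final rounding, so the 5:6 case comes out at roughly 1458 MHz and the 1:2 case at 875 MHz.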
This likely means writing a top performance ZEC miner for AMD is the domain of someone who codes in GCN assembler. Canis lupus?
This is what I was trying to achieve in that snippet I sent you a while ago, the coalesced write thing. I just lacked all the theory behind it
Topic: Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070
Board: Mining (Altcoins)
Post by Genoil on 15/11/2016, 13:03:06 UTC
I merged the low CPU Nvidia patch into my Windows branch. Don't have the hardware to verify correct functionality.
Also submitted a tiny (potential) speed bump for all cards that don't have 36 compute units.
I'm no longer maintaining this fork. After the great Claymore-exodus and Hardforks, the motivation to work on it dropped to the level of the amount of donations I was still getting in. It's been great fun though, wouldn't have wanted to miss it for the world.
So, no API. I doubt anyone else, other than @nerdralph perhaps, will fork on and pick up feature development.
Topic: Re: limits of ZEC mining
Board: Mining (Altcoins)
Post by Genoil on 14/11/2016, 18:59:23 UTC
Dude, where is your own miner?
Next coin I expect you to be one of the top dogs in the pit
On a more serious note: if you state the performance is now at 80% of theoretical maximum, we're basically there, right? ETH miners also peak at about 80-85% of the theoretical maximum. Does the same rule apply here?
Topic: Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070
Board: Mining (Altcoins)
Can you please post the complete kernel.cu? Somehow your snippets aren't as complete as I thought at first glance. Getting this error:
Error 3 error : no operator "^" matches these operands X:\Mining\sources\nheqminer-cuda-silentarmy\cuda_silentarmy\kernel.cu 496 1 cuda_silentarmy
uint4 loada = *(__global uint4 *)((__global char *)a + 4);
uint4 loadb = *(__global uint4 *)((__global char *)b + 4);
uint4 stor = loada ^ loadb;
Or wasn't it meant for your CUDA port?
It's for OpenCL. CUDA has no native 128-bit xor (don't know about AMD or future cards). For CUDA you can try:
uint4 stor;
stor.x = loada.x ^ loadb.x;
stor.y = loada.y ^ loadb.y;
and likewise for .z and .w.
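To see why the per-component form gives the same result as the vector xor, here is a small Python model. It treats a uint4 as four 32-bit lanes of one 128-bit value. The helper names and the lane order (x lowest) are assumptions for illustration only. This is not kernel code.

```python
MASK32 = 0xFFFFFFFF


def split_lanes(value128):
    """Split a 128-bit integer into four 32-bit lanes (x, y, z, w), x lowest."""
    return tuple((value128 >> (32 * i)) & MASK32 for i in range(4))


def join_lanes(lanes):
    """Inverse of split_lanes: pack four 32-bit lanes into one integer."""
    return sum(lane << (32 * i) for i, lane in enumerate(lanes))


def xor_lanes(a, b):
    """Component-wise xor, the same thing the four stor.x/.y/.z/.w lines do."""
    return tuple(p ^ q for p, q in zip(a, b))


if __name__ == "__main__":
    a = 0x0123456789ABCDEF_FEDCBA9876543210
    b = 0x0F0F0F0F0F0F0F0F_F0F0F0F0F0F0F0F0
    lane_result = join_lanes(xor_lanes(split_lanes(a), split_lanes(b)))
    print(lane_result == (a ^ b))
```

Xor has no carries between bits, so doing it lane by lane cannot change the result. Only the number of instructions differs.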
Thanks a lot for your work, Genoil! I think some people feel more comfortable using your binary. I am trying to remove the runtime Cygwin/Python dependencies entirely by compiling silentarmy. We will see...
Thank you for stepping in! (btw it's still a py script and a solver.exe. not updating my own mining client until the dust settles a bit)
Topic: Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070
Board: Mining (Altcoins)