Post
Topic
Board Mining software (miners)
Re: Official CGMINER thread - CPU/GPU miner in C for linux/windows/osx
by plantucha on 31/07/2011, 00:36:32 UTC
Hey, I've been working on the hashing asm, as I said before: removing redundant functions and register moves, reworking sources and destinations to take advantage of processor hardware optimizations, and doing some of the easy math myself so the processor doesn't have to.  Here's what I've done so far.  It's not much, but it works.  Don't go changing the github source just yet, though.  For now, copy-paste this to replace your existing sha256_sse4_amd64.asm file.  For those of you without SSE4.1 (such as AMD users), copy-paste this into your sse2_amd64 file instead and search-replace every movntdqa with movdqa so the quick memory moves aren't used.


I'll be attacking the LAB_LOOP next.
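
For anyone who would rather script the search-replace than do it by hand, here is a minimal Python sketch. The function name use_plain_loads is made up for this example, and it assumes the mnemonic is written in lowercase in the asm source, which I have not checked against the actual file.

```python
import re

# Match the mnemonic as a whole word only, so an identifier that merely
# contains "movntdqa" as a substring (e.g. my_movntdqa_label) is untouched.
# Assumption: the asm source spells the mnemonic in lowercase.
_MOVNTDQA = re.compile(r"\bmovntdqa\b")


def use_plain_loads(asm_source: str) -> str:
    """Return asm_source with every movntdqa replaced by movdqa."""
    return _MOVNTDQA.sub("movdqa", asm_source)


if __name__ == "__main__":
    sample = "movntdqa xmm0, [rsi]\nmovdqa xmm1, xmm0\n"
    print(use_plain_loads(sample), end="")
```

Read the pasted file into a string, pass it through the function, and write the result back to your own copy of the asm file. Keep a backup of the original first.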

Where is the sse2_amd64 file for AMD users located?

In /x86_64 there are only:

sha256_sse4_amd64.asm
sha256_xmm_amd64.asm

I don't see this anywhere:
sha256_sse2_amd64.asm

Oops, sorry.  The xmm version is what I meant.  I keep calling them sse2 and sse4 in my head because it helps me keep the two sets of instructions apart.  I'm looking for places to use SSE3 instructions to run math calculations on several dwords simultaneously, but I would have to restructure the entire program to take advantage of them, and even then I'm not sure whether it would work better or worse.

AMD Phenom X6:

sse2          17.4 MHash/s
fixed sse2    18.0 MHash/s
4way          20.4 MHash/s
sse4          illegal instruction
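
The illegal instruction on the sse4 row is most likely just the CPU lacking SSE4.1. On Linux you can check before picking that algorithm with a rough Python sketch like the one below. It assumes the kernel lists the feature as sse4_1 on the flags line of /proc/cpuinfo, and has_sse41 is a name made up for this example.

```python
def has_sse41(cpuinfo_text: str) -> bool:
    """True if any 'flags' line in /proc/cpuinfo-style text lists sse4_1."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Compare whole flag names so that AMD's sse4a is not mistaken
        # for SSE4.1.
        if key.strip() == "flags" and "sse4_1" in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("SSE4.1 supported:", has_sse41(f.read()))
    except OSError:
        # Not Linux, or /proc is not mounted.
        print("/proc/cpuinfo not available on this system")
```

If it reports no SSE4.1, stick with the sse2 (xmm) or 4way algorithm.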


Edit:
4way works in 1.4 and earlier.
In 1.5 and later it works too (same speed), but everything is rejected.