Post
Topic
Board Announcements (Altcoins)
Re: [ANN] [SKC] Skeincoin | Skein-SHA2 | CPU mining | GPU miner available
by
madjihad
on 02/01/2014, 20:22:28 UTC
Yes, the W[] array is moved to registers by the compiler on GCN, but apparently on VLIW it is not, and it ends up in global memory, which is slow. This can be improved, of course: first of all, the array does not have to be 62 elements long; 16 elements are enough if you reuse them. I just wonder how you managed to compile sha256_res(sha256_res()): it takes a uint16 vector as its parameter but returns only one uint.
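
To illustrate the 16-element idea, here is a minimal sketch in plain C (not the kernel from this thread, and not OpenCL) of a SHA-256 block compression that keeps the message schedule in a 16-word circular buffer instead of a full-length W[] array. The function name sha256_compress16 and its signature are my own assumptions for illustration; only the round constants and the schedule recurrence come from the SHA-256 specification.

```c
#include <stdint.h>

/* SHA-256 round constants (FIPS 180-4). */
static const uint32_t K[64] = {
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
};

static uint32_t rotr(uint32_t x, unsigned n) { return (x >> n) | (x << (32 - n)); }

/* Hypothetical helper: compress one 64-byte block, given as 16 big-endian
 * words, into state[8]. Only 16 schedule words are ever kept alive. */
void sha256_compress16(uint32_t state[8], const uint32_t block[16])
{
    uint32_t w[16];
    uint32_t a = state[0], b = state[1], c = state[2], d = state[3];
    uint32_t e = state[4], f = state[5], g = state[6], h = state[7];

    for (int i = 0; i < 16; i++) w[i] = block[i];

    for (int t = 0; t < 64; t++) {
        if (t >= 16) {
            /* Slot t & 15 still holds W[t-16]; overwrite it in place with W[t]. */
            uint32_t w15 = w[(t + 1) & 15];   /* W[t-15] */
            uint32_t w2  = w[(t + 14) & 15];  /* W[t-2]  */
            uint32_t s0 = rotr(w15, 7) ^ rotr(w15, 18) ^ (w15 >> 3);
            uint32_t s1 = rotr(w2, 17) ^ rotr(w2, 19) ^ (w2 >> 10);
            w[t & 15] += s0 + w[(t + 9) & 15] + s1;  /* + W[t-7] */
        }
        uint32_t S1  = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25);
        uint32_t ch  = (e & f) ^ (~e & g);
        uint32_t t1  = h + S1 + ch + K[t] + w[t & 15];
        uint32_t S0  = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22);
        uint32_t maj = (a & b) ^ (a & c) ^ (b & c);
        uint32_t t2  = S0 + maj;
        h = g; g = f; f = e; e = d + t1;
        d = c; c = b; b = a; a = t1 + t2;
    }

    state[0] += a; state[1] += b; state[2] += c; state[3] += d;
    state[4] += e; state[5] += f; state[6] += g; state[7] += h;
}
```

Whether a given OpenCL compiler actually keeps such a buffer in registers on VLIW hardware is a separate question; fully unrolling the loop so that every index is a compile-time constant usually helps, but that is an assumption to verify on the target device.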

I've tried both
Code:
(sha256_res((uint16)sha256_res(as_uint16(skein512_mid_impl(state, msg)))) & 0xf0ffffff)
and
Code:
(sha256_res(sha256_res(as_uint16(skein512_mid_impl(state, msg)))) & 0xf0ffffff)

Both compile, though they probably just produce wrong results. That is still enough for a test, since sha256_res runs twice, perhaps only with the wrong input on the second run :)
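
For reference, a correct second pass would need the whole 8-word digest of the first pass, not a single uint. A rough sketch in plain C, assuming the second pass is meant to hash the 32-byte digest as one padded SHA-256 block of big-endian words; the helper name sha256_pad_digest is hypothetical and is not part of the kernel discussed here:

```c
#include <stdint.h>

/* Hypothetical helper: build the single 16-word block that a second
 * SHA-256 pass consumes when its message is a 32-byte digest.
 * Layout follows standard SHA-256 padding for a 256-bit message. */
void sha256_pad_digest(uint32_t block[16], const uint32_t digest[8])
{
    for (int i = 0; i < 8; i++)
        block[i] = digest[i];       /* the 256-bit message itself */
    block[8] = 0x80000000u;         /* a single 1 bit, then zeros */
    for (int i = 9; i < 15; i++)
        block[i] = 0;
    block[15] = 256;                /* message length in bits (high word, block[14], is 0) */
}
```

With sha256_res returning only one uint, the outer call cannot receive this input, which is why the chained version above is only useful as a timing test.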

Besides, double Skein runs at 780 MH/s on a 5870, so SHA256 is the current bottleneck for sure. With a good SHA implementation we will be able to reach even better performance than SHA256D :D