#define SPH_COMPACT_BLAKE_64 1 seems to give just a tiny bit more hashrate.
Edited all .cl's; running experiment for uptime & stability; will report back.
Appears to add around 50Kh/s on x11.
Seems like the gains break down roughly as:
#define SPH_LUFFA_PARALLEL 1 = 2%
#define SPH_COMPACT_BLAKE_64 1 = 1%
#define SPH_KECCAK_UNROLL 6 = 1%
substituted loops in groestl.cl = ~5%
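
For anyone wondering what a switch like SPH_KECCAK_UNROLL actually changes, here is a minimal sketch in plain C. It is not the real keccak.cl code: TOY_UNROLL, TOY_ROUNDS, toy_round and the toy_permute functions are all made-up names for illustration. The idea is only that the define picks, at compile time, between one round per loop iteration and six rounds per iteration, and both builds must produce the same result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a switch like SPH_KECCAK_UNROLL. */
#ifndef TOY_UNROLL
#define TOY_UNROLL 6
#endif

#define TOY_ROUNDS 24

/* Toy round function: mixes in the round index, then rotates left by 13. */
static inline uint64_t toy_round(uint64_t s, unsigned r)
{
    s ^= (uint64_t)r * 0x9E3779B97F4A7C15ULL;
    return (s << 13) | (s >> 51);
}

/* Reference version: one round per loop iteration. */
static uint64_t toy_permute_rolled(uint64_t s)
{
    for (unsigned r = 0; r < TOY_ROUNDS; r++)
        s = toy_round(s, r);
    return s;
}

/* Version selected by the define: six rounds per iteration when
 * TOY_UNROLL is 6, otherwise the same rolled loop as above. */
static uint64_t toy_permute(uint64_t s)
{
#if TOY_UNROLL == 6
    for (unsigned r = 0; r < TOY_ROUNDS; r += 6) {
        s = toy_round(s, r + 0);
        s = toy_round(s, r + 1);
        s = toy_round(s, r + 2);
        s = toy_round(s, r + 3);
        s = toy_round(s, r + 4);
        s = toy_round(s, r + 5);
    }
#else
    for (unsigned r = 0; r < TOY_ROUNDS; r++)
        s = toy_round(s, r);
#endif
    return s;
}

/* Sanity check: the unrolled build must hash identically to the rolled one. */
static uint64_t toy_check(uint64_t seed)
{
    uint64_t out = toy_permute(seed);
    assert(out == toy_permute_rolled(seed));
    return out;
}
```

Whether the partially unrolled form is faster depends on the compiler and the GPU (register pressure, instruction cache), which would fit with the gains differing between architectures.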
I wonder if this is only for us on these Hawaii architectures.
Would it be worth us submitting a pull request to edit/update these four .cl files, or would it break compatibility for Tahitis & Pitcairns?
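
One way a pull request could avoid changing behaviour for other cards is to put the tuned values behind a guard, so Tahiti and Pitcairn builds keep whatever they use today. A rough sketch, with loud assumptions: TUNE_HAWAII is a hypothetical macro that the host would have to pass when building the kernel (e.g. as a -D build option), and the 0 values in the #else branch are placeholders for the current defaults, which should be copied from the actual .cl files rather than trusted from here.

```c
/* TUNE_HAWAII is hypothetical: nothing defines it today. */
#ifdef TUNE_HAWAII
/* Values reported in this thread as faster on Hawaii. */
#define SPH_LUFFA_PARALLEL   1
#define SPH_COMPACT_BLAKE_64 1
#define SPH_KECCAK_UNROLL    6
#else
/* Placeholder defaults (assumed); keep the existing values here. */
#define SPH_LUFFA_PARALLEL   0
#define SPH_COMPACT_BLAKE_64 0
#define SPH_KECCAK_UNROLL    0
#endif

/* Returns 1 if this build picked up the Hawaii tuning, 0 otherwise. */
static int tuned_build(void)
{
    return SPH_LUFFA_PARALLEL == 1 &&
           SPH_COMPACT_BLAKE_64 == 1 &&
           SPH_KECCAK_UNROLL == 6;
}
```

The groestl.cl loop substitution is not a define, so it would need its own #ifdef around the old and new loop bodies to get the same opt-in behaviour.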