Submitted another speedup for quark/x11: +50-100 kH/s on the GTX 970.
Not sure about other cards. Please test.
(More of blake512-80 is now precalculated.)
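For anyone wondering what "precalculated" means here, below is a rough host-side sketch in C. It is not the actual ccminer kernel. It assumes the nonce sits in the low 32 bits of message word 9 of the 80-byte blake512 block, so everything in round 0 that never reads m[9] can be computed once per work unit instead of once per nonce. The rotation amounts follow the BLAKE-512 G function; the constants K, the start state, and all function names are placeholders made up for the illustration.

```c
#include <stdint.h>
#include <string.h>

/* Placeholder round constants: arbitrary values, NOT the real BLAKE-512 table. */
static const uint64_t K[16] = {
    0x1111111111111111ULL, 0x2222222222222222ULL, 0x3333333333333333ULL, 0x4444444444444444ULL,
    0x5555555555555555ULL, 0x6666666666666666ULL, 0x7777777777777777ULL, 0x8888888888888888ULL,
    0x9999999999999999ULL, 0xaaaaaaaaaaaaaaaaULL, 0xbbbbbbbbbbbbbbbbULL, 0xccccccccccccccccULL,
    0xddddddddddddddddULL, 0xeeeeeeeeeeeeeeeeULL, 0x0123456789abcdefULL, 0xfedcba9876543210ULL
};

/* State words touched by the four diagonal G calls (G4..G7). */
static const int DIAG[4][4] = {{0, 5, 10, 15}, {1, 6, 11, 12}, {2, 7, 8, 13}, {3, 4, 9, 14}};

static uint64_t rotr64(uint64_t x, unsigned n) { return (x >> n) | (x << (64 - n)); }

/* First half of a G step: consumes one (message ^ constant) word. */
static void g_half1(uint64_t v[16], int a, int b, int c, int d, uint64_t mx) {
    v[a] += v[b] + mx;
    v[d] = rotr64(v[d] ^ v[a], 32);
    v[c] += v[d];
    v[b] = rotr64(v[b] ^ v[c], 25);
}

/* Second half of a G step: consumes the other word. */
static void g_half2(uint64_t v[16], int a, int b, int c, int d, uint64_t my) {
    v[a] += v[b] + my;
    v[d] = rotr64(v[d] ^ v[a], 16);
    v[c] += v[d];
    v[b] = rotr64(v[b] ^ v[c], 11);
}

/* Column step of round 0: reads m[0..7] only, so it never sees the nonce. */
static void column_step(uint64_t v[16], const uint64_t m[16]) {
    for (int i = 0; i < 4; i++) {
        g_half1(v, i, 4 + i, 8 + i, 12 + i, m[2 * i] ^ K[2 * i + 1]);
        g_half2(v, i, 4 + i, 8 + i, 12 + i, m[2 * i + 1] ^ K[2 * i]);
    }
}

/* Reference: the whole of round 0, recomputed for every nonce. */
static void round0_full(uint64_t v[16], const uint64_t m[16]) {
    column_step(v, m);
    for (int i = 0; i < 4; i++) {
        const int *q = DIAG[i];
        g_half1(v, q[0], q[1], q[2], q[3], m[8 + 2 * i] ^ K[9 + 2 * i]);
        g_half2(v, q[0], q[1], q[2], q[3], m[9 + 2 * i] ^ K[8 + 2 * i]);
    }
}

/* Once per work unit: every step of round 0 that never reads m[9]. */
static void round0_precalc(uint64_t v[16], const uint64_t m[16]) {
    column_step(v, m);
    g_half1(v, 0, 5, 10, 15, m[8] ^ K[9]);      /* first half of G4 uses m[8] only */
    for (int i = 1; i < 4; i++) {                /* G5..G7 touch other state words */
        const int *q = DIAG[i];
        g_half1(v, q[0], q[1], q[2], q[3], m[8 + 2 * i] ^ K[9 + 2 * i]);
        g_half2(v, q[0], q[1], q[2], q[3], m[9 + 2 * i] ^ K[8 + 2 * i]);
    }
}

/* Once per nonce: the only part of round 0 that depends on m[9]. */
static void round0_finish(uint64_t v[16], uint64_t m9) {
    g_half2(v, 0, 5, 10, 15, m9 ^ K[8]);
}

/* Returns 1 if the split computation equals the full one for this nonce. */
static int precalc_matches_full(uint32_t nonce) {
    uint64_t m[16], ref[16], pre[16];
    for (int i = 0; i < 16; i++) {
        m[i]   = 0x0123456789abcdefULL * (uint64_t)(i + 1);  /* made-up header words */
        ref[i] = 0x9e3779b97f4a7c15ULL * (uint64_t)(i + 1);  /* made-up start state */
    }
    m[9] = (m[9] & 0xffffffff00000000ULL) | nonce;           /* assumed nonce position */
    memcpy(pre, ref, sizeof pre);
    round0_full(ref, m);
    round0_precalc(pre, m);
    round0_finish(pre, m[9]);
    return memcmp(ref, pre, sizeof ref) == 0;
}
```

In a real miner the output of something like round0_precalc would be computed on the host once per job and uploaded as constants, and each GPU thread would start from that state and run only the nonce-dependent remainder.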
The speedup is no more than 30-40 kH/s on quark. But lyra2rev2 is sooo good!
Yes, I submitted another speedup for lyra2v2 as well. Most of the first round and some of the second round of blake256 are now precalculated (100 assembly instructions removed and replaced by 16 32-bit constant-memory reads). The first round of Keccak256 is unrolled, and 24 XOR instructions were removed (x ^ 0 = x).
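A small C sketch of the x ^ 0 = x idea, for illustration only (not the actual kernel code). It assumes Keccak256 is fed a single 32-byte input with the original Keccak padding, so after absorbing the block only lanes 0..4 and lane 16 of the 25-lane state are nonzero. Once the first round is unrolled, the theta column parities can skip every XOR against a lane known to be zero. The function names are made up, and the XOR count in this sketch is not meant to match the 24 mentioned above.

```c
#include <stdint.h>
#include <string.h>

/* Absorb one 32-byte message (4 lanes) into an all-zero Keccak state.
   Assumes rate = 136 bytes (17 lanes) and original Keccak padding:
   0x01 right after the message, 0x80 in the last byte of the rate. */
static void load_block32(uint64_t s[25], const uint64_t msg[4]) {
    memset(s, 0, 25 * sizeof(uint64_t));
    for (int i = 0; i < 4; i++) s[i] = msg[i];
    s[4]  = 0x0000000000000001ULL;   /* first padding byte */
    s[16] = 0x8000000000000000ULL;   /* last padding byte  */
}

/* Generic theta column parity: 4 XORs per column, 20 in total. */
static void theta_parity_generic(const uint64_t s[25], uint64_t c[5]) {
    for (int x = 0; x < 5; x++)
        c[x] = s[x] ^ s[x + 5] ^ s[x + 10] ^ s[x + 15] ^ s[x + 20];
}

/* First-round version: lanes 5..15 and 17..24 are zero, so only one XOR
   survives (lane 16 sits in column 1). */
static void theta_parity_first_round(const uint64_t s[25], uint64_t c[5]) {
    c[0] = s[0];
    c[1] = s[1] ^ s[16];
    c[2] = s[2];
    c[3] = s[3];
    c[4] = s[4];
}

/* Returns 1 if both versions agree for a message derived from seed. */
static int theta_shortcut_matches(uint64_t seed) {
    uint64_t msg[4], s[25], a[5], b[5];
    for (int i = 0; i < 4; i++)
        msg[i] = seed * 0x9e3779b97f4a7c15ULL + (uint64_t)i;  /* made-up input */
    load_block32(s, msg);
    theta_parity_generic(s, a);
    theta_parity_first_round(s, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

The shortcut is only valid for the first round: after one full permutation round the zero lanes are filled in, so later rounds need the generic form.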
More speedups are coming.