KANGAROO 254 bit
------------------
I spent the whole day today migrating Kangaroo 2.2 (JLP, 125-bit) to 254 bits.
I kept the DP at only 64 bits, as in the original, to avoid huge changes. I think that should be enough? (No need for a 254-bit mask?)
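
To make the DP question concrete, here is a minimal sketch of a distinguished-point test that only looks at one 64-bit limb. It assumes x is stored as four 64-bit limbs, least significant first, and that a point counts as distinguished when its leading dpBits bits are zero. The names X256, makeDpMask and isDistinguished are hypothetical and not taken from the Kangaroo source.

```cpp
#include <cstdint>

// Hypothetical 256-bit container: four 64-bit limbs, bits64[0] is least significant.
struct X256 {
    uint64_t bits64[4];
};

// Build a mask with the top dpBits bits set (dpBits is clamped to 0..64).
constexpr uint64_t makeDpMask(unsigned dpBits) {
    return dpBits == 0 ? 0ULL
         : dpBits >= 64 ? ~0ULL
         : (~0ULL << (64 - dpBits));
}

// Assumption: a point is "distinguished" when the leading dpBits bits
// of the most significant limb of x are all zero.
constexpr bool isDistinguished(const X256& x, unsigned dpBits) {
    return (x.bits64[3] & makeDpMask(dpBits)) == 0;
}
```

Under these assumptions the test never reads more than one limb, so the mask width does not depend on whether the key range is 125 or 254 bits. It would only become a limit if more than 64 DP bits were ever needed.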
It seems like a "huge" success, but of course I have some questions about the code, in case anybody knows the answers. (In any case, it behaves consistently.)
I noticed that the hash in the code is just bits 191..128 of the x value, masked with the 17 LSBs. Am I right, or did I overlook something?
If the hash is a 64-bit variable, why is it then masked down to the 17 LSBs?
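
Here is a minimal sketch of the behaviour described above, assuming x is stored as four 64-bit limbs (least significant first) and that the masked value is used as a bucket index into a table of 2^hashBits entries. That last part is my assumption about why the mask is there, not something I confirmed in the source. The names X256 and hashBucket are hypothetical, and hashBits = 17 just matches the figure I quoted.

```cpp
#include <cstdint>

// Hypothetical 256-bit container: four 64-bit limbs, bits64[0] is least significant.
struct X256 {
    uint64_t bits64[4];
};

// Take bits 191..128 of x (limb 2) and keep only the low hashBits bits.
// Assumes 0 < hashBits < 64. The result fits a table of 2^hashBits buckets.
constexpr uint64_t hashBucket(const X256& x, unsigned hashBits) {
    return x.bits64[2] & ((1ULL << hashBits) - 1);
}
```

If that is the intent, the variable can be 64 bits wide simply because it is loaded from a 64-bit limb, while the mask limits it to the number of buckets in the table.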
The code is updated for CPU, but GPU should not be affected at all. I have no CUDA compiler, so I cannot try compiling it for CUDA.
If anybody is interested, I can upload the code to git (but only next Tuesday; right now I'm rushing off on holidays...).