Sad that we cannot see the diff against JLP's original Kangaroo tool. Has anyone dived into mikorist's code who can share some thoughts about the changes and implementation routines?

I didn't have time to go through all the files, but I did go through the SECPK1 folder. Everything has been changed to work for 256-bit when the CPU is used. I think this version is about 15-20% faster than the original in CPU mode. I didn't look at how the GPU is used. The code is a mix of ZenulAbidin's and AlbertTajuelo's work.