OK, but I don't understand why you're not releasing the code so that it can be run on Linux.

If (or when) you stop developing, all the optimizations will be lost because nobody else has the code to integrate them into some other fork!

It's a work in progress. I might release it when I am happy with the results. In the next version I will include the PTX assembly code, so it might work on compute capability 8.0 cards. #3 only works for compressed keys, so #4 will also include kernels for uncompressed keys and for both types.