Now I am glad I haven't invested any time in my own version of this yet.
Interesting that they chose to put the Fermat test on the GPU instead of the sieving.
Maybe I can learn something from their approach.
In our CoinShield miner, it's actually the Fermat test that's more efficient on the CPU,
whereas the sieving was accelerated nicely by the GPU (higher memory bandwidth!).
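
For anyone following along, here is a rough sketch of the two stages in plain Python. This is not the CoinShield code or theirs; the function names, the window size and the small prime list are made up for illustration. It only shows why the two stages behave differently: the sieve is mostly strided writes into a big array (memory bound), while the Fermat test is one big modular exponentiation per candidate (arithmetic bound).

```python
def fermat_probable_prime(n: int, base: int = 2) -> bool:
    """Fermat test: n passes if base^(n-1) == 1 (mod n).

    Passing does not prove primality (341 passes for base 2),
    it only filters out most composites cheaply.
    """
    if n < 4:
        return n in (2, 3)
    return pow(base, n - 1, n) == 1


def sieve_offsets(start: int, length: int, small_primes: list[int]) -> list[int]:
    """Return offsets i in [0, length) where start + i has no factor in small_primes.

    Assumes start is larger than every prime in small_primes, so a small
    prime itself is never crossed out by mistake.
    """
    composite = bytearray(length)
    for p in small_primes:
        first = (-start) % p  # smallest offset with (start + first) divisible by p
        for i in range(first, length, p):
            composite[i] = 1  # strided writes: this is the memory-heavy part
    return [i for i in range(length) if not composite[i]]


# Hypothetical toy window: sieve first, then Fermat-test only the survivors.
candidates = [1000 + i for i in sieve_offsets(1000, 50, [2, 3, 5, 7])]
survivors = [n for n in candidates if fermat_probable_prime(n)]
```

In a real miner the numbers are hundreds of bits wide and the sieve uses far more primes, so take the structure from this and nothing else.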
Christian