Well, krnlx has a CUDA port of it, so I assume it's in kernel.cu, since there is no input.cl in his CUDA implementation. Or I'm blind, since I found these functions in kernel.cu too.
Ah, got it. Since I have no CUDA (I'm using the AMD OpenCL drivers), I've just adjusted input.cl in my case. Sorry for the confusion.