ok got past the OpenCL compiler error, now this
( replaced
// uint4 q[2] = {0, 0};
uint4 q[2] = {0, 0, 0, 0};
)
yep that's experimental

initializing uint4 as uint2 is indeed experimental
(however, it isn't obvious that OpenCL and CUDA use the same definition for these structures...)
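
for reference, a minimal sketch in plain C of what the two-value initializer does when the type is a plain struct of four unsigned ints (which is how CUDA declares uint4). `uint4_like`, `count_nonzero` and `check_initializers` are made-up names for illustration, not anything from the code being ported, and OpenCL's built-in vector type does not have to follow the same rules. Nonzero values are used so you can see where they land:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a struct-style uint4 (four unsigned ints).
   This is an assumption for illustration, not the OpenCL built-in type. */
typedef struct { unsigned int x, y, z, w; } uint4_like;

/* Count the components that are nonzero across n elements. */
static size_t count_nonzero(const uint4_like *v, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        count += (v[i].x != 0) + (v[i].y != 0) + (v[i].z != 0) + (v[i].w != 0);
    }
    return count;
}

/* Apply both initializer spellings to the struct stand-in and check
   which fields receive the listed values. */
static void check_initializers(void) {
    /* Brace elision: the two values fill a[0].x and a[0].y, not a[0] and a[1]. */
    uint4_like a[2] = {1, 2};
    /* Four values fill all of b[0]; b[1] gets no explicit initializer. */
    uint4_like b[2] = {1, 2, 3, 4};
    /* All-zero variants, as in the chat. */
    uint4_like z2[2] = {0, 0};
    uint4_like z4[2] = {0, 0, 0, 0};

    assert(a[0].x == 1 && a[0].y == 2 && a[0].z == 0 && a[0].w == 0);
    assert(count_nonzero(&a[1], 1) == 0);

    assert(b[0].x == 1 && b[0].y == 2 && b[0].z == 3 && b[0].w == 4);
    assert(count_nonzero(&b[1], 1) == 0);

    /* Fields without an explicit initializer are zero-initialized,
       so both zero spellings give an all-zero array here. */
    assert(count_nonzero(z2, 2) == 0);
    assert(count_nonzero(z4, 2) == 0);
}
```

so for a struct-style uint4 both `{0, 0}` and `{0, 0, 0, 0}` end up all zeros, they just read differently (and compilers may warn about missing braces). whether an OpenCL compiler accepts the two-value form for its built-in uint4 is a separate question.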