Thanks for publishing your repo! Appreciated.
I'm not a C programmer (or an OpenCL one, for that matter), but I'm a fan of DRY. While reading input.cl I found the get_row() function, and I think we can make it a little DRYer by computing the masks from NR_ROWS_LOG instead of repeating the if/else for each value, something like this:
uint get_row(uint round, uint xi0)
{
    uint row;
    uint swp;   /* how many bits wider than the 14-bit case */
    uint num;   /* bit mask derived from NR_ROWS_LOG */
#if NR_ROWS_LOG == 14
    swp = 0;
#elif NR_ROWS_LOG == 15
    swp = 1;
#elif NR_ROWS_LOG == 16
    swp = 2;
#else
#error "unsupported NR_ROWS_LOG"
#endif
    /* 0x40 is hexadecimal: num becomes 0x3f, 0x7f or 0xff */
    num = (0x40 << swp) - 1;
    if (!(round % 2))
        row = xi0 & (num << 8 | 0xff);
    else
        row = ((xi0 & (num << 16 | 0xf00)) >> 8) | ((xi0 & 0xf0000000) >> 24);
    return row;
}
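To sanity-check the mask arithmetic, here is a rough host-side sketch in plain C. It makes a few assumptions: the base constant is hexadecimal 0x40, the row index should be exactly NR_ROWS_LOG bits wide, and `get_row_generic` and `check_row_masks` are hypothetical names I made up for this test (they are not in input.cl). NR_ROWS_LOG is passed as a run-time argument here so all three cases can be checked in one build.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side stand-in for get_row(); nr_rows_log replaces the
   NR_ROWS_LOG preprocessor constant so 14, 15 and 16 can be tested together. */
static uint32_t get_row_generic(uint32_t nr_rows_log, uint32_t round, uint32_t xi0)
{
    uint32_t swp = nr_rows_log - 14;     /* 0, 1, 2 for 14, 15, 16 */
    uint32_t num = (0x40u << swp) - 1;   /* 0x3f, 0x7f, 0xff (assumes hex 0x40) */

    if (!(round % 2))
        return xi0 & (num << 8 | 0xffu);
    return ((xi0 & (num << 16 | 0xf00u)) >> 8) | ((xi0 & 0xf0000000u) >> 24);
}

/* With every input bit set, both branches should return a row with exactly
   nr_rows_log low bits set, i.e. (1 << nr_rows_log) - 1. */
static void check_row_masks(void)
{
    for (uint32_t n = 14; n <= 16; n++)
        for (uint32_t round = 0; round < 2; round++)
            assert(get_row_generic(n, round, 0xffffffffu) == (1u << n) - 1);
}
```

Calling `check_row_masks()` from a small `main()` should pass silently if the masks are right. It only checks the width of the result, not that the bits come from the same positions as in the current code, so it is not a substitute for running the miner.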
So, what do you think, @zawawa?
I don't know whether this is useful at all, but if you like it I can open a PR so you can merge the changes later.