It needs ">>(i%32))" I'd say - and I do not know why it works fine on x64, but the change helped to get seemingly proper targets on ARM.
Still no share submitted, so I do not know if it helped.
Thanks for spotting this. I have committed a fix to the xptMiner repository. The performance difference should be insignificant on x64/x86.
I also know there are some other places in the xpt source where memory alignment can make problems. Especially all the xptPacketbuffer_* functions have unaligned read/write access which is supported on x86/x64 but not on other platforms, but I don't know if it matters on ARM.