Ubuntu should work fine with ./buildAll.
Fedora required a bit of help (see history here).
Also some small progress with the ARM miner: the investigation led me to a part of the code
that I had already marked with "?" a long time ago, when trying to understand some old miner ...
It contains:
    uint32* powHashU32 = (uint32*)powHash;
    for(uint32 i=0; i<256; i++)
    {
        mpz_mul_2exp(z_target, z_target, 1);
        if( (powHashU32[i/32]>>(i))&1 )
            z_target->_mp_d[0]++;
    }
It needs ">>(i%32)", I'd say. I do not know why it works fine on x64, but the change helped to get seemingly proper targets on ARM.
Still no share submitted, so I do not know if it helped.
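For reference, here is a minimal sketch of the bit extraction with the shift count kept in range. It assumes the hash is laid out as eight 32-bit words, as in the loop above; hash_bit is a hypothetical helper name and not something from the miner source.

```c
#include <assert.h>
#include <stdint.h>

/* Return bit i (0..255) of a 256-bit hash stored as eight 32-bit words.
   hash_bit is a hypothetical helper, not part of the miner. */
static unsigned hash_bit(const uint32_t words[8], unsigned i)
{
    assert(i < 256);
    /* i / 32 selects the word, i % 32 selects the bit inside that word,
       so the shift count always stays in the range 0..31. */
    return (words[i / 32] >> (i % 32)) & 1u;
}
```

With this helper the condition in the loop would read "if( hash_bit(powHashU32, i) )", assuming uint32 in the miner is a 32-bit unsigned type.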
Thank you. Testing and thinking hard about this - this was part of the code I inherited. I'm going to run it past jh as well, because it likely reflects a bug in the base xptMiner as well. I'll commit this fix tomorrow if all is good.
-Dave
Looks good from here. Committed now and will back it out if jh says it's wrong. Thanks again for spotting this.
I found this here:
http://stackoverflow.com/questions/3394259/weird-behavior-of-right-shift-operator
The logical right shift (SHR) behaves like a >> (b % 32/64) on x86/x86-64 (Intel #253667, Page 4-404):
The destination operand can be a register or a memory location. The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used). A special opcode encoding is provided for a count of 1.
However, on ARM (armv6&7, at least), the logical right-shift (LSR) is implemented as (ARMISA Page A2-6)
    (bits(N), bit) LSR_C(bits(N) x, integer shift)
        assert shift > 0;
        extended_x = ZeroExtend(x, shift+N);
        result = extended_x<shift+N-1:shift>;
        carry_out = extended_x<shift-1>;
        return (result, carry_out);
where (ARMISA Page AppxB-13)
ZeroExtend(x,i) = Replicate('0', i-Len(x)) : x
This guarantees a right shift of ≥32 will produce zero. For example, when this code is run on the iPhone, foo(1,32) will give 0.
This shows that shifting a 32-bit integer by ≥32 is non-portable.
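The difference can be sketched in portable C by modelling both behaviours explicitly instead of relying on an out-of-range shift. The function names shr_x86_model and lsr_arm_model are made up for illustration, and the ARM model only covers register counts from 0 to 255.

```c
#include <assert.h>
#include <stdint.h>

/* Model of x86 SHR on a 32-bit operand: the count is masked to 5 bits. */
static uint32_t shr_x86_model(uint32_t x, unsigned count)
{
    return x >> (count & 31u);
}

/* Model of ARM LSR with a register count in 0..255:
   a count of 32 or more shifts every bit out and gives 0. */
static uint32_t lsr_arm_model(uint32_t x, unsigned count)
{
    assert(count < 256);
    return count >= 32 ? 0u : x >> count;
}
```

For a count of 33 and x = 2, the x86 model gives 1 (it shifts by 1) while the ARM model gives 0, which would explain why ">> i" with i up to 255 appeared to work on x64 but not on ARM.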
So ">> i" may run faster than ">> (i % 32)" on x86 or x86_64 because the % is optimized out, but it is not a good idea: it is not portable, and >> with a count greater than or equal to the operand width is undefined according to the C standard. Since the miner runs this loop only once for each search of the 256-bit nonce, you can do i % 32 without any harm.
Maybe a bitwise AND instead of the % would be faster? Of course you should profile instead of believing me, but I think optimizing this is not worth the trouble.
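As a small sketch of that point: for an unsigned index the two forms give the same result, and compilers typically emit the same instruction for both, so the choice is mostly a matter of taste. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position inside a 32-bit word, written with the remainder operator. */
static unsigned bit_index_mod(unsigned i) { return i % 32u; }

/* The same position written with a bitwise AND; equal to i % 32 for unsigned i. */
static unsigned bit_index_and(unsigned i) { return i & 31u; }
```

Note that the equivalence holds for unsigned values only; for a negative signed i, i % 32 and i & 31 can differ.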
gatra