Post
Topic
Board Announcements (Altcoins)
Re: [ANN][RIC] Riecoin, new prime numbers POW coin, NEW 0.9.1 CLIENT
by
gatra
on 15/05/2014, 18:44:10 UTC
ubuntu should work fine with ./buildAll

fedora required a bit of help (see history here).

Also, some small progress with the ARM miner: the investigation led me to a part of the code that I had already marked with "?" a long time ago, when trying to understand some old miner ...

Given this loop:

    uint32* powHashU32 = (uint32*)powHash;

    for(uint32 i=0; i<256; i++)
    {
        mpz_mul_2exp(z_target, z_target, 1);
        if( (powHashU32[i/32]>>(i))&1 )
            z_target->_mp_d[0]++;
    }

It needs ">>(i%32)" I'd say. I do not know why it works fine on x64, but the change helped to get seemingly proper targets on ARM.
Still no share submitted, so I do not know yet whether it really helped.
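
For illustration, here is a minimal sketch of the bit extraction with the shift count reduced modulo 32. It leaves out the GMP part and assumes the hash is stored as eight 32-bit words, as in the cast above; the helper name hash_bit is hypothetical and not part of the miner.

```c
#include <stdint.h>

/* Hypothetical helper (not from the miner): returns bit i (0..255) of a
   256-bit hash stored as eight 32-bit words. */
static unsigned hash_bit(const uint32_t *words, unsigned i)
{
    /* i / 32 selects the word; i % 32 keeps the shift count in 0..31,
       so the shift is well defined on every architecture. */
    return (words[i / 32] >> (i % 32)) & 1u;
}
```

With this, the condition in the loop would read "if (hash_bit(powHashU32, i))", and the result no longer depends on how the CPU treats oversized shift counts.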


Thank you.  Testing and thinking hard about this - this was part of the code I inherited.  I'm going to run it past jh as well, because it likely reflects a bug in the base xptMiner as well.  I'll commit this fix tomorrow if all is good.

  -Dave

Looks good from here.  Committed now and will back it out if jh says it's wrong.   Thanks again for spotting this.

I found this here: http://stackoverflow.com/questions/3394259/weird-behavior-of-right-shift-operator

Quote
The logical right shift (SHR) behaves like a >> (b % 32/64) on x86/x86-64 (Intel #253667, Page 4-404):

The destination operand can be a register or a memory location. The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used). A special opcode encoding is provided for a count of 1.

However, on ARM (armv6&7, at least), the logical right-shift (LSR) is implemented as (ARMISA Page A2-6)

(bits(N), bit) LSR_C(bits(N) x, integer shift)
    assert shift > 0;
    extended_x = ZeroExtend(x, shift+N);
    result = extended_x<shift+N-1:shift>;
    carry_out = extended_x<shift-1>;
    return (result, carry_out);
where (ARMISA Page AppxB-13)

ZeroExtend(x,i) = Replicate('0', i-Len(x)) : x
This guarantees a right shift of ≥32 will produce zero. For example, when this code is run on the iPhone, foo(1,32) will give 0.

This shows that shifting a 32-bit integer by ≥32 is non-portable.
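
If a shift by 32 or more really is wanted, one way to get the same result everywhere is to handle the large counts explicitly. This is only a sketch; the helper name shr32 is made up for the example, and it picks the ARM-style result (zero) for counts of 32 or more.

```c
#include <stdint.h>

/* Hypothetical helper: logical right shift with a defined result for any
   count. Counts of 32 or more yield 0, as the ARM LSR pseudocode does;
   x86 would instead mask the count to 5 bits. */
static uint32_t shr32(uint32_t x, unsigned count)
{
    return count < 32 ? x >> count : 0u;
}
```

The comparison guarantees that the built-in >> only ever sees a count from 0 to 31, which is the range the C standard defines for a 32-bit operand.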

So ">> i" may run faster than ">> (i % 32)" on x86 or x86_64, because the CPU masks the shift count itself and the % can be left out, but it is not a good idea: it is not portable, and shifting by a count greater than or equal to the width of the operand is undefined according to the C standard. Since the miner runs this loop only once for each search of the 256-bit nonce, you can do i%32 without any harm.
Maybe a bitwise AND (i & 31) instead of the % would be faster? Of course you should profile instead of believing me, but I think optimizing this is not worth the trouble.
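
To make the AND idea concrete, here is a small sketch with two hypothetical helpers (the names are mine). For an unsigned index they return the same value, because 32 is a power of two; I would expect most compilers to generate the same code for both, but that is an assumption to check with a profiler or the generated assembly.

```c
#include <stdint.h>

/* Hypothetical helpers: two ways to reduce a bit index to a shift count
   in the range 0..31. */
static unsigned shift_mod(unsigned i)
{
    return i % 32u;   /* remainder of division by 32 */
}

static unsigned shift_mask(unsigned i)
{
    return i & 31u;   /* keep the low 5 bits, same result for unsigned i */
}
```

Either form keeps the shift inside the defined range, so the choice is a matter of taste unless profiling says otherwise.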

gatra