Post
Topic
Board Mining software (miners)
Re: further improved phatk OpenCL Kernel (> 4% increase) for Phoenix - 2011-08-04
by Phateus on 09/08/2011, 20:20:46 UTC


1. AND the low 16-bits of H against the high 16 bits
2. Take the resulting 16-bit number and OR the low 8 bits against the high 8-bits
3. Take the resulting 8-bit number and OR the low 4 bits against the high 4-bits
4. Take the resulting 4-bit number and OR the low 2 bits against the high 2-bits
5. Take the resulting 2-bit number and NOR the first bit against the second bit

6. do bitwise AND of the resulting 1-bit number against the nonce
7. take the result from #6 and XOR the low 16-bits against the high 16-bits
8. take the resulting 16-bit number from #7 and OR the low 8-bits against the high 8-bits
9. store the result by doing output[OUTPUT_SIZE] = OUTPUT[result of #8] = nonce

Steps 1-5 create a single bit indicating if the nonce meets H == 0. When you bitwise AND this against the nonce in step 6 you will get 0 for any invalid nonces and for valid nonces you will just get the nonce again. (1 AND X = X)
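
For reference, steps 1-6 can be sketched in plain C roughly as below. This is only an illustration, not kernel code: the names is_zero_bit and masked_nonce are made up here, step 1 is taken as an OR of the two halves, each high half is shifted down before it is combined, and the single bit is widened to a full 32-bit mask so that the AND in step 6 really returns either the whole nonce or 0.

```c
#include <stdint.h>

/* Steps 1-5: fold the 32-bit hash word h down to one bit by ORing halves.
   (Hypothetical helper name; step 1 is an OR here.) */
static uint32_t is_zero_bit(uint32_t h)
{
    uint32_t r = (h & 0x0000FFFFu) | (h >> 16); /* 32 -> 16 bits */
    r = (r & 0x00FFu) | (r >> 8);               /* 16 -> 8 bits  */
    r = (r & 0x0Fu) | (r >> 4);                 /* 8 -> 4 bits   */
    r = (r & 0x3u) | (r >> 2);                  /* 4 -> 2 bits   */
    r = ((r & 0x1u) | (r >> 1)) ^ 0x1u;         /* 1-bit NOR     */
    return r;                                   /* 1 if h == 0, else 0 */
}

/* Step 6: widen the bit to a full mask (0 - 1 wraps to 0xFFFFFFFF),
   so the AND yields the nonce for a hit and 0 otherwise. */
static uint32_t masked_nonce(uint32_t h, uint32_t nonce)
{
    uint32_t mask = 0u - is_zero_bit(h);
    return nonce & mask;
}
```

With h == 0 the mask is all ones and the nonce passes through unchanged; with any other h the mask is 0 and so is the result.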


I don't claim to understand this, but step (1) should be an OR, not an AND.


Yeah that's right. Must have missed that when I went over the post. I had it correct in the example though.

I tried to implement this, but the kernel crashes the display driver so hard that I get a bluescreen every time ... weird.

Code:
// Round 124
Vals[7] += Vals[3] + P4(124) + P3(124) + P1(124) + P2(124) + ch(124) + s1(124) + H[7];

...

// lo 16 Bits OR hi 16 Bits
uint positive = (Vals[7].x & 0x0000FFFFU) | (Vals[7].x & 0xFFFF0000U);
// lo 8 Bits OR hi 8 Bits
positive = (positive & 0x00FFU) | (positive & 0xFF00U);
// lo 4 Bits OR hi 4 Bits
positive = (positive & 0x0FU) | (positive & 0xF0U);
// lo 2 Bits OR hi 2 Bits
positive = (positive & 0x3U) | (positive & 0xCU);
// lo 1 Bit NOR hi 1 Bit
positive = ~((positive & 0x1U) | (positive & 0x2U));

// nonce AND positive
uint position = W_3.x & positive;
// lo 16 Bits XOR hi 16 Bits
position = (position & 0x0000FFFFU) ^ (position & 0xFFFF0000U);
// lo 8 Bits OR hi 8 Bits
position = (position & 0x00FFU) | (position & 0xFF00U);

output[position] = W_3.x;

Dia

You need to shift the bits for each stage:

For example, ORing the top bits into the bottom bits should be:

Code:
uint positive = (Vals[7].x & 0x0000FFFFU) | ((Vals[7].x & 0xFFFF0000U) >> 16);
or just:
Code:
uint positive = (Vals[7].x & 0x0000FFFFU) | (Vals[7].x >> 16);
because the upper 16 bits will already be 0 after the shift.

Without the shift, you will just get the original Vals[7] value back.
If you want to do it that way, the code would be:
Code:
uint positive = (Vals[7].x & 0x0000FFFFU) | (Vals[7].x >> 16);
// lo 8 Bits OR hi 8 Bits
positive = (positive & 0x00FFU) | (positive >> 8);
// lo 4 Bits OR hi 4 Bits
positive = (positive & 0x0FU) | (positive >> 4);
// lo 2 Bits OR hi 2 Bits
positive = (positive & 0x3U) | (positive >> 2);
// lo 1 Bit NOR hi 1 Bit, widened to a full mask (0 - 1 wraps to 0xFFFFFFFF, 1 - 1 = 0)
positive = ((positive & 0x1U) | (positive >> 1)) - 1U;

However, similar to what I said earlier, the following code does the same thing:
Code:
uint positive = 0xFFFFFFFF + min(Vals[7].x, 1u);
If Vals[7].x == 0, then min(Vals[7].x, 1u) == 0; otherwise it equals 1:
0xFFFFFFFF + 0 = 0xFFFFFFFF
0xFFFFFFFF + 1 = 0
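
The wraparound can be checked on the CPU with a small plain-C sketch. The helper min_u32 is a made-up stand-in for the OpenCL min() builtin on a scalar uint, and zero_mask is just an illustrative name:

```c
#include <stdint.h>

/* Plain-C stand-in for the OpenCL min(x, 1u) builtin on a scalar uint. */
static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* 0xFFFFFFFF when h == 0, otherwise 0 (unsigned addition wraps mod 2^32). */
static uint32_t zero_mask(uint32_t h)
{
    return 0xFFFFFFFFu + min_u32(h, 1u);
}
```

ANDing a nonce with zero_mask(h) then gives the nonce back when h is 0 and gives 0 for every other h, with no branch.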


Oh yeah... you are getting blue screens because your address would be a random 32-bit number, and it was probably trying to access memory that your video card doesn't have.
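
To illustrate the indexing problem, here is a plain-C sketch of steps 7 and 8 with and without the shifts. The function names are made up, and it assumes the output buffer is only meant to be indexed by an 8-bit value; the real buffer size is whatever OUTPUT_SIZE is in the kernel.

```c
#include <stdint.h>

/* Steps 7-8 with shifts: XOR the 16-bit halves, then OR the two bytes.
   The result always fits in 8 bits (0..255). */
static uint32_t slot_shifted(uint32_t masked_nonce)
{
    uint32_t p = (masked_nonce & 0x0000FFFFu) ^ (masked_nonce >> 16);
    return (p & 0x00FFu) | (p >> 8);
}

/* Same steps without shifts: the halves never overlap, so the XOR just
   rebuilds the input and the last line only keeps its low 16 bits. */
static uint32_t slot_unshifted(uint32_t masked_nonce)
{
    uint32_t p = (masked_nonce & 0x0000FFFFu) ^ (masked_nonce & 0xFFFF0000u);
    return (p & 0x00FFu) | (p & 0xFF00u);
}
```

In this sketch the unshifted version can return anything up to 0xFFFF, so a write through that index can land far outside a small output buffer, while the shifted version stays within 0..255.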