Well, I had used LX150_Test, and I don't have an LX150 dev board, so I think I will just share my ideas here.
Oops, sorry, LX150_Test isn't really usable at the moment. I really need to add a useful README outlining all those different project variations ...
Thank you for contributing your idea!
Please take a look at the project variation I linked:
https://github.com/progranism/Open-Source-FPGA-Bitcoin-Miner/tree/master/projects/LX150_makomk_Test
You will find that your idea, for the most part, has already been implemented in there. Specifically, look around this line.
BUT: You did point out something that I think I missed. In the code I linked, you'll see that the pre-calculated T1 value is stored in a separate register, not in tx_state[7] as in your example. Looking at my code, I believe you are correct: tx_state[7] is never used (except for the last round), so it could be removed or replaced with the partial calculation. Good catch, Anoynomous!
Not sure if the compiler catches this optimization automatically or not.
Again, s0_w can be calculated one loop ahead and added to rx_w[31:0]. This way our new_w will be shortened to:
Now that, I hadn't thought of. Another fantastic catch, Anoynomous!
Double check me on this:
tx_pre_w <= s0(rx_w[2]) + rx_w[1]; // Calculate the next round's s0 + the next round's w[0].
tx_new_w <= s1(rx_w[14]) + rx_w[9] + rx_pre_w;
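One way to double check the indices is a quick bit-exact software model. The Python sketch below is only that, a model and not the project's Verilog: it assumes rx_w[n] is shorthand for the n-th 32-bit word of the 16-word window (word 0 being the oldest), and that new_w is shifted into the top of the window every round. The function names are made up for illustration.

```python
MASK = 0xFFFFFFFF  # keep everything to 32 bits, like the hardware registers


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def s0(x):
    """SHA-256 small sigma 0."""
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def s1(x):
    """SHA-256 small sigma 1."""
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def schedule_direct(w, rounds):
    """Reference: compute new_w from the current window in a single step."""
    w = list(w)
    out = []
    for _ in range(rounds):
        new_w = (s1(w[14]) + w[9] + s0(w[1]) + w[0]) & MASK
        out.append(new_w)
        w = w[1:] + [new_w]  # shift the window by one word
    return out


def schedule_pre_w(w, rounds):
    """Same schedule, with s0 + w[0] pre-calculated one round ahead."""
    w = list(w)
    pre_w = (s0(w[1]) + w[0]) & MASK  # seed value for the first round
    out = []
    for _ in range(rounds):
        new_w = (s1(w[14]) + w[9] + pre_w) & MASK
        # After the shift, w[2] becomes w[1] and w[1] becomes w[0],
        # so this is the next round's s0(w[1]) + w[0].
        pre_w = (s0(w[2]) + w[1]) & MASK
        out.append(new_w)
        w = w[1:] + [new_w]
    return out
```

Feeding both functions the same 16 starting words and comparing the outputs for 48 rounds is enough to confirm that indices 2 and 1 are the right ones. Note that the pre_w register has to be seeded with s0(w[1]) + w[0] before the first round.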
If the above solution is applied, the calculation of new_w will be the new critical path...
The calculation of tx_state[0] is the current critical path:
t1 = rx_t1_part + e1_w + ch_w;
tx_state[0] <= t1 + e0_w + maj_w;
Which is actually pretty good, since it's implemented as only two adders.
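To sanity-check that the partial T1 really can replace tx_state[7], here is a small Python model of the round, offered as a sketch rather than a transcription of the actual design. It assumes state index 0 is a and index 7 is h, and that the partial value for the next round is g + K + W (since the next round's h is just the current g). The names run_direct and run_precalc are hypothetical.

```python
MASK = 0xFFFFFFFF  # 32-bit arithmetic, as in the hardware


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def e0(x):
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


def e1(x):
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)


def ch(x, y, z):
    return ((x & y) ^ (~x & z)) & MASK


def maj(x, y, z):
    return (x & y) ^ (x & z) ^ (y & z)


def run_direct(state, ks, ws):
    """Reference rounds: T1 is computed from h, K and W inside the round."""
    state = list(state)
    for k, w in zip(ks, ws):
        a, b, c, d, e, f, g, h = state
        t1 = (h + e1(e) + ch(e, f, g) + k + w) & MASK
        t2 = (e0(a) + maj(a, b, c)) & MASK
        state = [(t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g]
    return state


def run_precalc(state, ks, ws):
    """Same rounds, with h + K + W carried in a separate t1_part register."""
    state = list(state)
    t1_part = (state[7] + ks[0] + ws[0]) & MASK  # seed for the first round
    for i in range(len(ks)):
        a, b, c, d, e, f, g, _unused_h = state  # h is never read in the loop
        t1 = (t1_part + e1(e) + ch(e, f, g)) & MASK
        new_a = (t1 + e0(a) + maj(a, b, c)) & MASK
        if i + 1 < len(ks):
            # Next round's h is this round's g, so build the partial from g.
            t1_part = (g + ks[i + 1] + ws[i + 1]) & MASK
        state = [new_a, a, b, c, (d + t1) & MASK, e, f, g]
    return state
```

Both functions should return the same eight words for any input. In run_precalc the h slot is written every round but only matters in the state returned after the final round, which matches the observation that tx_state[7] is unused except for the last round.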