The ByteReverse macro should probably be skipped before doing SHA-256 transforms.
Before and after? There are several ByteReverse calls that probably need removing for the nonce and the timestamp as well.
In fact, you may be able to do away completely with the temp block header, which is mostly there just so it can be ByteReversed.
I think all of them can go away, since SHA-256 expects its bytestream to be big-endian. I think the fastest way to find out is to run the code through a debugger on a BE and an LE machine at the same time and compare the results at every step.
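
For anyone who wants to test the idea without two debuggers side by side, here is a rough sketch in C of what an endian-neutral version could look like. None of these helper names (load_be32, byte_reverse32, host_is_little_endian, native_to_be32) come from the actual source; they are made up for illustration, and byte_reverse32 is only my assumption of what the ByteReverse macro does to a 32-bit word. The point is that a swap applied only on LE hosts, or a byte-by-byte load, should give the same word on both kinds of machine.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 4 bytes as one big-endian 32-bit word.
 * Works byte by byte, so the result does not depend on host byte order. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical stand-in for ByteReverse on a 32-bit word
 * (assumption: the real macro does a plain 4-byte swap). */
static uint32_t byte_reverse32(uint32_t v)
{
    return (v << 24) |
           ((v << 8) & 0x00ff0000u) |
           ((v >> 8) & 0x0000ff00u) |
           (v >> 24);
}

/* Runtime check of host byte order: on a little-endian host the
 * lowest-addressed byte of the value 1 is 1. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Swap only when the host is little-endian; on a big-endian host the
 * natively loaded word is already in the order SHA-256 works with. */
static uint32_t native_to_be32(uint32_t v)
{
    return host_is_little_endian() ? byte_reverse32(v) : v;
}
```

If the guess in this thread is right, replacing the unconditional ByteReverse calls with something like native_to_be32 would make them no-ops on a BE machine, and the step-by-step debugger comparison should then show identical words going into the transform on both machines.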