Also, I don't see the point of saving keys that merely have a zero-bit prefix. It saves some kernel instructions to simply skip that check and use the hash target itself as the PoW evidence.
I don't see how checking against 0 is any better or worse than checking against the target.
Which kernel instructions do you save by checking against the target? You would still have to do the CMP. Sure, you can share the check on the first 32 bits, but then you need to check a subset of the next 32 bits, and you're introducing complexity, branching, etc.
If anything, checking against zero frees the compiler from the dependency on the target registers for this instruction.
Because you're doing two checks (target and zero). There's also no need to check more than a single register in the kernel itself or complicate things.
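To make the two positions concrete, here is a rough sketch in plain C of the checks being discussed. It is not taken from any actual miner: it assumes a 256-bit hash stored as eight 32-bit words with the most significant word first, and the names `zero_prefix_ok`, `top_word_ok` and `meets_target` are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 256-bit hash and target, eight 32-bit words, most significant first. */
#define HASH_WORDS 8

/* Zero-prefix filter: one compare against an immediate, no target register involved. */
static bool zero_prefix_ok(const uint32_t hash[HASH_WORDS])
{
    return hash[0] == 0;
}

/* Single-register target filter: one compare against the top word of the target. */
static bool top_word_ok(const uint32_t hash[HASH_WORDS],
                        const uint32_t target[HASH_WORDS])
{
    return hash[0] <= target[0];
}

/* Full check: word-by-word comparison, true when hash <= target. */
static bool meets_target(const uint32_t hash[HASH_WORDS],
                         const uint32_t target[HASH_WORDS])
{
    for (int i = 0; i < HASH_WORDS; i++) {
        if (hash[i] < target[i]) return true;   /* strictly below the target */
        if (hash[i] > target[i]) return false;  /* strictly above the target */
    }
    return true;  /* all words equal */
}
```

Both single-word filters cost one compare each. Either way, a candidate that passes still needs the full comparison (in the kernel or on the host) before it counts as valid, and running both filters back to back is where the "two checks" come from.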