This is not 1 ExaKey/s; it is more like 1 Exa compares per second.
Compare benchmarks and stats are no big deal, to be honest. Pretty much all hardware has dedicated circuitry for numeric comparisons, so of course the number will be much higher if you just measure how many times it runs CMP ... JNZ. Key calculation, on the other hand, cannot be hardware accelerated on general-purpose hardware.
Which leads me to ask: Alberto, what loop are you measuring as a key?
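To make the distinction concrete, here is a rough sketch of the two kinds of loop. It is not taken from any particular tool: the function names are made up for illustration, and I am assuming "key" means a secp256k1 public key derived by repeatedly adding the generator G. Pure Python says nothing about absolute speed, and optimized tools do far better, but the gap between the two rates shows why a compare count and a key count are different things.

```python
import time

# secp256k1 field prime and generator (standard public curve constants).
P = 2**256 - 2**32 - 977
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8


def on_curve(pt):
    """Check that a point satisfies y^2 = x^3 + 7 (mod P)."""
    x, y = pt
    return (y * y - x * x * x - 7) % P == 0


def point_add(p1, p2):
    """Add two affine points (sketch only: no point-at-infinity handling)."""
    x1, y1 = p1
    x2, y2 = p2
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P   # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return x3, y3


def compare_only(n, target):
    """Hypothetical 'compare' loop: the body is one compare and one branch."""
    hits = 0
    for candidate in range(n):
        if candidate == target:
            hits += 1
    return hits


def derive_and_compare(n, target_x):
    """Hypothetical 'key' loop: derive G, 2G, 3G, ... and compare each x."""
    g = (GX, GY)
    point = g
    hits = 0
    for _ in range(n):
        if point[0] == target_x:
            hits += 1
        point = point_add(point, g)  # the expensive part: a modular inverse
    return hits, point


def rate(fn, n, *args):
    """Iterations per second for fn(n, *args)."""
    start = time.perf_counter()
    fn(n, *args)
    return n / (time.perf_counter() - start)


if __name__ == "__main__":
    print(f"compares/s: {rate(compare_only, 200_000, -1):,.0f}")
    print(f"keys/s:     {rate(derive_and_compare, 2_000, 0):,.0f}")
```

If a benchmark counts iterations of something like the first loop, its figure is compares per second; only iterations of something like the second loop are keys per second.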