I see you've started on the modification. Just to let you know in advance, I don't use #include "p2pkh_decoder.h" or p2pkh_decoder at all. Using them costs around 10-15 Mkeys/s in performance. Comparing the HASH160 directly is much faster than decoding to a P2PKH address and then comparing, and a comparison based on the decoded address itself is slower still.
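To illustrate the point about comparing the HASH160 directly, here is a minimal sketch. It assumes the target's 20-byte HASH160 has already been obtained once, outside the hot loop, so every candidate is checked as raw bytes and no address conversion happens per key. The names Hash160, hash160_matches and find_match are hypothetical and are not taken from the script under discussion.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// A HASH160 is RIPEMD160(SHA256(pubkey)): 20 raw bytes.
using Hash160 = std::array<std::uint8_t, 20>;

// Compare a candidate hash against the target as raw bytes.
// No address encoding or decoding is done here.
inline bool hash160_matches(const Hash160& candidate, const Hash160& target) {
    return std::memcmp(candidate.data(), target.data(), candidate.size()) == 0;
}

// Scan a batch of already-computed hashes (hypothetical helper).
// Returns the index of the first match, or -1 if there is none.
inline std::ptrdiff_t find_match(const Hash160* batch, std::size_t count,
                                 const Hash160& target) {
    for (std::size_t i = 0; i < count; ++i) {
        if (hash160_matches(batch[i], target)) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}
```

The per-key cost is a single 20-byte memory comparison, which is why skipping the address step in the loop can be expected to help; the exact gain will depend on the rest of the pipeline.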
So, the fastest way would be to compare public addresses directly without any hashing?
Yes. With an otherwise almost identical script, it compares only the first or last 4 bytes of the public key, using the same approach as here: splitting the work into batches of 512 keys, SIMD, etc. That version runs at over 300 million keys per second on my processor.
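A rough sketch of what a 4-byte comparison over a batch could look like follows. This is not the actual script: it is plain scalar C++ with no hand-written SIMD, it assumes 33-byte compressed public keys stored back to back, and filter_by_tail, full_match and the constants are hypothetical names. It checks the last 4 bytes; checking the first 4 works the same way with a different offset.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kPubKeyLen = 33;   // compressed public key (assumption)
constexpr std::size_t kBatchSize = 512;  // batch size mentioned in the post

// Load 4 bytes as one integer; memcpy avoids unaligned-access problems.
inline std::uint32_t load_u32(const std::uint8_t* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Return the indices of keys whose last 4 bytes equal the target's last 4.
// `keys` holds `count` public keys of kPubKeyLen bytes each, back to back.
std::vector<std::size_t> filter_by_tail(const std::uint8_t* keys,
                                        std::size_t count,
                                        const std::uint8_t* target) {
    const std::uint32_t want = load_u32(target + kPubKeyLen - 4);
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint8_t* tail = keys + i * kPubKeyLen + (kPubKeyLen - 4);
        if (load_u32(tail) == want) {
            hits.push_back(i);
        }
    }
    return hits;
}

// Confirm a candidate with a full comparison of all bytes.
inline bool full_match(const std::uint8_t* key, const std::uint8_t* target) {
    return std::memcmp(key, target, kPubKeyLen) == 0;
}
```

A 4-byte match is only a candidate, since unrelated keys can share the same 4 bytes, so each hit should still be confirmed with full_match before it is reported.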