But true, there's definitely no beating a hand-tuned optimized C/assembly implementation of anything!
I should add, this isn't just a hand-tuned, optimized implementation of general ECDSA signature verification: it's hand-tuned and optimized for the
specific elliptic curve that Bitcoin uses (secp256k1). The curve is actually one of the simpler standardized ones (it comes from SECG's SEC 2 spec rather than from NIST), and sipa is an optimization ninja
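
To give a feel for why the curve counts as "simple": secp256k1 is y^2 = x^3 + 7 over a prime field, with a = 0 and a prime just below 2^256. Here's a rough Python sketch that only checks the standard generator point sits on the curve. The names (`P`, `GX`, `GY`, `is_on_curve`) are mine, made up for illustration, and this is nothing like the optimized C code:

```python
# secp256k1 domain parameters (field prime, curve constant b, generator G).
# Curve equation: y^2 = x^3 + 7 (mod P); the 'a' coefficient is 0.
P = 2**256 - 2**32 - 977
B = 7
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8


def is_on_curve(x: int, y: int) -> bool:
    """Return True if (x, y) satisfies y^2 = x^3 + 7 mod P."""
    return (y * y - (x * x * x + B)) % P == 0


if __name__ == "__main__":
    print(is_on_curve(GX, GY))
```

With a = 0 the point formulas lose a term, and the shape of the prime makes modular reduction cheap, which is presumably part of what a curve-specific implementation gets to exploit.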