The next thing I’m going to update is SECP256K1 itself; I’ve already removed some unnecessary files from Git.
How fast can this go?

For example, the Ryzen 9 7940HS achieves ~10 MK/s with 1 thread and ~67 MK/s with 16 threads. Performance also depends on the compiler used to build it (GCC, Clang, etc.).
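
As a rough way to read those two figures, the sketch below computes the speedup and per-thread efficiency they imply. The helper name `scaling` is made up for illustration, and it assumes MK/s means millions of keys per second. The 7940HS has 8 physical cores exposing 16 hardware threads, so scaling well below 16x is expected.

```python
def scaling(single_rate, multi_rate, threads):
    """Return (speedup, per-thread efficiency) for a multi-threaded run.

    Hypothetical helper for illustration; rates are in MK/s.
    """
    speedup = multi_rate / single_rate   # how many times faster than 1 thread
    efficiency = speedup / threads       # fraction of ideal linear scaling
    return speedup, efficiency


# Approximate figures quoted above for the Ryzen 9 7940HS.
speedup, efficiency = scaling(10.0, 67.0, 16)
print(f"speedup: {speedup:.1f}x, efficiency: {efficiency:.0%}")
```

With the quoted numbers this works out to roughly a 6.7x speedup, or about 42% of ideal linear scaling across 16 threads.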