It appears that the multiplication has the secp256k1 characteristic modulus hardcoded into the algorithm, making it suitable only for public key multiplication.
I was thinking about making an adapted version of the multiplication algorithm that uses the curve order (subtracted from 2^256) in its place, so that private key multiplication is covered as well.
I haven't explored the 10x26 legs implementation too deeply, but I'm assuming it's a similar case.
Since you're here though, let me ask: Was the multiplication assembly (and possibly the C version of it in another file using int128) only intended for public keys?
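For reference, the idea of reducing with the curve order subtracted from 2^256 can be sketched in plain Python. This is only an illustration of the arithmetic, not the library's code; the names N, C and scalar_mul are made up for the example.

```python
# secp256k1 group order n (the modulus for private keys / scalars).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
# Complement of the order: 2^256 is congruent to C modulo N.
C = (1 << 256) - N
MASK = (1 << 256) - 1

def scalar_mul(a, b):
    """Multiply two scalars mod N by folding the high half with C."""
    t = a * b
    # Each pass replaces hi * 2^256 with hi * C, which is congruent mod N.
    while t >> 256:
        t = (t & MASK) + (t >> 256) * C
    # t is now below 2^256, which is less than 2*N, so one subtraction is enough.
    if t >= N:
        t -= N
    return t

a = N - 1
b = N - 1
print(hex(scalar_mul(a, b)))
```

The folding step has the same shape as the field reduction; only the constant changes. C is about 129 bits, much wider than the field constant, so a fixed-width implementation needs more care with carries than this big-integer sketch shows.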
You have to distinguish between:
1) field operations (mod p, space of coordinates x and y):
https://github.com/bitcoin-core/secp256k1/tree/master/src/ (all files named field*)
and
2) scalar operations (mod n, space of private keys):
https://github.com/bitcoin-core/secp256k1/tree/master/src/ (all files named scalar*)
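To make the distinction concrete, here is a small Python sketch that multiplies the same two integers under each modulus. The helper names mul_mod_p and mul_mod_n are hypothetical and exist only for this example.

```python
# Field prime p: coordinates x and y live modulo this value.
P = 2**256 - 2**32 - 977
# Group order n: private keys (scalars) live modulo this value.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def mul_mod_p(a, b):
    """Field multiplication, the job of the field* files."""
    return (a * b) % P

def mul_mod_n(a, b):
    """Scalar multiplication, the job of the scalar* files."""
    return (a * b) % N

a = 2**255
b = 3
print(hex(mul_mod_p(a, b)))
print(hex(mul_mod_n(a, b)))
```

The two printed values differ, which is why a routine with p baked in cannot be reused for private keys as-is.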
...
Alright.
There's one other thing to address: in the secp256k1_fe_mul (or something like that) function, all but the last leg are multiplied by the constant R. This gives a result different from the one I got when I calculated an example in Python. So inside the fe_mul function, I need to modify it so that the values in the result (stack) are not multiplied by R, and that multiplication goes to a temporary instead.
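A possible explanation for the mismatch, sketched in Python under the assumption that this is the 5x52 layout and that R is 0x1000003D10 there: with five 52-bit legs, the sixth leg has weight 2^260, and 2^260 mod p is exactly that constant. Multiplying the upper legs by R would then be the modular reduction itself, so the output should agree with a Python calculation only after taking both mod p. The helpers to_limbs, from_limbs and fold_once are made up for this example and do not mirror the actual interleaved code.

```python
P = 2**256 - 2**32 - 977
R = 0x1000003D10          # assumed value of R for the 5x52 layout
assert R == (1 << 260) % P

def to_limbs(x, count, bits=52):
    """Split x into `count` little-endian limbs of `bits` bits each."""
    mask = (1 << bits) - 1
    return [(x >> (bits * i)) & mask for i in range(count)]

def from_limbs(limbs, bits=52):
    """Recombine little-endian limbs into one integer."""
    return sum(l << (bits * i) for i, l in enumerate(limbs))

def fold_once(product):
    """Replace the weight 2^260 of the upper five limbs by R."""
    limbs = to_limbs(product, 10)
    low = from_limbs(limbs[:5])
    high = from_limbs(limbs[5:])
    return low + R * high

a = P - 12345
b = P - 67890
full = a * b               # plain integer product
folded = fold_once(full)   # partially reduced value

print(folded == full)          # False: the integers differ
print(folded % P == full % P)  # True: they are the same field element
```

If the comparison in Python was against the unreduced product, or against a fully normalized value, a difference is expected even when the routine is correct.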