Every call into one of the C++ methods I provided makes exactly one native call, so there's one managed-to-native transition per call... unless you count a call into the RNG. Same with the p/invoke code you posted.
Your p/invoke is awesome.

It'll work just as well as my library. Just be careful. I made the C++ library so that I could write "guard" code around the calls, i.e. make sure keys/signatures/messages are the right length (if they aren't, they cause access violations), handle the nonce so I don't screw it up elsewhere, etc. If you don't need all that junk, then cool.
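To make the "guard" idea concrete, here's a rough sketch of the kind of length check I mean. It is not the actual code from my library: `native_verify` is a stand-in for the real native routine, and the 32/64-byte sizes and all the names are assumptions picked for illustration. The point is only that the wrapper rejects wrong-sized buffers before the native code gets a chance to read past the end of them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical sizes; use the real constants from your crypto library.
constexpr std::size_t kPublicKeyBytes = 32;
constexpr std::size_t kSignatureBytes = 64;

// Stand-in for the real native routine (hypothetical). Like the real thing,
// it assumes fixed-size buffers and reads them without checking lengths.
// Here it just XORs every byte it touches and reports 0 when the result is 0.
int native_verify(const std::uint8_t* sig,
                  const std::uint8_t* msg, std::size_t msg_len,
                  const std::uint8_t* pk) {
    assert(sig != nullptr && pk != nullptr);
    std::uint8_t acc = 0;
    for (std::size_t i = 0; i < kSignatureBytes; ++i) acc ^= sig[i];
    for (std::size_t i = 0; i < kPublicKeyBytes; ++i) acc ^= pk[i];
    for (std::size_t i = 0; i < msg_len; ++i) acc ^= msg[i];
    return acc == 0 ? 0 : -1;
}

// Guarded wrapper: validate buffer sizes first, then make the one native call.
bool guarded_verify(const std::vector<std::uint8_t>& sig,
                    const std::vector<std::uint8_t>& msg,
                    const std::vector<std::uint8_t>& pk) {
    if (sig.size() != kSignatureBytes)
        throw std::invalid_argument("signature has the wrong length");
    if (pk.size() != kPublicKeyBytes)
        throw std::invalid_argument("public key has the wrong length");
    return native_verify(sig.data(), msg.data(), msg.size(), pk.data()) == 0;
}
```

With a wrapper like this, a short key turns into a `std::invalid_argument` you can catch instead of an access violation. It's still just one native call per wrapper call, so the guard doesn't add any extra transitions.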

No guards, no nothing, that's correct. But you could write a nice little C# wrapper for it.

Just for the fun of it, would you mind running your .dll through reflector to get a peek at the generated code?