It would be good if an implementation of this were available somewhere, so that the actual speedup could be measured: on modern CPU architectures it is not possible to determine the speedup just by counting the number of adds/muls. Unfortunately, I have not been able to find an implementation.
The custom bitcoin curve code gives a much larger speedup without relying on a potentially insecure technique.
Is sipa's custom code used in the current implementation?
Why do you say that the technique is potentially "insecure"? It would only be used to verify signatures on the blockchain, so how can it be "insecure"?
The signature verification can be tested thoroughly against reference implementations (OpenSSL and sipa's), so the likelihood of incorrectly verifying signatures (declaring them valid when they are not, or vice versa) is very low.
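To make the testing idea concrete, here is a minimal sketch of such a differential test. It is not ECDSA and not the OpenSSL or sipa code: as a stand-in it uses toy textbook RSA with tiny parameters, and every name in it (`reference_verify`, `candidate_verify`, `buggy_verify`, `make_cases`, `disagreements`) is hypothetical. The point is only the shape of the harness: feed the same valid, tampered, and malformed inputs to both verifiers and flag any case where they disagree.

```python
import hashlib
import random

# Toy textbook-RSA parameters, for illustration only (hopelessly insecure).
# N = 61 * 53, and E * D = 1 mod 3120.
N, E, D = 3233, 17, 2753


def digest(msg: bytes) -> int:
    """Hash a message down to an integer in [0, N)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N


def sign(msg: bytes) -> int:
    return pow(digest(msg), D, N)


def reference_verify(msg: bytes, sig: int) -> bool:
    """Stand-in for the trusted implementation (built-in pow)."""
    return 0 <= sig < N and pow(sig, E, N) == digest(msg)


def modexp(base: int, exp: int, mod: int) -> int:
    """Hand-written square-and-multiply, standing in for 'optimized' code."""
    result = 1
    base %= mod
    while exp:
        if exp & 1:
            result = result * base % mod
        base = base * base % mod
        exp >>= 1
    return result


def candidate_verify(msg: bytes, sig: int) -> bool:
    """Implementation under test: same checks, custom arithmetic."""
    return 0 <= sig < N and modexp(sig, E, N) == digest(msg)


def buggy_verify(msg: bytes, sig: int) -> bool:
    """Deliberately broken variant: forgets the range check on sig."""
    return modexp(sig, E, N) == digest(msg)


def make_cases(count: int, seed: int = 0):
    """Per message: a valid pair, a tampered signature, a tampered
    message, and an out-of-range encoding of the valid signature."""
    rng = random.Random(seed)
    cases = []
    for _ in range(count):
        msg = bytes(rng.getrandbits(8) for _ in range(16))
        sig = sign(msg)
        cases += [(msg, sig), (msg, (sig + 1) % N), (msg + b"x", sig), (msg, sig + N)]
    return cases


def disagreements(verify_a, verify_b, cases):
    """Return every case on which the two verifiers give different answers."""
    return [c for c in cases if verify_a(*c) != verify_b(*c)]


cases = make_cases(200)
print(len(disagreements(reference_verify, candidate_verify, cases)))  # 0
print(len(disagreements(reference_verify, buggy_verify, cases)))      # 200
```

The second count shows why the malformed inputs matter: the broken variant agrees with the reference on every well-formed signature and only differs on the out-of-range ones, so a test that used only honestly generated signatures would not catch it.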