Using double-and-add one multiplication with n requires O(log(n)) additions. I don't think there are more efficient methods.
Is that log2? If so, then a multiply "costs" 128 adds?
Adding another bit to the r-values to determine the points on the curve would be a very elegant solution, something to consider if it provides a significant improvement.
Time is proportional to the number of unique public keys. For bitcoin, all the transactions would have different public keys, so the speed up is at most 2X.