I think I'll mine some SPR on another system while testing. The multiply could use some work, and so could SHA256, since this algo seems to lean on it heavily.
Can the speed double?

I'm not gonna know that until I try some stuff and see how much time is spent where.
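
Roughly what I mean by seeing where the time goes: time the two suspects separately and compare their share. This is only a sketch in Python, not the miner's code. The names `time_it` and `profile`, the 2048-bit operands, the 80-byte buffer, and the iteration count are all made up for illustration.

```python
import hashlib
import time


def time_it(fn, iterations):
    """Return total seconds spent calling fn() `iterations` times."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    return time.perf_counter() - start


def profile(iterations=10000, bits=2048):
    # Hypothetical operands: two odd integers of `bits` bits each.
    a = (1 << bits) - 1
    b = (1 << bits) - 3
    # Arbitrary 80-byte buffer standing in for whatever gets hashed.
    data = b"\x00" * 80

    timings = {
        "multiply": time_it(lambda: a * b, iterations),
        "sha256": time_it(lambda: hashlib.sha256(data).digest(), iterations),
    }
    total = sum(timings.values())
    # Fraction of the measured time spent in each operation.
    shares = {name: t / total for name, t in timings.items()}
    return timings, shares


if __name__ == "__main__":
    timings, shares = profile()
    for name in timings:
        print(f"{name}: {timings[name]:.4f}s ({shares[name]:.0%})")
```

Whichever one eats the bigger share is where I'd start. The real numbers have to come from the actual miner, since Python's big-int multiply and hashlib say nothing about how the native code behaves.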