This is expected. As mentioned, sse2_64 only works on 64-bit Linux at the moment, so this high speed is not available under Windows.
For those who want to take on a challenge: to get sse2_64 running on Win64 boxes, the assembly needs to be changed to follow the Windows x86_64 calling convention (ABI) instead of the Linux one.
This is a useful starting point:
https://secure.wikimedia.org/wikipedia/en/wiki/X86_calling_conventions#Table_of_x86_Calling_Conventions.5B1.5D

Since I don't do Windows development, I'm not going to port it. However, I'll look at pull requests on my SSE2 branch.
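
For anyone attempting the port, here is a rough C sketch of the register differences between the two 64-bit ABIs that the assembly would have to account for. It is only a summary of the published calling conventions, not code from sse2_64; the names (`enum abi`, `int_arg_register`, `is_callee_saved`, `WIN64_SHADOW_SPACE`) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper names; none of this comes from the sse2_64 source. */
enum abi { ABI_SYSV, ABI_WIN64 };

/* Win64 callers reserve 32 bytes of "shadow space" on the stack for the callee. */
#define WIN64_SHADOW_SPACE 32

/* Integer/pointer argument registers, in argument order. */
static const char *const sysv_int_args[]  = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
static const char *const win64_int_args[] = { "rcx", "rdx", "r8", "r9" };

/* Register holding the index-th (0-based) integer argument,
 * or NULL if that argument is passed on the stack. */
static const char *int_arg_register(enum abi abi, size_t index)
{
    assert(abi == ABI_SYSV || abi == ABI_WIN64);
    if (abi == ABI_SYSV)
        return index < 6 ? sysv_int_args[index] : NULL;
    return index < 4 ? win64_int_args[index] : NULL;
}

/* Returns 1 if a function must preserve `reg` for its caller (rsp aside). */
static int is_callee_saved(enum abi abi, const char *reg)
{
    /* Preserved under both ABIs. */
    static const char *const common[] = { "rbx", "rbp", "r12", "r13", "r14", "r15" };
    /* Additionally preserved under Win64 only. */
    static const char *const win64_extra[] = {
        "rdi", "rsi",
        "xmm6", "xmm7", "xmm8", "xmm9", "xmm10",
        "xmm11", "xmm12", "xmm13", "xmm14", "xmm15"
    };
    size_t i;

    for (i = 0; i < sizeof common / sizeof common[0]; i++)
        if (strcmp(reg, common[i]) == 0)
            return 1;
    if (abi == ABI_WIN64)
        for (i = 0; i < sizeof win64_extra / sizeof win64_extra[0]; i++)
            if (strcmp(reg, win64_extra[i]) == 0)
                return 1;
    return 0;
}
```

The parts most likely to bite SSE2 code are that the first arguments arrive in different registers, and that on Win64 rdi, rsi and xmm6 through xmm15 have to be saved and restored if the routine touches them.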