Forget about #cores, just set 1 wallet with 40 threads, and then 8 wallets with 40 threads each...
I had the best results with 20 threads per core and #wallets = #cores (counting HT cores as well).
So you're comparing 1 wallet x 40 threads = 40 threads vs. 8 wallets x 40 threads = 320 threads?
If that's right, what happens if you run 1 wallet with 320 threads?
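
To make the comparison concrete, here is a quick sketch of the thread arithmetic for the three setups mentioned. The helper name `total_threads` and the config labels are made up for illustration; this only counts threads and says nothing about which setup actually performs better.

```python
def total_threads(wallets, threads_per_wallet):
    """Total thread count, assuming every wallet runs the same number of threads."""
    return wallets * threads_per_wallet


# The three setups discussed in this thread (labels are illustrative only).
configs = {
    "1 wallet x 40 threads": (1, 40),
    "8 wallets x 40 threads": (8, 40),
    "1 wallet x 320 threads": (1, 320),
}

for label, (wallets, threads) in configs.items():
    print(f"{label}: {total_threads(wallets, threads)} threads total")
```

The second and third setups have the same total, so testing both would show whether the difference comes from the number of wallets or just from the total number of threads.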