Board: Mining (Altcoins)
Topic: Re: RANDOM-X on XEON... CACHE, FREQ'S OR CORES?
Post by wacko on 12/03/2021, 14:03:47 UTC
Quote: "It's something to do with L3 cache on Intel CPUs, it doesn't scale well with more cores. I observed similar problems with 6-core vs 4-core Xeons."
What could it have to do with L3 cache? RandomX needs 256KB of L2 and 2MB of L3 for every thread. None of the Ivy Bridge EP CPUs are limited by L3 as far as I can see: every single E5 v2 CPU has more than 2MB of L3 per core, so they should scale with increased core count just fine. I'm looking at 4-core vs 6-core v2 parts, and the scaling is pretty much linear.
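
The cache argument can be checked with a quick calculation. This is only a rough sketch in Python, assuming the 2MB of L3 per thread mentioned above and one thread per physical core at most. The function name is made up for illustration, and the 25MB / 10-core figures in the example are an assumed spec that should be verified against Intel's spec sheet for the exact model.

```python
def max_randomx_threads(cores: int, l3_mb: float, l3_per_thread_mb: float = 2.0) -> int:
    """Threads that fit in L3 at once, capped at one thread per physical core."""
    fit_in_l3 = int(l3_mb // l3_per_thread_mb)
    return min(cores, fit_in_l3)


# Assumed example values: a 10-core CPU with 25MB of L3 (verify before relying on it).
threads = max_randomx_threads(cores=10, l3_mb=25)
print(f"usable threads: {threads}")
```

If the result equals the core count, L3 is not what limits the thread count on that CPU.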

Here's a quad 2637v2 (3.5GHz base, 3.6GHz turbo):
https://xmrig.com/benchmark?cpu=Intel%28R%29+Xeon%28R%29+CPU+E5-2637+v2+%40+3.50GHz
And here's a hexa 2643v2 (also 3.5GHz base, 3.6GHz turbo):
https://xmrig.com/benchmark?cpu=Intel%28R%29+Xeon%28R%29+CPU+E5-2643+v2+%40+3.50GHz

Each thread hashes at ~550-600 H/s, resulting in ~4.5 kH/s for a pair of quads and 6.8-7 kH/s for a pair of hexa CPUs. Compare these to the 8-core 2667v2: it also does ~550 H/s per thread, or 8.7-9 kH/s for a pair:
https://xmrig.com/benchmark?cpu=Intel%28R%29+Xeon%28R%29+CPU+E5-2667+v2+%40+3.30GHz
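
The pair totals above are just per-thread rate times thread count. A minimal sketch, assuming one mining thread per physical core and perfectly linear scaling (both are my assumptions, and the helper name is hypothetical):

```python
def pair_hashrate(per_thread_hs: float, cores_per_cpu: int, sockets: int = 2) -> float:
    """Total H/s for a multi-socket box, assuming one thread per core and linear scaling."""
    return per_thread_hs * cores_per_cpu * sockets


# Core counts and the 550-600 H/s per-thread range are the ones quoted in the post.
for name, cores in [("2637v2", 4), ("2643v2", 6), ("2667v2", 8)]:
    low = pair_hashrate(550, cores)
    high = pair_hashrate(600, cores)
    print(f"{name}: {low / 1000:.1f}-{high / 1000:.1f} kH/s per pair")
```

The benchmark figures land in or very close to these ranges, which is what linear scaling would predict.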

None of the 10-core Ivy Bridge EP CPUs are clocked as high as 3.6GHz, so no direct comparison can be done here. The fastest one, the 2690v2, is 3.0GHz base / 3.3GHz turbo, and it does 460 H/s per thread, or 9.2 kH/s for a pair:
https://xmrig.com/benchmark?cpu=Intel%28R%29+Xeon%28R%29+CPU+E5-2690+v2+%40+3.00GHz
So it also seems to scale, especially if the turbo wasn't working right on that one (and the memory was at 1066MHz). The slower 10-core 2680v2 does up to 515 H/s per thread, and it's only a 2.8/3.1GHz CPU:
https://xmrig.com/benchmark?cpu=Intel%28R%29+Xeon%28R%29+CPU+E5-2680+v2+%40+2.80GHz

So at least up to 10 cores the scaling looks pretty much linear. It's the 12-core CPUs that aren't any faster for some reason, at least judging by those benchmarks. The 2696v2 is supposed to run at the same 3.1GHz turbo as the 2680v2. If the latter does 10 kH/s for a pair, then why wouldn't a pair of 2696v2 do 12 kH/s? Do they throttle and not reach the 3.1GHz all-core turbo? It's only a 5W difference in TDP on paper between the 2696v2 and the 2680v2, but most likely a bit more in real power draw. Too bad the xmrig benchmarks don't show the actual clocks.
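
One way to compare CPUs at different clocks is to normalize the per-thread rate by frequency. A rough sketch, assuming each CPU actually ran at the turbo clock quoted in this post (which the benchmarks can't confirm) and using 575 H/s as the midpoint of the 550-600 range:

```python
# (model, per-thread H/s from the post, assumed all-core clock in GHz)
samples = [
    ("2637v2/2643v2", 575, 3.6),
    ("2690v2", 460, 3.3),
    ("2680v2", 515, 3.1),
]


def hs_per_ghz(per_thread_hs: float, clock_ghz: float) -> float:
    """Per-thread hashrate normalized by clock speed."""
    return per_thread_hs / clock_ghz


for model, hs, ghz in samples:
    print(f"{model}: {hs_per_ghz(hs, ghz):.0f} H/s per GHz")

# Projection for a pair of 12-core CPUs holding the 2680v2's per-thread rate,
# assuming one thread per core and linear scaling.
projected = 515 * 12 * 2
print(f"projected 12-core pair: {projected / 1000:.1f} kH/s")
```

A noticeably lower H/s per GHz on one model would point to it running below the assumed clock (or to slower memory) rather than to a core-count limit.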