I'm definitely using all 32, same with my 3950X. As you suggest, though, this isn't optimal for all CPUs: my 9900K performs better using 8 threads (as opposed to 16), preferably with affinity set to one thread per physical core.
It seems to be a Ryzen thing where performance is better with all threads used.
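For anyone who wants to try the one-thread-per-physical-core setup, here is a rough sketch of how the affinity list could be built. It assumes Linux (sysfs topology files and `os.sched_setaffinity`); the helper names `one_cpu_per_core`, `read_linux_topology` and `pin_current_process` are made up for illustration and are not part of any miner.

```python
import os


def one_cpu_per_core(topology):
    """Pick the lowest-numbered logical CPU of each physical core.

    topology maps logical CPU number -> (package_id, core_id).
    Hypothetical helper, for illustration only.
    """
    first_cpu = {}
    for cpu in sorted(topology):
        # setdefault keeps the first (lowest) logical CPU seen per core
        first_cpu.setdefault(topology[cpu], cpu)
    return sorted(first_cpu.values())


def read_linux_topology():
    """Read (package_id, core_id) for each allowed CPU from sysfs (Linux only)."""
    base = "/sys/devices/system/cpu"
    topology = {}
    for cpu in sorted(os.sched_getaffinity(0)):
        with open(f"{base}/cpu{cpu}/topology/physical_package_id") as f:
            package_id = int(f.read())
        with open(f"{base}/cpu{cpu}/topology/core_id") as f:
            core_id = int(f.read())
        topology[cpu] = (package_id, core_id)
    return topology


def pin_current_process():
    """Restrict the current process to one logical CPU per physical core."""
    cpus = one_cpu_per_core(read_linux_topology())
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return cpus
```

Calling `pin_current_process()` returns the chosen CPU list; on a 9900K with hyperthreading enabled that should be 8 entries, and the thread count would then be set to match. Which logical CPUs share a core varies by system, so it is worth checking the result rather than assuming even or odd numbering.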
Yeah, I forgot Ryzen doubled the cache with Zen 2, so a 3950X can definitely use all threads.
My only Ryzen is a 1700.
Regarding the gain from fine-tuning the DRAM timing, I was thinking more of DRAM overclocking, which provides only a marginal improvement in the best of cases. I think of DRAM timing as more of a penalty when it's wrong.
But the difference between wrong and right can be significant.