It was always detecting the correct number of total threads. Never mind on the rest.
Just to put some numbers behind my rant: P cores do ~380 h/s each when run on their own, and E cores ~170. 4 P cores give me ~1250, and anything more gets slower. My initial run of 12 alternating cores (out of habit, of course) got me the ~750-800 I mentioned above. I tried different combos of P and E, only P, even only E. The 4-core test is always the fastest... confirmed on 2 separate 12700s.
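For anyone who wants to sanity-check the scaling, here is a rough sketch of the arithmetic in Python. It only uses the approximate figures quoted above; the function names are made up for illustration, and "ideal" just assumes perfectly linear scaling from the single-core number, which real hardware will not hit.

```python
# Rough scaling check using the approximate figures from this post.
# "Ideal" assumes perfect linear scaling from the single-core rate,
# which is an assumption for comparison, not something measured.

def scaling_efficiency(measured_hs, per_core_hs, cores):
    """Measured hashrate as a fraction of perfect linear scaling."""
    return measured_hs / (per_core_hs * cores)

def per_thread(measured_hs, threads):
    """Average hashrate contributed by each thread in a run."""
    return measured_hs / threads

P_CORE_HS = 380  # ~h/s for one P core running alone

four_p_eff = scaling_efficiency(1250, P_CORE_HS, 4)
four_p_avg = per_thread(1250, 4)

# The 12-thread run landed somewhere between ~750 and ~800 h/s.
twelve_avg_low = per_thread(750, 12)
twelve_avg_high = per_thread(800, 12)

print(f"4 P cores: {four_p_eff:.0%} of ideal, {four_p_avg:.1f} h/s per thread")
print(f"12 threads: {twelve_avg_low:.1f} to {twelve_avg_high:.1f} h/s per thread")
```

So the 4 P core run lands at roughly 82% of ideal, while the 12-thread run averages well under half of what a single E core does alone.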
I still need to run the same test on the 12900k. My current affinity config is probably wrong anyway.