I tested each core separately on the 12700s and concluded that the first 16 logical cores are P-type (with the two threads of each core alternating) and the last 4 are E-type. However, using more than 4 cores only hurts performance. So far the best setting is 4 P threads, using only the even-numbered ones.
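For anyone who wants to reproduce this, here is a rough sketch of how I'd pick and pin those threads from Python. It assumes the layout I measured (logical CPUs 0-15 are P-type with sibling threads alternating, so even ids give one thread per core) and Linux, since `os.sched_setaffinity` is not available elsewhere. The helper names `even_p_threads` and `pin_current_process` are just made up for illustration.

```python
import os


def even_p_threads(n_threads, p_logical=16):
    """Return the first n_threads even-numbered logical CPU ids in the P-type range.

    p_logical=16 is an assumption based on the layout measured on the 12700.
    """
    candidates = list(range(0, p_logical, 2))
    if n_threads > len(candidates):
        raise ValueError("not enough even-numbered P threads")
    return candidates[:n_threads]


def pin_current_process(cpus):
    """Restrict this process to the given CPUs; return the set actually applied.

    Does nothing (returns an empty set) where sched_setaffinity is unavailable
    or none of the requested CPUs are usable by this process.
    """
    if not hasattr(os, "sched_setaffinity"):
        return set()
    allowed = set(cpus) & os.sched_getaffinity(0)
    if allowed:
        os.sched_setaffinity(0, allowed)
    return allowed


if __name__ == "__main__":
    cpus = even_p_threads(4)
    print("target CPUs:", cpus)
    print("pinned to:", sorted(pin_current_process(cpus)))
```

The same thing can be done from the shell by launching the program under `taskset -c 0,2,4,6`, again assuming the numbering above matches your machine.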
Indeed, I am using the prebuilt binaries. I'll try compiling it myself and see what I get.