or they could have been running far longer than you have and have historic transactions in their mempool.
Yes. Before running the instance, I let my Bitcoin node run for a while, until some blocks are no longer completely full, and I also check that my mempool has roughly the same number of transactions as others, for example the one shown at https://mempool.observer/
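
For anyone who wants to script that check, here is a rough sketch of how it could be done. It assumes `bitcoin-cli` is on the PATH and can reach the local node, and reads the `size` field of the `getmempoolinfo` RPC. The helper names, the 10% tolerance, and the idea of typing in the reference count by hand (read off a public dashboard) are my own assumptions for illustration, not something the site provides.

```python
import json
import subprocess


def local_mempool_size(cli="bitcoin-cli"):
    """Return the transaction count of the local node's mempool.

    Assumes `cli` is a working bitcoin-cli binary that can reach the node.
    """
    result = subprocess.run(
        [cli, "getmempoolinfo"],
        capture_output=True,
        text=True,
        check=True,
    )
    # getmempoolinfo returns a JSON object; "size" is the number of transactions.
    return json.loads(result.stdout)["size"]


def mempool_is_comparable(local_count, reference_count, tolerance=0.10):
    """True if local_count is within `tolerance` (a fraction) of reference_count.

    The default tolerance of 10% is an arbitrary choice for this sketch.
    """
    if reference_count <= 0:
        return local_count == 0
    return abs(local_count - reference_count) / reference_count <= tolerance
```

Usage would be something like `mempool_is_comparable(local_mempool_size(), 10000)`, where 10000 stands in for whatever count the reference site shows at that moment. Note that this only compares transaction counts; two mempools of similar size can still hold different transactions.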
So the question is not only which pool builds the most optimal blocks, but also which full node implementation, with which settings, builds the most optimal blocks.