Can you please explain the advantage of not including transactions in a block?
Sure. Visa currently processes 1.2 million transactions every 10 minutes. Assume we approached 10% of this volume, i.e. 120,000 transactions per block. At an assumed average transaction size of 500 bytes, a block would have to be 60 MB. It takes a lot of computational power to verify every transaction and a lot of network resources to propagate a block that size. So a few miners start to ignore some of the transactions. This increases the load on the following blocks, creating even more incentive to exclude transactions, and so on. We start to deal with a Prisoner's Dilemma / Tragedy of the Commons sort of problem here, with the Nash equilibrium inclining miners towards NOT including transactions.
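For anyone who wants to check the 60 MB figure, here is a quick back-of-envelope sketch in Python. The numbers are just the assumptions stated above (1.2 million Visa transactions per 10 minutes, a 10% share, 500 bytes per transaction); the function and constant names are made up for illustration.

```python
# Back-of-envelope block size estimate. All inputs are assumptions
# taken from the post above, not measured values.
VISA_TX_PER_10_MIN = 1_200_000  # Visa volume per 10-minute window
SHARE_PERCENT = 10              # share of that volume assumed on-chain
AVG_TX_SIZE_BYTES = 500         # assumed average transaction size


def block_size_bytes(tx_per_window: int, share_percent: int, tx_size: int) -> int:
    """Return the size in bytes of a block holding the given share of transactions."""
    tx_in_block = tx_per_window * share_percent // 100
    return tx_in_block * tx_size


size = block_size_bytes(VISA_TX_PER_10_MIN, SHARE_PERCENT, AVG_TX_SIZE_BYTES)
print(f"{size / 1_000_000:.0f} MB per 10-minute block")
```

Changing `SHARE_PERCENT` or `AVG_TX_SIZE_BYTES` shows how quickly the block size grows under different assumptions.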