I think the current design already incentivizes smaller blocks: smaller blocks are broadcast much faster and are less likely to be orphaned. If you consider that there are 25 coins to compete for, you would want to broadcast your block as fast as possible once you find it.
This is correct, but as far as I know the magnitude of this effect is very small, and it is not enough to keep block size in check.
I agree. Also, with IBLT or other techniques that conserve block propagation bandwidth (some of which already seem to be used by pool interconnects), this effect can (and will) be nullified.
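
To get a rough feel for the magnitude being discussed, here is a minimal back-of-the-envelope sketch. It assumes block discovery is a Poisson process with a 600-second average interval, and that propagation delay grows linearly with block size. The helper names and the `seconds_per_byte` parameter are hypothetical and only for illustration; no measured network figures are implied.

```python
import math

BLOCK_INTERVAL_S = 600.0  # target average time between blocks
BLOCK_REWARD_BTC = 25.0   # the 25 coins mentioned above


def orphan_probability(propagation_delay_s: float) -> float:
    """Probability that a competing block is found while ours is still
    propagating, assuming Poisson block arrivals (a simplifying assumption)."""
    return 1.0 - math.exp(-propagation_delay_s / BLOCK_INTERVAL_S)


def expected_orphan_cost(block_size_bytes: int, seconds_per_byte: float) -> float:
    """Expected reward lost to orphaning, in BTC.

    seconds_per_byte is a hypothetical parameter: the extra propagation
    delay each additional byte adds to the block.
    """
    delay_s = block_size_bytes * seconds_per_byte
    return BLOCK_REWARD_BTC * orphan_probability(delay_s)


if __name__ == "__main__":
    # Compare a few block sizes under one assumed propagation rate.
    for size in (100_000, 500_000, 1_000_000):
        cost = expected_orphan_cost(size, seconds_per_byte=1e-5)
        print(f"{size:>9} bytes -> expected loss {cost:.4f} BTC")
```

Under this model, the penalty for a larger block depends entirely on how much extra delay each byte adds. If a propagation scheme makes the delay nearly independent of block size, `seconds_per_byte` approaches zero and the size-dependent cost disappears, which is the point made in the last reply.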