Your explanation for why it's actually 2^256 is quite clear - however to brute force we would still need to go for the full 2^264 route since we cannot know if a phrase would result in a valid checksum, correct?
It's true that you will need to check all the 2^264 combinations to see if they pass the checksum, but take note that you won't need to generate address from all those combinations.
You will need to generate address from 2^256 combinations.
The process of generating addresses from the seed phrase is much more expensive than just checking the checksum.