rule that says you must put at least 10,000 transactions in each block.
This is not needed because transaction fees already make it in miners' interest to put as many transactions as possible into a block.
Is there any reason this wouldn't work?
What you're really asking is why there is a limit on block size. There are two reasons. First, the Bitcoin network needs to be able to guarantee that consensus will be reached within 10 minutes. This is harder than it sounds because Bitcoin is a peer-to-peer relay network, which is very slow compared to the centralized, YouTube-style networking we are used to, where huge caches of data sit on a physically nearby server, waiting to be served to anyone in the surrounding region at very low latency and virtually unlimited bandwidth. Relay networks, by comparison, have unpredictable latency, and their bandwidth (network-wide throughput) is not guaranteed.
In principle, the Bitcoin network could probably run a dozen times faster with 90+% reliability. But while 90+% reliability is good enough for serving video, where the worst-case scenario is that a few frames get dropped, it does not work for Bitcoin, because a "dropped block" really represents a network-wide fork. So the 10-minute rule adds a large amount of padding to ensure that the blockchain reaches consensus on every single block.
There are still occasional "orphan blocks", where miners generate two different, valid blocks almost simultaneously. When this happens, one block or the other will generally win out about 10 minutes later, when the next block is generated. The network can tolerate several such ties in a row; however, the probability that a tie persists goes asymptotically to zero with each additional block. We can be confident that this asymptotic behavior holds because the 10-minute interval leaves plenty of time for the network to settle on consensus after each block is added to the blockchain. If blocks were too large, nodes could not process them in time for this guarantee to hold.
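To make the "goes asymptotically to zero" claim concrete, here is a toy model in Python. It assumes (my simplification, not a measured property of the network) that each new block independently leaves the tie unresolved with some fixed probability `p_tie`. The function name and the 1% figure are purely illustrative.

```python
def tie_survival_probability(p_tie: float, extra_blocks: int) -> float:
    """Chance that a fork is still unresolved after `extra_blocks` more blocks.

    Toy assumption: every new block independently prolongs the tie with
    probability `p_tie`, so the survival chance decays geometrically.
    """
    if not 0.0 <= p_tie <= 1.0:
        raise ValueError("p_tie must be a probability between 0 and 1")
    if extra_blocks < 0:
        raise ValueError("extra_blocks must be non-negative")
    return p_tie ** extra_blocks


if __name__ == "__main__":
    # Hypothetical 1% chance per block that the tie continues.
    for k in range(1, 6):
        print(k, tie_survival_probability(0.01, k))
```

Under this assumption, the smaller `p_tie` is, the faster the decay. Anything that slows block propagation, such as very large blocks, would push `p_tie` up and weaken the guarantee.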
If it takes a typical full node 11 minutes to process each block as it is mined (including network transfer time and latency), the network would "fall behind" the miners and would no longer be able to reach consensus. But even if the typical full node can process a block in less than 10 minutes, orphan blocks still risk creating a situation where the network can no longer reach consensus on which proof-of-work chain is the true chain.
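The 11-minute example can be checked with some back-of-the-envelope arithmetic. The sketch below assumes blocks arrive exactly every 10 minutes and that a node spends a fixed time on each one. Both are simplifications, and the function name is just for illustration.

```python
BLOCK_INTERVAL_MIN = 10.0  # target time between blocks


def backlog_after(hours: float, processing_min_per_block: float) -> float:
    """Blocks a node is behind after `hours`, starting from fully synced.

    Simplified average-rate model: one block is mined every 10 minutes and
    the node needs `processing_min_per_block` minutes for each block.
    """
    if hours < 0 or processing_min_per_block <= 0:
        raise ValueError("hours must be >= 0 and processing time must be > 0")
    minutes = hours * 60
    blocks_mined = minutes / BLOCK_INTERVAL_MIN
    blocks_processed = minutes / processing_min_per_block
    # A node cannot process blocks that do not exist yet, so clamp at zero.
    return max(0.0, blocks_mined - blocks_processed)


if __name__ == "__main__":
    for hours in (24, 24 * 7, 24 * 30):
        print(hours, "hours:", round(backlog_after(hours, 11.0), 1), "blocks behind")
```

With 11 minutes per block the backlog grows without bound, while any processing time under 10 minutes keeps it at zero in this model. That is why the limit has to leave real headroom rather than sit right at the edge.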
Second, because Bitcoin is a peer-to-peer network, nodes are free to leave and rejoin at will. The time to rejoin is a linear function of the size of the blockchain (everything since the Genesis block). Right now, that is 140GB, and I think it takes more than 24 hours to sync a full node on a typical desktop PC. The blockchain will continue to grow at about 50-100GB per year under the current block size limit; sync time will grow in direct proportion, but at least we know how fast it will grow. If block size were unrestricted, sync time could cross an "event horizon" where it takes longer than 24 hours to process 144 blocks (24 hours' worth of blocks), meaning no new node could ever join the network!
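The "event horizon" is easy to see as a rate comparison. A syncing node only gains on the chain tip by the amount its processing rate exceeds 144 blocks per day. The following is a rough sketch that assumes every block costs the same to process (not true in practice) and uses made-up numbers.

```python
import math

BLOCKS_PER_DAY = 144  # one block every 10 minutes


def days_to_catch_up(backlog_blocks: float, node_blocks_per_day: float) -> float:
    """Days until a syncing node reaches the chain tip.

    Returns math.inf when the node processes blocks no faster than the
    network produces them, i.e. it is past the "event horizon".
    Assumes a constant per-block processing cost, which is a simplification.
    """
    net_gain_per_day = node_blocks_per_day - BLOCKS_PER_DAY
    if net_gain_per_day <= 0:
        return math.inf
    return backlog_blocks / net_gain_per_day


if __name__ == "__main__":
    backlog = 500_000  # hypothetical number of blocks to download and verify
    for rate in (500_000, 1_000, 150, 144):
        print(rate, "blocks/day ->", days_to_catch_up(backlog, rate), "days")
```

As the node's rate approaches 144 blocks per day the catch-up time blows up, and at or below 144 it never finishes. Larger blocks lower the rate a given machine can sustain, which is exactly how unrestricted block size could lock new nodes out.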