I don't know why some blocks have just a few transactions and others have 1000+. It seems weird to me, but it does happen.
Well, the answer to your question has been outlined in several posts in this thread: it's called variance. Miners try to find a block as fast as possible, and how long that takes is random, so a block is sometimes found shortly after the previous one, when only a few transactions are waiting. It makes no sense for a miner to wait until a given number of transactions can be included. Adding transactions only makes the block bigger (it propagates more slowly, so it becomes more likely to be orphaned), and waiting for more transactions to come in has a strongly negative EV.
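To make the variance point concrete, here is a rough Python sketch, not a model of real mining software. It assumes block discovery is a memoryless process with a 600-second average, that transactions arrive at a constant rate, and that each block simply takes everything that arrived since the previous one. The rate of 3 transactions per second and the names `TX_ARRIVAL_RATE` and `simulate_block_tx_counts` are made up for illustration.

```python
import random

MEAN_BLOCK_INTERVAL_S = 600.0  # Bitcoin's 10-minute target interval
TX_ARRIVAL_RATE = 3.0          # hypothetical: transactions per second, assumed constant


def simulate_block_tx_counts(num_blocks, seed=0):
    """Return a simulated transaction count for each of num_blocks blocks."""
    rng = random.Random(seed)
    counts = []
    for _ in range(num_blocks):
        # Assumption: the time to find a block is exponentially distributed.
        interval = rng.expovariate(1.0 / MEAN_BLOCK_INTERVAL_S)
        # Assumption: the block includes every transaction that arrived meanwhile.
        counts.append(int(interval * TX_ARRIVAL_RATE))
    return counts


if __name__ == "__main__":
    counts = simulate_block_tx_counts(1000)
    print("smallest block:", min(counts), "transactions")
    print("largest block: ", max(counts), "transactions")
    print("average block: ", sum(counts) / len(counts), "transactions")
```

Even with a perfectly steady flow of transactions, the simulated blocks range from nearly empty to several thousand transactions, purely because the gaps between blocks vary so much.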
What is an orphaned block? Is that when the chain has several fringes (forks) and the block ends up on a fringe that gets abandoned because another one was solved first?