1) I noticed that the version number can differ from block to block. How could a new version number be introduced by a soft fork?
A soft fork isn't required to use a different version number. Some numbers cannot be used (for example, old versions that earlier soft forks retired are rejected), but for the most part miners can set whatever block version number they want. This freedom is what allows them to use the version field to signal soft fork readiness, and it also (unfortunately) allows them to use ASICBoost.
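To make the signaling idea concrete, here is a minimal sketch in Python. It assumes the BIP9 "version bits" layout (top three bits set to 001, bits 0 through 28 free for signaling) and a minimum block version of 4, which is what mainnet requires once the BIP34, BIP66 and BIP65 soft forks are active. The function names are hypothetical and this is not how any particular node implements the check.

```python
# Assumed constants, following the BIP9 version bits layout.
VERSIONBITS_TOP_MASK = 0xE0000000  # selects the top three bits
VERSIONBITS_TOP_BITS = 0x20000000  # top three bits must equal 001
MIN_BLOCK_VERSION = 4              # assumption: BIP34/66/65 all active


def is_version_allowed(version: int) -> bool:
    """Return True if the (signed 32-bit) version is not a retired one."""
    return version >= MIN_BLOCK_VERSION


def signals_bit(version: int, bit: int) -> bool:
    """Return True if the version signals readiness on the given bit."""
    if not 0 <= bit <= 28:
        raise ValueError("BIP9 only defines signaling bits 0..28")
    has_top_bits = (version & VERSIONBITS_TOP_MASK) == VERSIONBITS_TOP_BITS
    return has_top_bits and ((version >> bit) & 1) == 1


# A miner signaling on bit 1 while leaving the other bits unset.
example_version = 0x20000002
print(is_version_allowed(example_version))  # valid under the assumed minimum
print(signals_bit(example_version, 1))      # bit 1 is set
print(signals_bit(example_version, 0))      # bit 0 is not set
```

The point of the sketch is that the only hard constraint is the minimum version check; everything else in the field is left to the miner, which is why the same bits can carry signaling or be varied for other purposes.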
Does this mean that the version number is not actually part of the consensus rules, since miners can set whatever version number they want?