However, there is still a common confusion about SegWit - since blocks are the same and still store all the info necessary to validate transactions, then why does it help to make blocks smaller/fit more transactions? Obviously it's related to the witness data in TX which seems to give ability to reuse some information somehow but as the thread author mentioned it's hard to see how does it do that.
The principal benefit of that particular aspect of Segwit is that it avoids a hardfork. Hardforks are anathema to a currency which holds huge amounts of real-world value, although there is Bitcoin hardfork research ongoing to plan how to do one with absolutely no mistakes if/when it becomes necessary in the relatively far future. Increasing the blocksize limit would have required a hardfork, where all nodes must be upgraded by a deadline, and any which didn't upgrade would suddenly break, badly. But with Segwit, old nodes will continue to see blocks with a 1,000,000 byte block size limit, whereas new nodes will see blocks with a 4,000,000 unit block weight limit. This works because the witness data (the signatures) is carried in a separate part of the block, which is stripped out before the block is relayed to old nodes and so never counts toward their limit. Old nodes can continue to send transactions their old way; they simply will not know how to validate transactions sent the new way. It's ingenious!
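To make the two views concrete, here is a minimal sketch of the size checks, assuming the BIP141 rule that block weight is three times the base size (the block without witness data) plus the total size (the block with witness data). The function names are my own for illustration and are not taken from any Bitcoin implementation.

```python
# The legacy size limit and the Segwit (BIP141) weight limit.
MAX_BASE_SIZE = 1_000_000      # bytes; the only limit old nodes know about
MAX_BLOCK_WEIGHT = 4_000_000   # weight units; the limit new nodes enforce


def block_weight(base_size: int, witness_size: int) -> int:
    """Weight = 3 * base size + total size, where total = base + witness."""
    total_size = base_size + witness_size
    return 3 * base_size + total_size


def old_node_accepts(base_size: int) -> bool:
    # Old nodes receive the block with witness data stripped,
    # so the base size is all they can measure.
    return base_size <= MAX_BASE_SIZE


def new_node_accepts(base_size: int, witness_size: int) -> bool:
    return block_weight(base_size, witness_size) <= MAX_BLOCK_WEIGHT


if __name__ == "__main__":
    # Hypothetical block: 800,000 bytes of base data plus 800,000 bytes of witness.
    base, witness = 800_000, 800_000
    print("total bytes on disk:", base + witness)
    print("old node accepts:", old_node_accepts(base))
    print("new node accepts:", new_node_accepts(base, witness))
```

Since the weight is always at least four times the base size, any block that passes the new check also passes the old one, which is why old nodes never see a block they consider oversized.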
Note that Segwit does not make blocks smaller at all. It raises the block capacity, up toward a theoretical limit of 4,000,000 bytes. In practice, best estimates are that with full Segwit adoption, we will get blocks a bit more than 2MB in actual size. The aim was an approximate doubling of block capacity, which so many people had been requesting. Research indicated that blocks of that actual size would be safe for the network and for nodes, but that much larger sizes would not be.
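As a rough illustration of where the range between 1MB and 4MB comes from: if a fraction w of a block's bytes is witness data, then a block of total size T has weight T * (4 - 3w), so the largest block that fits has T = 4,000,000 / (4 - 3w) bytes. The sketch below just evaluates that formula; the witness fractions in the loop are arbitrary illustrative values, not measurements of real blocks.

```python
MAX_BLOCK_WEIGHT = 4_000_000  # weight units, per BIP141


def max_total_size(witness_fraction: float) -> float:
    """Largest total block size in bytes when the given fraction of the
    block's bytes is witness data (each witness byte counts as 1 weight
    unit, every other byte as 4)."""
    if not 0.0 <= witness_fraction <= 1.0:
        raise ValueError("witness_fraction must be between 0 and 1")
    return MAX_BLOCK_WEIGHT / (4.0 - 3.0 * witness_fraction)


if __name__ == "__main__":
    # Illustrative fractions only; real blocks depend on the transaction mix.
    for w in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"witness fraction {w:.2f}: up to {max_total_size(w):,.0f} bytes")
```

With no witness data the cap is the old 1,000,000 bytes, and the 4,000,000 byte figure is only reached in the unrealistic case of a block that is entirely witness data, which is why it is a theoretical limit.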
There are other important fixes built into Segwit, most importantly a fix for tx malleability; but the foregoing summarizes the reasoning behind the part of Segwit which magically raises block capacity without instantly breaking anybody's use of Bitcoin.