Does anyone know how many transactions would fit in a full block under Segwit?
It depends on the transaction size. For a typical transaction with 1 input and 2 outputs (you have a coin, pay someone, and get change back), the legacy size, which for a legacy transaction equals its virtual size, is 225 or 226 bytes, so 10^6/226 = 4424 transactions. The same transaction as P2SH-P2WPKH segwit is 166 virtual bytes, so 10^6/166 = 6024 transactions, and for pure P2WPKH the size is 141 vbytes, so 10^6/141 = 7092 transactions.
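To redo the arithmetic, here is a rough sketch in Python. It uses the sizes quoted above and assumes the full 1,000,000 vbytes are available, ignoring the block header and the coinbase transaction, so the results are upper bounds. The function name is just for illustration.

```python
# Block capacity in virtual bytes (4,000,000 weight units / 4).
MAX_BLOCK_VBYTES = 1_000_000

# Virtual sizes quoted above for a 1-input, 2-output payment.
VSIZE = {
    "legacy P2PKH": 226,
    "P2SH-P2WPKH": 166,
    "native P2WPKH": 141,
}

def txs_per_block(vsize: int, capacity: int = MAX_BLOCK_VBYTES) -> int:
    """Upper bound on how many transactions of this size fit in one block."""
    return capacity // vsize

for name, vsize in VSIZE.items():
    print(f"{name}: {txs_per_block(vsize)} transactions")
```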
The blockchain also contains larger transactions (many inputs or multisignature), so the number of transactions in a typical block is lower, but the savings are even better with multiple inputs. Segwit saves the most on multi-input transactions; savings are minimal for multi-output transactions.
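Why inputs matter more than outputs can be seen with a small estimate. The sketch below is only an approximation: the per-input and per-output byte counts are typical values I am assuming (signature length varies by a byte or two), and it assumes fewer than 253 inputs and outputs so that each count fits in one byte. Weight is 4 times the non-witness bytes plus the witness bytes, and virtual size is weight / 4 rounded up.

```python
# Approximate sizes in bytes: (non-witness bytes per input,
#                              witness bytes per input,
#                              bytes per output)
# These are assumed typical values, not exact for every transaction.
SIZES = {
    "legacy P2PKH": (148, 0, 34),
    "P2SH-P2WPKH": (64, 108, 32),
    "native P2WPKH": (41, 108, 31),
}

def estimate_vsize(kind: str, n_in: int, n_out: int) -> int:
    base_in, wit_in, out = SIZES[kind]
    # 10 bytes of overhead: version, input count, output count, locktime.
    base = 10 + n_in * base_in + n_out * out
    # Segwit adds a 2-byte marker/flag plus one witness stack per input.
    witness = 2 + n_in * wit_in if wit_in else 0
    weight = 4 * base + witness
    return (weight + 3) // 4  # round up to whole vbytes

def saving(n_in: int, n_out: int) -> float:
    """Fraction of vbytes saved by native P2WPKH versus legacy P2PKH."""
    legacy = estimate_vsize("legacy P2PKH", n_in, n_out)
    native = estimate_vsize("native P2WPKH", n_in, n_out)
    return 1 - native / legacy

print(f"10 inputs, 2 outputs: {saving(10, 2):.0%} saved")
print(f"1 input, 10 outputs:  {saving(1, 10):.0%} saved")
```

With these assumed sizes the 1-input, 2-output case comes out at 226, 166 and 141 vbytes, matching the figures above.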
Not a SegWit expert here, so I figured I would just ask. Do SegWit txs go to a separate mempool, or do all txs go to the same mempool? Are SegWit transactions a lot cheaper than legacy ones while the mempool is congested, or do they go to the same mempool and just get handled differently?
Segwit transactions go to the same pool. But they are effectively smaller. In raw bytes they are not, but witness data is counted at a discount, which is why a block can exceed 1 MB in raw size while staying within the limit, and measured in virtual size they are smaller. So their fee per virtual byte is higher if they pay the same fee, and they are ahead of equivalent legacy transactions paying the same fee. Or you can lower the fee and have the same probability of inclusion in a block as the legacy transaction while paying less. Or choose any intermediate fee.
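As a quick illustration of that trade-off, here is a sketch using the 226 and 141 vbyte sizes from earlier in the thread. The 22,600 satoshi fee is a made-up example value, and the function names are just for illustration.

```python
LEGACY_VSIZE = 226   # vbytes, 1-input 2-output legacy transaction
SEGWIT_VSIZE = 141   # vbytes, same payment as native P2WPKH

def fee_rate(fee_sats: int, vsize: int) -> float:
    """Fee rate in satoshis per virtual byte, which is what miners sort by."""
    return fee_sats / vsize

def fee_for_rate(rate_sat_per_vb: float, vsize: int) -> float:
    """Total fee needed to reach a given fee rate."""
    return rate_sat_per_vb * vsize

fee = 22_600  # hypothetical fee in satoshis

# Option 1: pay the same fee and get a higher fee rate than the legacy tx.
legacy_rate = fee_rate(fee, LEGACY_VSIZE)
segwit_rate = fee_rate(fee, SEGWIT_VSIZE)
print(f"same fee: legacy {legacy_rate:.1f} sat/vB, segwit {segwit_rate:.1f} sat/vB")

# Option 2: match the legacy fee rate and pay a smaller total fee.
matching_fee = fee_for_rate(legacy_rate, SEGWIT_VSIZE)
print(f"same rate: segwit pays {matching_fee:.0f} sats instead of {fee}")
```

Any fee between those two values gives the segwit transaction both a better fee rate and a lower total fee than the legacy one.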