Why wait for 95% on a soft fork? What happens if segwit miners start mining segwit blocks right now?
Because of the way this soft fork is deployed, the segwit rules only take effect once 95% of the blocks in a retarget period signal support. Until that threshold is reached, the other nodes simply reject segwit blocks as invalid, so a miner who produces them early has all of that mining work wasted. Once another 2 blocks are mined elsewhere on the network, even the miner's own node will switch to the longer chain of non-segwit blocks.
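
To make the two numbers above concrete, here is a rough Python sketch. It is only a toy model, not how any real node is implemented: `locked_in` and `miner_tip` are hypothetical names made up for illustration, the 2016-block window and 1916-block threshold are the BIP9 mainnet values (1916 is 95% of 2016, rounded up), and the fork choice assumes every block has the same difficulty so that "most blocks" stands in for "most work".

```python
WINDOW = 2016      # blocks in one retarget period
THRESHOLD = 1916   # 95% of 2016, rounded up (BIP9 mainnet value)


def locked_in(signals):
    """Return True if enough blocks in one full period signalled support.

    `signals` holds one bool per block in the period.
    """
    if len(signals) != WINDOW:
        raise ValueError("expected exactly one full retarget period")
    return sum(signals) >= THRESHOLD


def miner_tip(own_blocks, rival_blocks):
    """Toy fork choice for the early miner's node.

    Assumes equal difficulty per block. On a tie the node keeps the
    chain it saw first (its own); it switches only when the rival
    chain is strictly longer.
    """
    return "rival" if rival_blocks > own_blocks else "own"


if __name__ == "__main__":
    print(locked_in([True] * 1916 + [False] * 100))
    print(miner_tip(own_blocks=1, rival_blocks=1))
    print(miner_tip(own_blocks=1, rival_blocks=2))
```

With one early segwit block on the miner's side, the first competing block only ties the chain and the second one overtakes it, which is where the "another 2 blocks" above comes from.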