Maybe I'm not stating my question correctly.
This is an entirely theoretical question, so imagine there are no blocks, but we still have hashing power on the network (or stake, for a PoS chain) in order to reach some kind of consensus.
Under these broad assumptions, is the optimal confirmation time (fastest possible time to form a robust consensus) simply the time it takes for >50% of the hashing power/stake to agree on any given transaction, or is it more complicated than that?
I think you're confusing the term 'verification' with 'confirmation.'
Hashing power is used to find a block: a miner assembles transactions into a candidate block and iterates through a number (the nonce) until the block's hash comes out below a target value (0x000000000000blahblahblah). Anyone can then quickly verify that the proof-of-work is valid (yay math!), so other miners accept the block and move on to the next one.
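To make the find-then-verify asymmetry concrete, here is a minimal sketch in Python. It is a toy, not how any real client does it: it uses a single SHA-256 over arbitrary bytes rather than a real block header, and the names `mine`, `verify`, and `TARGET` are made up for illustration.

```python
import hashlib
from typing import Optional

# Hypothetical, very easy target: the hash (read as a 256-bit integer)
# must be below 2**244, i.e. roughly 1 in 4096 hashes qualifies.
TARGET = 2 ** 244


def block_hash(block_data: bytes, nonce: int) -> int:
    """Hash the block data together with the nonce and return it as an integer."""
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(block_data: bytes, target: int, max_nonce: int = 2 ** 32) -> Optional[int]:
    """Try nonces one by one until the hash falls below the target (slow part)."""
    for nonce in range(max_nonce):
        if block_hash(block_data, nonce) < target:
            return nonce
    return None  # no valid nonce found in the searched range


def verify(block_data: bytes, nonce: int, target: int) -> bool:
    """Check a claimed proof-of-work with a single hash (fast part)."""
    return block_hash(block_data, nonce) < target


nonce = mine(b"example transactions", TARGET)
print(nonce, verify(b"example transactions", nonce, TARGET))
```

Finding the nonce takes many hash attempts on average, while checking it takes exactly one, which is why every other node can cheaply confirm the work.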
We can't "imagine there are no blocks" and still have a basis for "consensus". From block 0 (the genesis block) to the current block, we believe all the information to be true because we can openly (and quickly) verify it - that's your consensus.
What I think you may be asking is this: given the time it takes for one miner to find a block and broadcast it to the network, plus the time it takes for all nodes to receive that broadcast and verify it so that they can move on to the next block, would the optimum be to adjust the target value so that the block time is >50% of this broadcast/verify time?
Let's say it takes 30 seconds for all miners to receive the broadcast and verify the proof-of-work. Should we adjust the target value so that blocks are found more than 15 seconds apart on average? Is this what you're asking?
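For a rough feel of what that trade-off looks like, here is a back-of-the-envelope sketch. It assumes block discovery is a Poisson process (a common simplification, not something stated above), and the function name `overlap_probability` is made up for illustration. It estimates the chance that some other block is found while the previous one is still propagating, using the 30-second figure from the example.

```python
import math


def overlap_probability(propagation_delay_s: float, block_interval_s: float) -> float:
    """Chance that at least one competing block is found during propagation,
    assuming block arrivals are Poisson with the given average interval."""
    return 1.0 - math.exp(-propagation_delay_s / block_interval_s)


# 30 s to broadcast and verify, compared across a few average block intervals.
for interval in (15, 30, 600):
    p = overlap_probability(30, interval)
    print(f"interval {interval:>3} s -> overlap chance {p:.1%}")
```

Under this simplified model, the shorter the block interval is relative to the broadcast/verify time, the more often miners end up working on competing blocks, so the interval cannot be pushed arbitrarily close to the propagation delay without cost.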