Energy consumption is massively impacted by the semiconductor process node (feature size), e.g. 110 nm vs. 50 nm vs. 20 nm. Currently, most ASIC rigs use 110+ nm chips, but the semiconductor technology available to OEMs today is in the 2x-nm class. The 1x-nm class isn't far behind now.
I'm an electrical engineer, so I know about the technological constraints you are talking about. But you are wrongly assuming that the efficiency in MH/J affects overall energy consumption. It does not (at least to my understanding of a free market). More efficient hardware will just make the hashrate go up, which in turn will make the difficulty level go up.
As long as additional gear can be run profitably, this will be done.
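A back-of-the-envelope sketch of that argument, for anyone who wants to play with the numbers. It assumes electricity is the only cost and that miners keep adding gear until electricity spending equals mining revenue. The function name, the revenue figure, and the electricity price are all made up for illustration, not taken from any real network.

```python
def equilibrium(revenue_usd_per_s, price_usd_per_joule, efficiency_mh_per_joule):
    """Return (power in watts, hashrate in MH/s) at the break-even point.

    Simplifying assumption: electricity is the only cost, so gear is added
    until price * power == revenue.
    """
    # Break-even power depends only on revenue and electricity price.
    power_watts = revenue_usd_per_s / price_usd_per_joule
    # Efficiency only decides how many hashes that power buys.
    hashrate_mh_per_s = power_watts * efficiency_mh_per_joule
    return power_watts, hashrate_mh_per_s


REVENUE_USD_PER_S = 1.0          # hypothetical network-wide mining revenue
PRICE_USD_PER_JOULE = 0.10 / 3.6e6  # hypothetical 0.10 USD per kWh

for efficiency in (1.0, 10.0, 100.0):  # MH/J, hypothetical chip generations
    power, hashrate = equilibrium(REVENUE_USD_PER_S, PRICE_USD_PER_JOULE, efficiency)
    print(f"{efficiency:6.1f} MH/J -> {power / 1e6:.1f} MW, {hashrate:.3e} MH/s")
```

Under these assumptions the power draw comes out the same for every efficiency value; only the hashrate (and therefore the difficulty) scales with the chip generation.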
Good point.
However:
It then seems that X years down the road, mining will only be done by massive supercomputers. Or that seems to be the implication to me, anyway. I could be wrong, but if the hashrate and difficulty go up like that, we should see ASICs pushed out as unprofitable the same way that CPU mining has been pushed out, and the way that GPUs older than are not profitable now.
The question then seems to be about the initial cost of hardware, i.e. are we assuming that the limiting factor will be electricity, when the limiting factor in the future may be the cost of hardware?
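One rough way to frame that question is to compare the hourly electricity bill of a rig with its purchase price spread over its useful life. The sketch below does that; the function name and every number in it (wattage, electricity price, hardware prices, lifetime) are hypothetical placeholders, not data about any real miner.

```python
def electricity_share(power_w, price_usd_per_kwh, hw_cost_usd, lifetime_hours):
    """Fraction of a rig's total hourly cost that goes to electricity.

    Assumes straight-line amortization of the hardware over its lifetime
    and ignores cooling, hosting, and resale value.
    """
    electricity_per_hour = power_w / 1000.0 * price_usd_per_kwh
    hardware_per_hour = hw_cost_usd / lifetime_hours
    return electricity_per_hour / (electricity_per_hour + hardware_per_hour)


TWO_YEARS_HOURS = 2 * 365 * 24  # hypothetical useful life before obsolescence

# Two made-up rigs with the same power draw but different purchase prices.
cheap_rig = electricity_share(500, 0.10, 1000, TWO_YEARS_HOURS)
pricey_rig = electricity_share(500, 0.10, 5000, TWO_YEARS_HOURS)

print(f"cheap rig:  {cheap_rig:.0%} of hourly cost is electricity")
print(f"pricey rig: {pricey_rig:.0%} of hourly cost is electricity")
```

The shorter the useful life or the higher the purchase price, the more hardware rather than electricity dominates the total cost in this simple model.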