The ASIC can choose to use DRAM instead and amortize the power consumption over the row buffer.
Here's a possible concrete design of what you are hinting at:
The ASIC will have as many (pipelined) siphash circuits as are needed to max out the memory
bandwidth of all the DRAM you're gonna hook up to it.
For each row on each memory bank, the ASIC has a queue of in-row counter-indices.
Each entry in the queue thus represents a "stalled thread".
The siphash circuits fill up these queues at a mostly uniform rate.
Once a queue fills up it is flushed to one of the memory controllers on the ASIC that
will perform the corresponding atomic updates on the memory row.
This looks pretty efficient, but all those queues together actually take up a lot of memory,
which the ASIC needs to access randomly.
So it seems we've just recreated the problem we were trying to solve.
But we can aim for the total queue memory to be maybe 1% of the total DRAM.
There's a lot of parameters here and many different ways to optimize with
combinations of speed, energy cost, and hardware cost...
Right we can't queue any where near 2^14 counter indices per row, else nothing gained. Good point.
If the compute units are much more efficient and faster than we can fully leverage within the memory bandwidth and optimal counter indices queue limits, we could also consider enumerate multiple nonces and discarding those that don't match a subset of page rows, i.e. process the cuckoo table in chunks with multiple passes. So that decreases the effective memory size for each pass. The edge pruning would need to be done after all passes. Can't the pruning be done without random access?
Basically what I am driving at is we can't rely on the row buffer load power cost being greater than only 1 hash. We'd need wider margins to be very sure of ASIC resistance, such as 1024 row buffer loads for 1 hash.

tromp you are a smart guy, so please don't go publishing my idea before I get to market. You might be able to deduce what I am up to based on the prior paragraph.
You'll receive proper credit as you deserve it. This is your invention.
Don't ruin your future fame by deducing and publishing what I am up to prematurely, given the liberal hints I have given you and then ruin the ability to leverage in the market with first mover advantage. Some shitcoin take the idea and run with it, then the impact will be diluted. Please for the love of God, don't have start having delusions that Blockstream is your friend. Do you enjoy being used.