I agree with Hatchy: two speed options, fast and slow, would be enough. A "fast" option would mean paying a premium fee, ensuring inclusion in the next few blocks. A "slow" option would mean being part of the regular batched transactions.
The second suggestion, using multiple hot wallets, may not be an ideal solution because it increases the attack surface and adds layers of complexity. I think better UTXO management would be a smarter solution. By creating multiple, roughly equal-sized UTXOs, you can select the most appropriate number of UTXOs for the exact amount being withdrawn, minimizing the size of the transaction (and thus the fee) for a given speed. This will also allow you to service a "fast" withdrawal without necessarily needing to wait for a change output from a larger, slower batch transaction.
I think most exchanges consolidate UTXOs to reduce the number of inputs and therefore the transaction size. Even if I ran some kind of periodic deconsolidation, it would still hit my pocket hard, because I would have to spend more on network fees to keep UTXOs at a certain size. And even then, there is no guarantee that users will request withdrawals matching those sizes. Once we've got no UTXOs left for fast withdrawals, I would have to either use the UTXOs reserved for medium to serve the fast users, or put the fast users on hold and process the medium user's withdrawal first, because technically the medium user requested before the fast user. If I don't follow this rule, those on medium would stay pending until all fast withdrawals are processed, and if there are thousands of withdrawals, it could even take days for a medium withdrawal to be processed.
Maybe you didn't understand what I meant. I'm not suggesting a constant, periodic deconsolidation that would rack up fees. My idea is more about intelligently structuring your UTXOs during the initial consolidation process, rather than focusing strictly on making everything one big input.
When a withdrawal request comes in for a specific amount, if you only have large UTXOs, you are forced to spend one, get a large change output, and then potentially wait for that change to confirm before you can send another fast withdrawal right after. This is where the bottleneck for "fast" withdrawals comes in.
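To make the bottleneck concrete, here is a rough sketch in Python. It assumes a wallet policy that only spends confirmed outputs. The `Utxo` class, the function names and the 1,000 sat fee are all made up for illustration and are not any real wallet's API.

```python
from dataclasses import dataclass


@dataclass
class Utxo:
    amount_sats: int
    confirmed: bool


def spend_single(utxos, amount_sats, fee_sats):
    """Spend the first confirmed UTXO that covers amount + fee.

    Returns the new wallet state. The change output comes back as
    unconfirmed, which is the assumption that creates the bottleneck.
    """
    need = amount_sats + fee_sats
    for i, utxo in enumerate(utxos):
        if utxo.confirmed and utxo.amount_sats >= need:
            remaining = utxos[:i] + utxos[i + 1:]
            change_sats = utxo.amount_sats - need
            if change_sats > 0:
                remaining.append(Utxo(change_sats, confirmed=False))
            return remaining
    raise ValueError("no confirmed UTXO large enough")


def confirmed_balance(utxos):
    """Sum of the outputs that can be spent right now under this policy."""
    return sum(u.amount_sats for u in utxos if u.confirmed)


# One big 1 BTC UTXO, then a 0.3 BTC withdrawal with a hypothetical fee.
wallet = [Utxo(100_000_000, confirmed=True)]
wallet = spend_single(wallet, 30_000_000, 1_000)
```

After that single withdrawal the wallet still holds almost 0.7 BTC, but all of it sits in unconfirmed change, so `confirmed_balance(wallet)` is zero and the next fast withdrawal has nothing to spend until a block confirms it.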
But if you break up your wallet balance into a bunch of smaller, equal-sized UTXOs up front (say ten 0.1 BTC UTXOs instead of a single 1 BTC UTXO), you get way more flexibility. If someone wants a 0.3 BTC withdrawal, you just grab three of those 0.1 BTC UTXOs, keep the transaction small, and don't have as much change output to deal with. This means you don't have to wait as long for change to confirm before you've got more spendable funds to handle the next withdrawal. The core idea here is to minimize the need for change outputs.
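The 0.3 BTC example above could be sketched like this in Python. It's only an illustration with assumptions baked in: every UTXO in the pool has the same denomination, the network fee is deducted from the amount the user receives (otherwise three 0.1 BTC UTXOs would not quite cover 0.3 BTC plus the fee), and the function name and fee value are hypothetical.

```python
def pick_equal_utxos(available, denomination_sats, amount_sats, fee_sats):
    """Decide how many equal-sized UTXOs to spend for one withdrawal.

    Assumes the fee is taken out of the requested amount, so the inputs
    only need to cover amount_sats. Returns (inputs_used, payout, change).
    """
    if amount_sats <= fee_sats:
        raise ValueError("withdrawal does not cover the fee")
    # Ceiling division using integers only, to avoid float rounding.
    inputs_used = -(-amount_sats // denomination_sats)
    if inputs_used > available:
        raise ValueError("not enough UTXOs in the pool")
    change_sats = inputs_used * denomination_sats - amount_sats
    payout_sats = amount_sats - fee_sats
    return inputs_used, payout_sats, change_sats


# Ten 0.1 BTC UTXOs, a 0.3 BTC withdrawal, hypothetical 1,000 sat fee.
exact = pick_equal_utxos(10, 10_000_000, 30_000_000, 1_000)

# A 0.25 BTC withdrawal does not line up with the denomination.
uneven = pick_equal_utxos(10, 10_000_000, 25_000_000, 1_000)
```

In the first case three UTXOs are spent with no change at all, and the other seven stay confirmed and ready for the next request. In the second case three UTXOs are still used, but 0.05 BTC comes back as change, so change is reduced rather than eliminated when amounts don't match the denomination.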