Isn't it just about 30-40 bytes per input?
No, those are outputs (34 bytes each). Inputs are just over 4 times bigger with compressed public keys (about 148 bytes) and about 5 times bigger with uncompressed ones (about 180 bytes).
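To put rough numbers on that, here is a minimal sketch of a size and fee estimate. It assumes a legacy pay-to-pubkey-hash transaction with the commonly quoted approximate sizes (about 10 bytes of overhead, 148 or 180 bytes per input, 34 bytes per output). The function names and the fee rate in the usage note are made up for illustration, and real sizes vary by a byte or so per signature.

```python
# Approximate byte sizes for a legacy P2PKH transaction (assumed typical values).
OVERHEAD_BYTES = 10             # version, input/output counts, locktime
OUTPUT_BYTES = 34               # one P2PKH output
INPUT_BYTES_COMPRESSED = 148    # input spending with a compressed public key
INPUT_BYTES_UNCOMPRESSED = 180  # input spending with an uncompressed public key


def estimate_tx_size(n_inputs, n_outputs, compressed=True):
    """Return an approximate transaction size in bytes."""
    per_input = INPUT_BYTES_COMPRESSED if compressed else INPUT_BYTES_UNCOMPRESSED
    return OVERHEAD_BYTES + n_inputs * per_input + n_outputs * OUTPUT_BYTES


def estimate_fee(n_inputs, n_outputs, sat_per_byte, compressed=True):
    """Return an approximate fee in satoshis at the given fee rate."""
    return estimate_tx_size(n_inputs, n_outputs, compressed) * sat_per_byte


if __name__ == "__main__":
    # Hypothetical example: 10 small inputs, 2 outputs, 100 satoshis per byte.
    print(estimate_tx_size(10, 2), "bytes")
    print(estimate_fee(10, 2, 100), "satoshis")
```

The point is that the number of inputs dominates the size: a wallet holding many small payments has to spend many inputs, so the fee depends far more on that than on the amount being sent.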
Regarding fee estimates on sites, I get the sense that their "recommended" is usually the highest seen on the network?
Correct. And most wallets also aim to be in the next block with "dynamic fees". The result is that "recommended fees" go up if the next block isn't found fast enough. In the end, blocks are still full, transactions are still slow, and we all pay more.
So what is the solution for me to get out of this? I'm ready to pay as much as $75 to anybody who can get this sorted out for me.