The difference between an 8 byte salt and a 4 byte salt isn't 4 bytes, and it isn't a factor of 2. It is a factor of over 4 billion. It is the difference between "Got a minute?" and "Sometime after the end of the universe".
In reality, it is probably not that big of a deal. Precomputing AES tables isn't a trivial task, and even 4 or 5 bits of salt is probably enough to make it impractical for well beyond the potential lifetime of one of these protected keys.
But cryptosystems are designed with defense in depth for very good reasons. When a weakness in one part is discovered, it is usually not a catastrophe because we have added redundant security measures.
You make some very good points about this.
In terms of the purpose of the salt, what would be the usage model for any sort of precomputed table for this problem? It seems like it would be limited to creating a list of AES keys derived from possible passwords, each of which would have to be tested, still requiring exponential time in the length of the password, but allowing the attacker to circumvent running the expensive PBKDF function for each test. Is there something sneaker and more devious that an attacker could do with this particular problem? I don't think this is like trying to build a table for a hash function, where the outputs depend on only the password and the salt, in which case the attacker can index the outputs to the inputs and build a true lookup table or a rainbow table.
Anyway, if that's true, then at 48 bytes per AES key + IV, and 2^32 salt variations of each, the attacker would need to be able to store 192GB per precomputed password. At least, without some clever way of compressing the keys. That is a lot easier than 2^64 variants of each, but still seems intractable.
The tradeoff to having four extra bytes of hash would be six extra characters in the encoded output. Example side-by-side is below:
PAYxy8B9zhSGpbyC7Z29KRS7rsPzAcTF3zSPmyAigbAJYamt9GmNwsFT9E95dGfp6Ri
3W3qzi2Sow4o7LP7jwwnrmqiiGxyMcwLdubKxYhNy4J63eigN3jz7avYqSWfGwxz3nMDDXwwcEdit: It's possible, using a smaller value for the type byte, to reduce it to five extra characters.
Your point about weakness mitigation is well taken though, and the tradeoff may be worth it.
EDIT: I would even recommend that fewer bits went into the password check. Like, maybe as few as 8. This will make it even more expensive to brute force passwords, as it will result in an enormous number of false positives that can only be checked by doing a time-consuming EC multiply (to derive the bitcoin address) and a database lookup (to see if there are any coins at that address, only to find there are none). In the event of a typo, even with 8 bits, still a >99% chance it will still be caught. The password check as it stands is probably just a gift to an attacker.
The PBKDF function here requires 8192 rounds of the SHA256 hash function, so it should already be a lot more expensive than computing the bitcoin address and looking it up in a list. Even if the attacker was using a table, having to do an EC operation as often as every 2^8 keys probably won't cause a big slowdown. But, it will make the password cracker a bit more complicated.
For the one person who gets a letter of their password wrong, and is unlucky enough to have it silently accepted and mapped to the wrong account, and either the bitcoin address isn't known or isn't compared to the expected value, they better hope the password-protected version of their key wasn't discarded.