the flaw of Bitcoin design is not the GPUs, it is the mining pools. They completely invalidate the initial assumption thatr every Bitcoin participator is contributing to the network security. With the pools, only pool owners do. Currently the end miners don't need a Bitcoin client and can't even know for sure which network they are mining for...
There's no need for that, the "deepbit security problem" is only because of an implementation detail. Currently the pool both handles payments and generates getwork, but there's no need for this to be the case. In theory miners can generate work themselves or get it from another node and still mine for the pool. Also, things like p2pool (as a substrate for proxy pools) can do away with the need for giant centralized pools to reduce variance.
Also, to make it easier, we could separate hashing pools from work servers. Pools get signed work units from work servers and pass work from a random source to each miner. Ordinary mining tools can be used, but in order to make sure the pool operator is honest, mining software can support requesting specific channels (round-robin style) or keep a list of signatures in order to verify received work units. This has other advantages like allowing work servers to add a small fee of their own, which would motivate running persistent nodes when running a node becomes a professional business. Most work servers would be operated by respectable Bitcoin businesses (banks, etc.) that would benefit from additional promotion, so fees would probably be close to 0 anyway.
You could do something like that. Have two kinds of blocks, call them "CPU blocks" and "GPU blocks", each using a hashing function with corresponding friendliness. They're mostly equivalent for purposes of confirming transactions and they exist on the same chain (a GPU block can reference a CPU block, etc.). Each type will have its own difficulty, where the difficulties are targeted so that 50% of blocks are GPU and 50% are CPU. There is an additional rule that a block is invalid if the last 8 blocks were the same type as it. So a botnet dominating the CPU blocks or a cluster dominating the GPU blocks can't do much because it can't generate a branch longer than 8 blocks (if someone manages to do both there's still a problem). You can change the recommended waiting period from 6 to 10 blocks.
I think you shouldn't artificially balance targets. They would both converge to a point where both are at the same level of profitability, regardless of the advances in technology. Requiring an alternate every n blocks makes sense though...