CUDA optimization would be #1, followed by Stratum support, then failover, in my opinion.
I concur with this ranking.
I use the Linux client on most of my cards. The only exception is the K20s, when they are free. 330 MH/s on BTC is still better there.