A good start would be a complete CUDA engine with "4.0+" capabilities. What does that 4.0+ mean? A device with uncrippled integer performance (my 760 is crippled because the shifters only exist on 3.5???) that, without hardware assist, already offers 500Kh/s scrypt for 200 (compared with the 470Kh/s at 170 for the R9-270). The '+' means dedicated transistors for one or more SHA2 units (if I have a damn clue what I'm talking about) and whatever other acceleration is possible without being too costly.
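To put the two figures side by side, here is a rough sketch of the throughput-per-cost arithmetic. It assumes the 200 and 170 are in the same unit (the numbers above don't say whether that's price or watts), and the helper name `khs_per_unit` is just made up for illustration.

```python
def khs_per_unit(hashrate_khs: float, cost: float) -> float:
    """Return scrypt throughput per unit of cost (kH/s per unit)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return hashrate_khs / cost


# Figures quoted above; the unit of the second number is an assumption
# (same unit for both cards, whatever it is).
cards = {
    "hypothetical CUDA card": (500.0, 200.0),
    "R9-270": (470.0, 170.0),
}

ratios = {name: khs_per_unit(h, c) for name, (h, c) in cards.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} kH/s per unit")
```

Swap in real prices or wattages once they're known; the comparison only makes sense if both cards use the same unit.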
Then, resale value and power usage. For resale value, perhaps video out is necessary. At worst I can still sell it as a used videocard, unlike those Tesla server units. Power usage means unlocked undervolting, power/performance profiles, etc...