I haven't followed Bitcoin since before ASICs took over, back when people were trying to squeeze every optimization they could out of the hashing algorithm in the GPU miners.
I didn't know where to put this thread, but my question is about how ASICs are optimized. If I managed to find an extremely good optimization that somehow halved the instruction count needed for a double hash, would that be interesting to ASIC designers? I would think it would at least cut power consumption. If not halving, what ratio is interesting? A tenth?
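For anyone unfamiliar with the term, the "double hash" here is Bitcoin's SHA-256d: SHA-256 applied twice in a row. A minimal sketch in Python (the all-zero header below is just a placeholder, not real block data):

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    # Bitcoin's "double hash" (SHA-256d): hash the data, then hash the digest.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

# Miners repeatedly double-hash an 80-byte block header while varying a nonce,
# looking for a digest that falls below the current difficulty target.
header = bytes(80)  # placeholder header, all zeros
print(double_sha256(header).hex())
```

Any instruction-count savings would apply to both SHA-256 passes, which is why even a modest percentage matters at mining scale.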