ASICs (application-specific integrated circuits) can only perform one task, but they perform that one task extremely efficiently.
You can imagine it like this:
You apply power to the correct input pins (this is basically the input), and after the electricity has gone through the hardware you measure which output bits are set (the output; effectively the hash).
The time needed to calculate a hash is roughly the time the electricity needs to run through the hardware (not exactly, but basically).
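To make the "pins in, bits out" picture concrete, here is a rough software sketch of the same idea. It is only a model, not how any real chip is built: the function name `circuit` is made up for illustration, and SHA-256 is an assumed stand-in, since the hash function is not named above.

```python
import hashlib

def circuit(input_bits: str) -> str:
    """Model a fixed-function hash circuit: input pin levels in, output pin levels out.

    Hypothetical helper for illustration; SHA-256 is an assumed stand-in hash.
    """
    if len(input_bits) % 8 != 0 or set(input_bits) - {"0", "1"}:
        raise ValueError("expected a string of 0/1 whose length is a multiple of 8")
    # "Set the pins": pack the bit pattern into bytes.
    data = int(input_bits, 2).to_bytes(len(input_bits) // 8, "big") if input_bits else b""
    # The fixed logic: the same computation every time, nothing else is possible.
    digest = hashlib.sha256(data).digest()
    # "Measure the output bits": one character per output pin.
    return "".join(f"{byte:08b}" for byte in digest)

if __name__ == "__main__":
    out = circuit("01100001")  # the bit pattern of the ASCII letter "a"
    print(len(out), "output bits")
    print(out[:32], "...")
```

In software, every call runs through many instructions one after another. In the hardware picture, the same input-to-output mapping is wired in, which is why the time per hash comes down to how long the signal takes to pass through the circuit.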
Optimization is always good (and desired). But how do you think you achieved this optimization? Software-wise there isn't much you can do, because the computation is fixed in the circuit.
You would need to optimize the hardware itself.