I have personally designed an X11 ASIC (well, the RTL only), which can be used for either an FPGA or an ASIC implementation. It includes a mechanism to account for the variation in performance between hash types. Unfortunately, the economics are just not there yet to get this turned into a layout.
This is way beyond anything we did. I would love to see how far you have got. Do you have the program or the simulator results? Could you tell what sort of speeds were predicted?
As this work was financed, all I can do is provide a sanitized screenshot. This is a simulation of the FPGA version of the design using a UART interface:
http://imgur.com/iD2nqsm
Also note that my design has a number of compile-time parameters/constants which affect the performance/throughput. The above simulation is for a design with basic settings only and a nominal clock rate, so the time intervals shown do not reflect expected performance.