Ok. Let me try to explain some details about how the hashing works, and why either an FPGA or CPU can work, so that we can move beyond this.
In order for the hash engine to work it needs 3 pieces of data:
( midstate, fixed_data, nonce )
Midstate and fixed_data are provided for each work unit by cgminer.
Nonce is a 32-bit value that is counted up through its whole range so that every value is hashed, i.e. a maximum of 2^32 (4 billion odd) hashes per work unit.
When cgminer sends a work unit to the Avalon it sends some control info (like fan speeds, module number, ASIC count), then the (midstate, fixed_data), and then it appends the starting nonce value for each ASIC in the target module's chain. So for Avalon that is 10 copies of the nonce start value. This all comes down USB and into the FTDI chip, which sends it into the FPGA.
The FPGA accepts this in a super long shift register and acts on it. It has to send a stream of data for each ASIC in the chain - so that means repeating the ( midstate, fixed_data, start_nonce_value ) serially into the data input of the ASICs. This is quite a bit of data actually:
( 256 bits midstate, 96 bits fixed_data, 32 bits nonce ) x 10 ASICs in chain = 3840 bits.
The FPGA holds the midstate+fixed_data and repeats it for each ASIC, appending that ASIC's nonce_start each time.
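To make that bit count concrete, here is a rough Python sketch of the stream that ends up going into one chain. The function name, the field order on the wire and the big-endian nonce encoding are my assumptions for illustration only; the real Avalon bit order may well differ.

```python
MIDSTATE_BYTES = 32    # 256 bits
FIXED_DATA_BYTES = 12  # 96 bits
NONCE_BYTES = 4        # 32 bits


def build_chain_stream(midstate, fixed_data, nonce_starts):
    """Repeat (midstate, fixed_data, nonce_start) once per ASIC in the chain."""
    if len(midstate) != MIDSTATE_BYTES:
        raise ValueError("midstate must be 32 bytes")
    if len(fixed_data) != FIXED_DATA_BYTES:
        raise ValueError("fixed_data must be 12 bytes")
    stream = bytearray()
    for nonce in nonce_starts:
        # Assumption: the nonce goes out most significant byte first.
        stream += midstate + fixed_data + nonce.to_bytes(NONCE_BYTES, "big")
    return bytes(stream)


# Dummy work unit, 10 ASICs as in one Avalon module.
stream = build_chain_stream(bytes(32), bytes(12), [0] * 10)
print(len(stream) * 8)  # 3840
```

The only part that changes from ASIC to ASIC is the last 4 bytes of each 48-byte record, which is why the sender only has to hold one copy of the midstate and fixed_data.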
Avalon uses an FPGA because it is the one central controller in the box, so it has to do this process for each of the 32 modules the box contains.
The way Klondike works is different but results in the same data going into the ASIC.
Klondike has a PIC controller with USB integrated. So it talks to cgminer just like Avalon, but will use a different driver. This driver will send the (midstate, fixed_data) to Klondike via USB and the PIC stores this in its RAM (44 bytes total: 32 bytes of midstate plus 12 bytes of fixed_data). The driver doesn't need to calculate and send the nonce; the PIC will take care of that. So total data sent for each work unit is 44 bytes.
The PIC will take this data in RAM and serially push it into the ASICs and then append a nonce start value (by masking the high 4 bits, thus giving each ASIC a range equal to 1/16th of the total). It will repeat this for each ASIC in the chain. Since it's only managing its own module, it doesn't have to switch and control 31 other modules like the Avalon FPGA.
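As a rough illustration of the 1/16th split, here is a small Python sketch. The mapping of ASIC index to the high 4 bits is my assumption for the example (the function names are made up too); the firmware may assign the prefixes differently.

```python
NONCE_BITS = 32
PREFIX_BITS = 4  # high bits fixed per ASIC, so 2**4 = 16 slices


def nonce_start(asic_index):
    """Start value for one ASIC: its index placed in the high 4 bits."""
    if not 0 <= asic_index < (1 << PREFIX_BITS):
        raise ValueError("asic_index must be 0..15")
    return asic_index << (NONCE_BITS - PREFIX_BITS)


def nonce_slice(asic_index):
    """Inclusive (first, last) nonce covered by one ASIC."""
    first = nonce_start(asic_index)
    last = first | ((1 << (NONCE_BITS - PREFIX_BITS)) - 1)
    return first, last


for i in (0, 1, 15):
    first, last = nonce_slice(i)
    print(f"ASIC {i:2d}: {first:#010x} .. {last:#010x}")
```

Each slice is 2^28 nonces wide, and the 16 slices together cover the full 32-bit range with no overlap.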
The Klondike has 16 ASICs but I have split them into 2 banks of 8 each. This allows pushing the data in twice as fast, and also means that if one ASIC is damaged then only the 8 in its bank cannot function, instead of all 16. While the data is being pushed into an ASIC the hashes it calculates are invalid, so the faster the new work start data is pushed in, the less time the hashing is stalled.
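A quick back-of-envelope sketch of why two banks halve the stall, in Python. It assumes one bit goes in per shift clock and that both banks are loaded in parallel; the 1 MHz shift clock is a purely hypothetical number, not a Klondike figure.

```python
BITS_PER_ASIC = 256 + 96 + 32  # midstate + fixed_data + nonce start


def shift_bits(asics_in_chain):
    """Total bits that must be shifted into one chain for a new work unit."""
    return BITS_PER_ASIC * asics_in_chain


def stall_seconds(asics_in_chain, shift_clock_hz):
    """Time a chain is stalled while loading, assuming one bit per shift clock."""
    return shift_bits(asics_in_chain) / shift_clock_hz


SHIFT_CLOCK_HZ = 1_000_000  # hypothetical value for illustration only

one_chain = stall_seconds(16, SHIFT_CLOCK_HZ)
two_banks = stall_seconds(8, SHIFT_CLOCK_HZ)  # both banks loaded in parallel

print(shift_bits(16), shift_bits(8))  # 6144 3072
print(one_chain / two_banks)          # 2.0
```

Whatever the real shift clock turns out to be, the ratio stays the same, since each bank only needs half as many bits.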
Klondike also performs a few secondary tasks. It sets up a PWM register to control fan speed. Now and then it takes a voltage reading off the thermistor or internal sensor. And it also accepts work data from the USB host that is not intended for its own ASIC chains. A 44 byte work unit can arrive that is for some other module. In this case it simply receives it and sends it right out again on the I2C bus. Since the PIC has a hardware I2C controller, this takes very little code or work.
So the same thing happens in both systems, but in Avalon the FPGA has to handle 32 times more data than the PIC. With 16 ASICs at 282 MHz, a nonce range of 32 bits will take about:
2^32 / 16 / 282,000,000 = 0.95189878 seconds.
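The same arithmetic as a small Python sketch, in case anyone wants to plug in other clock speeds or ASIC counts. Like the formula above, it assumes each ASIC tests one nonce per clock cycle; the function names are just mine.

```python
def sweep_seconds(asic_count, clock_hz, nonce_bits=32):
    """Seconds to exhaust the nonce range, assuming one hash per clock per ASIC."""
    return 2 ** nonce_bits / asic_count / clock_hz


def hash_rate(asic_count, clock_hz):
    """Combined hashes per second under the same one-hash-per-clock assumption."""
    return asic_count * clock_hz


print(round(sweep_seconds(16, 282_000_000), 4))  # 0.9519
print(hash_rate(16, 282_000_000) / 1e9)          # 4.512
```

So a 16 ASIC module at 282 MHz needs a fresh work unit a little more often than once a second.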
The PIC has to receive 44 bytes of data in just under a second for itself, and then repeatedly shift it into the ASICs as initialization for hashing. This should take about 3% or less of its time, depending on how fast the ASIC shifting is done. The other 97% of its time it's waiting for results, relaying data or fiddling with its fan. Since most of these are handled by interrupts, it's basically idling.
I may stick a 320x240 LCD touch screen on the front of my Klondike master so I can see status. The I2C bus would allow this and give the PIC something to do when idle.