I've got a Stratix IV dev board (the GX 230 model) that I could try your code on. It seems like most of the inefficiency is in the mining program. I'm running two cores at 240 MHz and getting around 200-300 Mhash/s.
That's using OrphanGland's code and fpgaminer's mining program.
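
A rough sanity check on those numbers, as a sketch only. It assumes each core is fully unrolled and finishes one hash per clock, which may not hold for this build; `peak_mhash` and `hashes_per_clock` are made-up names for illustration.

```python
def peak_mhash(clock_mhz: float, cores: int, hashes_per_clock: float = 1.0) -> float:
    """Theoretical peak in Mhash/s, given an ASSUMED hashes-per-clock figure per core."""
    return clock_mhz * cores * hashes_per_clock


# Figures from the post: two cores at 240 MHz, 200-300 Mhash/s observed.
peak = peak_mhash(240, 2)
low, high = 200 / peak, 300 / peak

print(f"peak {peak:.0f} Mhash/s, observed {low:.0%} to {high:.0%} of peak")
```

If the one-hash-per-clock assumption is right, the gap between peak and observed would be consistent with the cores sitting idle while the mining program fetches or submits work.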