Tried popping in two more SHA cores to get 2 engines running (fully unrolled). ISE spat out this:
Slice Logic Utilization:
Number of Slice Registers: 92543 out of 184304 50%
Number of Slice LUTs: 121337 out of 92152 131% (*)
Number used as Logic: 113389 out of 92152 123% (*)
Number used as Memory: 7948 out of 21680 36%
Number used as SRL: 7948
So it looks like, without a little massaging, the current design doesn't fit: it needs 131% of the available slice LUTs (121337 used vs. 92152 on the part), even though registers are only at 50%.
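For anyone wanting to sanity-check the numbers, here is a minimal sketch that redoes the arithmetic from the report above. The figures are copied straight from the ISE output; the helper names (`utilization_pct`, `overshoot`) are made up for illustration, and the truncating percentage is an assumption based on how the reported values line up (7948 / 21680 is about 36.7%, which ISE shows as 36%).

```python
# (used, available) pairs copied from the ISE utilization report above.
report = {
    "Slice Registers": (92543, 184304),
    "Slice LUTs": (121337, 92152),
    "LUTs used as Logic": (113389, 92152),
    "LUTs used as Memory": (7948, 21680),
}


def utilization_pct(used, available):
    """Integer utilization percentage, truncated (assumed to match ISE)."""
    return used * 100 // available


def overshoot(used, available):
    """How many units the design needs beyond what the device has (0 if it fits)."""
    return max(0, used - available)


if __name__ == "__main__":
    for name, (used, available) in report.items():
        pct = utilization_pct(used, available)
        over = overshoot(used, available)
        status = f"over by {over}" if over else "fits"
        print(f"{name}: {used} / {available} = {pct}% ({status})")
```

So the design is roughly 29k LUTs over budget, which is the amount that has to be trimmed (or traded for a less unrolled design) before it will map.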
I'm gonna try 2 hashing engines (4 cores) running at log_level2. That should fit, and then I'll see how far I can push the clock while it still routes.