I figured out how he got 17 GH/s. It's 34 cores at 35K LUTs each, with a reduced bus width (not the full 1600 bits), running at 500 MHz. He only placed registers at the start of each round, not inside the round. That, plus some floorplanning to keep the Fmax high. I could probably hit the same number now that I see how it's done.
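Quick sanity check on those numbers, as a rough sketch. The one-hash-per-clock-per-core figure is my assumption (it is what the 17 GH/s total implies), and the constant and function names are just made up for illustration:

```python
# Back-of-the-envelope check of the 17 GH/s figure.
CORES = 34
LUTS_PER_CORE = 35_000
CLOCK_HZ = 500_000_000   # 500 MHz
HASHES_PER_CLOCK = 1     # ASSUMPTION: each core finishes one hash per cycle


def throughput_hs(cores: int, clock_hz: int, hashes_per_clock: int) -> int:
    """Aggregate hash rate in hashes per second."""
    return cores * clock_hz * hashes_per_clock


total_hs = throughput_hs(CORES, CLOCK_HZ, HASHES_PER_CLOCK)
total_luts = CORES * LUTS_PER_CORE

print(f"{total_hs / 1e9:.1f} GH/s using {total_luts:,} LUTs")
```

So the total only works out if every core sustains one result per clock at 500 MHz, and the whole design needs roughly 1.19M LUTs for the cores alone.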
