I believe the Hashfast design ended up in the hands of their silicon integrator, which did the design work in the first place. Similarly, Terrahash went bankrupt, so their design IP ended up in play and is probably held by someone who assigns it a modest value. As for BFL, I don't know what became of them, but they had a design.
The catch is that all of these designs were written in VHDL and synthesized to standard cell libraries.
Bitfury clearly demonstrated that laying out at the transistor level gave a massive advantage in power efficiency. His 55 nm chip performed better than the 28 nm generation. It was also buggy as hell.
I don't think it's feasible to deliver a power-competitive design with standard cells. You will need to start with a transistor-level design of an unrolled hashing core. From there, further power optimizations are likely possible. The design will probably need to be optimized thermally as well, to limit hot spots.
Delivering a working SHA-256 hash core isn't all that hard. Being competitive on power efficiency will be difficult. I doubt you can realistically expect to be within 20% of Bitmain on a given node until your third or fourth generation.
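To make the "not all that hard" part concrete, here is a rough software reference model of what a hash core has to compute: double SHA-256 over an 80-byte block header, compared against the target. This is only a functional sketch in Python, not a hardware description, and it says nothing about power. The function names are my own, and the example values are the publicly known Bitcoin genesis block parameters.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as used for Bitcoin block hashing."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' field into the full 256-bit target.

    Assumes the exponent byte is at least 3, which holds for real blocks.
    """
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    return mantissa << (8 * (exponent - 3))


def build_header(version: int, prev_hash_hex: str, merkle_root_hex: str,
                 timestamp: int, bits: int, nonce: int) -> bytes:
    """Serialize the 80-byte header; hashes are byte-reversed from display order."""
    return (
        struct.pack("<I", version)
        + bytes.fromhex(prev_hash_hex)[::-1]
        + bytes.fromhex(merkle_root_hex)[::-1]
        + struct.pack("<III", timestamp, bits, nonce)
    )


def check_header(header: bytes, bits: int) -> tuple[str, bool]:
    """Return the hash in display (big-endian hex) order and whether it meets the target."""
    digest = double_sha256(header)
    value = int.from_bytes(digest, "little")
    return digest[::-1].hex(), value <= bits_to_target(bits)


if __name__ == "__main__":
    # Genesis block parameters, used here purely as a known test vector.
    genesis = build_header(
        version=1,
        prev_hash_hex="00" * 32,
        merkle_root_hex="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
        timestamp=1231006505,
        bits=0x1D00FFFF,
        nonce=2083236893,
    )
    block_hash, ok = check_header(genesis, 0x1D00FFFF)
    print(block_hash, ok)
```

A hardware core does the same computation with the 64 rounds of each SHA-256 pass laid out as a pipeline and the nonce swept in parallel. Getting that functionally right is the easy part; the power numbers are where the transistor-level work comes in.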
Good luck, but I think you will find your money would be better invested convincing a major player like AMD or NVIDIA to develop a solution.