ok, so I'm not crazy. Yeah, I got to 140% LUT usage pre-routing, and ISE has been at the Map stage for something like 18 hours so far... gonna let it run. I've got a quad core, so doing parallel compiles is feasible. The first global placement run took 4 hours for me, and it's been stuck on the 2nd global placement run since then, probably around 19 hours total running time so far. What's sad is this is the Map phase... Place and Route has yet to run =(
The other thing I've been told is your FPGA should never really exceed 60-70% usage pre-routing, because a lot of resources are needed to get high-speed routing done. And trying to pack 2 engines in there is likely nearing more like 90% usage...
In terms of pipelining, it's not so much the Hasher blocks I don't understand, it's the signals feeding into them. For example, what are the different-length shift registers for? What is the definition of cur_w#, and why do they need different lengths? More specifically, what do the specific lengths correlate to? The previous hasher's output? And in a fully unrolled loop, why are shift registers even necessary? Shouldn't each hasher's digester essentially have the "register" of the state in there?
edit: ohh wait nm, that's the message scheduler!
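To check my own understanding of the message scheduler, here's a rough Python sketch of the SHA-256 message schedule done two ways: once with plain array indexing, and once with a 16-word sliding window (basically a shift register). In the windowed version the only taps used are positions 0, 1, 9 and 14. My guess, and it is only a guess from the names, is that cur_w0, cur_w1, cur_w9 and cur_w14 are those four taps. The function names here are made up by me and are not from the Verilog source.

```python
MASK = 0xFFFFFFFF  # keep everything to 32 bits


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def small_sigma0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def small_sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def schedule_indexed(block_words):
    """Textbook form: W[t] = s1(W[t-2]) + W[t-7] + s0(W[t-15]) + W[t-16]."""
    w = list(block_words)
    for t in range(16, 64):
        w.append((small_sigma1(w[t - 2]) + w[t - 7]
                  + small_sigma0(w[t - 15]) + w[t - 16]) & MASK)
    return w


def schedule_windowed(block_words):
    """Same schedule, but using a 16-word window that shifts by one each step.

    window[0] is the oldest word (W[t-16]), window[15] the newest (W[t-1]).
    Assumption: taps 0, 1, 9, 14 are what the cur_w0/w1/w9/w14 names refer to.
    """
    window = list(block_words)
    out = list(block_words)
    for _ in range(48):
        new_w = (small_sigma1(window[14]) + window[9]
                 + small_sigma0(window[1]) + window[0]) & MASK
        out.append(new_w)
        window = window[1:] + [new_w]  # shift: drop oldest, append newest
    return out
```

If that mapping is right, then in a fully unrolled pipeline the different shift register lengths would just be the different delays needed to get each tap to the stage that consumes it. I haven't confirmed that against the actual code.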
Maybe not a full block diagram outlining all the pipelined stages, but more of a "cell" diagram of a Hasher in terms of i. E.g. Hasher i has Hasher i-1's cur_w0 connected to its input. Something like that might help me figure out exactly what's going on.
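The kind of "cell" view I mean would look something like this in software terms. It's a sketch of one SHA-256 round written as a pure function: the state comes in from stage i-1, along with the schedule word and round constant for stage i, and the state for stage i+1 comes out. The name hasher_cell and the argument names are mine and hypothetical, not the Verilog module's ports. I'm also assuming the _t1 suffix refers to the T1 temporary from the spec, which I haven't verified.

```python
MASK = 0xFFFFFFFF  # keep everything to 32 bits


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def hasher_cell(state, w_i, k_i):
    """One SHA-256 round: takes stage i-1's state, returns stage i's state.

    state is the 8-tuple (a, b, c, d, e, f, g, h) of 32-bit words,
    w_i is the message schedule word for this round, and
    k_i is the round constant for this round.
    """
    a, b, c, d, e, f, g, h = state
    big_sigma1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
    ch = ((e & f) ^ (~e & g)) & MASK
    t1 = (h + big_sigma1 + ch + k_i + w_i) & MASK  # T1 in the spec
    big_sigma0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
    maj = (a & b) ^ (a & c) ^ (b & c)
    t2 = (big_sigma0 + maj) & MASK                 # T2 in the spec
    # Six of the eight words just shift down one slot; only a and e are new.
    return ((t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g)
```

Seen this way, a fully unrolled design is 64 of these cells chained together, with a pipeline register between each pair, and the only per-stage inputs besides the previous state are w_i and k_i.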
And I guess that loops into your question on specifying signal names. Personally I would have some sort of prefix on every wire/reg. In VHDL there is no distinction, and the behavior (wire or reg) is inferred through the design - e.g. a signal that is assigned a value in a clocked process ==> register. And in VHDL I usually prefix all my signal names with sig_, as in sig_XXXXX. One of the problems I have with single-letter variable names is that they are impossible to search the document for. So you have a variable K - want to see how hard it is to search a document for references to the letter K? If every K was instead sig_K, it would be much easier to search the document to find references. Basically any single-letter variable name is bad IMO.
Some of the other signal names might be a mix of non-detailed name + my inexperience with the SHA algorithm. For example, wtf does cur_w1 mean? I understand that _fb = feedback. But I don't know what w1 or w0 or w14 or w9 do. Also, I'm unsure what _w means, or _w1, or _t1. Or the cur_ prefix - not exactly sure what that means either.
And although it may be easy and quick to type, the shift register definition also has 2 single-letter registers, r and m. This one isn't as bad because that stuff is internal, but imagine what a pain in the ass it is when you get a synthesis info/warning about some variable m. Now I gotta search through all the source files by hand to look for a register m, because I can't just search for "m" in all the documents and get anything useful...
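To put a number on the search problem, here's a quick Python sketch. The Verilog fragment in it is made up by me purely for illustration (it is not the real shift register module), and sig_m is just my own naming convention applied to it.

```python
import re

# Hypothetical Verilog fragment, invented for this example only.
source = """
module shifter (input clk, input [31:0] din, output [31:0] dout);
  reg [31:0] m;
  reg [31:0] r;
  always @(posedge clk) begin
    m <= din;
    r <= m;
  end
  assign dout = r;
endmodule
"""


def count_hits(text, name):
    """Plain substring search, like a basic find-in-files."""
    return text.count(name)


def count_word_hits(text, name):
    """Whole-word search, for editors that support it."""
    pattern = r"\b" + re.escape(name) + r"\b"
    return len(re.findall(pattern, text))


# Same fragment with the register renamed using a sig_ prefix.
renamed = re.sub(r"\bm\b", "sig_m", source)

print("substring hits for 'm':    ", count_hits(source, "m"))
print("whole-word hits for 'm':   ", count_word_hits(source, "m"))
print("substring hits for 'sig_m':", count_hits(renamed, "sig_m"))
```

A plain search for "m" also matches keywords like module and endmodule, while a plain search for sig_m only matches the register. Whole-word search helps inside one file, but across a whole project it still lands on every other module that happens to have its own m.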
It might also help to organize the wire/reg definitions a little bit better. The way it is now, definitions are strewn throughout the code. I always prefer having my wire/reg/input/output declarations at the top of the module, like in software coding. It may also help to separate out the modules a little bit more. The sha256_transform is so complicated already - maybe move things like the digester or shift registers out into their own source files. That way the root sha256_transform module is more of a connectivity/hierarchy module defining the structure, not the function, of the sha256 transform.
But truthfully my understanding of the sha256 algorithm and its pipelined version is probably a little bit lacking, and that is not helping me understand the code/flow.