Re: SHA256d IC design question

Quote from: HyperMega on March 26, 2018, 07:15:38 PM

These numbers are not based on completely ideal assumptions. They are based on the fact that the part of the pipeline, which outputs could be reused by other cores, counts for about 25% of the overall core logic of a single core.

Ok, you are right, the FO/load cap of the reused bits is increased by feeding multiple cores. But the reused outputs are only 32 bits in contrast to a 512 bit wide pipeline without increased FO, implemented only once.

I haven't read the full patent application, but I understand how such applications are written with the goal of withstanding the adversarial claim/counter-claim legal system in the USA and other anglophone countries. So I can confidently repeat: you are wrong, these numbers intentionally use idealized, abstract algebraic models to make a strong patent application. The whitepaper is just a marketing brief for the patent. It isn't a scientific report in an applied science field.

In the next paragraph you use the term "512-bit wide pipeline". This is just such nice marketing speak. SHA256 is actually a 16-stage 32-bit wide shift register with some fancy feedback terms. Its re-invention as a 16*32=512 bit vector pipeline is nothing more than a workaround for the bugs/design flaws in the front-end Verilog tools preferred on the West Coast of the USA. If the design were done in VHDL (as preferred by East Coast USA boutiques) there would be no need for that trick of making 32-bit slices out of a 512-bit vector.
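To make the shift-register view concrete, here is a minimal Python sketch of the SHA-256 message schedule only (not the full compression function), written once as a 16-deep, 32-bit wide shift register and once as the usual indexed array. The recurrence is the standard one from the SHA-256 definition; the function names are mine and purely illustrative, and this is a software model, not anybody's RTL.

```python
MASK = 0xFFFFFFFF  # keep everything 32 bits wide

def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK

def sigma0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)

def sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)

def schedule_shift_register(block_words):
    """W[0..63] produced by a 16-stage, 32-bit wide shift register."""
    reg = list(block_words)  # reg[0] is the oldest word in the register
    out = []
    for _ in range(64):
        out.append(reg[0])  # the word leaving the register this step
        # The "fancy feedback terms": taps at stages 14, 9, 1 and 0.
        feedback = (sigma1(reg[14]) + reg[9] + sigma0(reg[1]) + reg[0]) & MASK
        reg = reg[1:] + [feedback]  # shift by one 32-bit word
    return out

def schedule_indexed(block_words):
    """The same schedule written as an indexed 64-word array."""
    w = list(block_words)
    for t in range(16, 64):
        w.append((sigma1(w[t - 2]) + w[t - 7] + sigma0(w[t - 15]) + w[t - 16]) & MASK)
    return w

words = [(i * 0x01010101) & MASK for i in range(16)]  # arbitrary test block
same = schedule_shift_register(words) == schedule_indexed(words)
```

Both functions give the same 64 words; the only state the first one ever holds is sixteen 32-bit words.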

No matter which front-end was used, the actual physical layout is very far from the neatness associated with the word "pipeline" and with how e.g. AMD/Intel use it in their marketing literature and die photos.

The physical layout of an unrolled mining engine designed this way very much resembles the snake pit used in my avatar. That happens because the heuristic layout optimization tools cannot find any useful gradient to optimize for, and either fail to converge or converge extremely slowly, resulting in a semi-random rat's nest of long traces.

Quote from: HyperMega on March 26, 2018, 07:15:38 PM

So the gain of an ASICboost duo-core in terms of power efficiency will be a bit less than 12.5%, but not much.

I cut this paragraph into a separate quote because it is a beautiful sample of USDA prime marketing baloney.

Firstly, duo-core was just an example in the whitepaper; Halong's implementation is quad-core. So it is 18.75%, not 12.5%.
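For anyone who wants to check the arithmetic, here is a small Python sketch of the idealized model behind both figures. It assumes, as in the quoted post, that about 25% of a single core's logic can be shared, and that the shared part is built once and spread over n cores with no fan-out or layout penalty. The function name and the formula as written are my reconstruction of that ideal case, not something taken from the whitepaper.

```python
def ideal_boost_saving(shared_fraction, cores):
    """Idealized fraction of logic saved per core when the shared part
    is implemented once for `cores` cores (no fan-out penalty assumed)."""
    return shared_fraction * (1 - 1 / cores)

SHARED = 0.25  # "about 25% of the overall core logic", from the quoted post

duo = ideal_boost_saving(SHARED, 2)   # 0.125  -> 12.5%
quad = ideal_boost_saving(SHARED, 4)  # 0.1875 -> 18.75%
```

These are upper bounds of the abstract model; none of the physical costs discussed in this thread are in it.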

Secondly, you use values of "bit" much less than 2. Such a nice English creative writing trick. How do your values of "bit" compare with manufacturing tolerances, which are about +/-20%?

Thirdly, it is not just about (A) reduction of power use. You neglected to mention:

B) lower clock speed, because a nearly four times larger area needs to be kept in lockstep;
C) lower yield, because the area of mutually dependent logic is increased nearly four-fold.
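Point C can be illustrated with the textbook Poisson yield model, Y = exp(-A * D), where A is the area of logic that must be entirely defect-free and D is the defect density. This is a rough Python sketch under that assumption only; the model choice, the function name, and the numbers are hypothetical illustrations, not data for any real process or chip.

```python
import math

def poisson_yield(area, defect_density):
    """Probability that a block of the given area has zero defects
    under the simple Poisson yield model (an assumed model)."""
    return math.exp(-area * defect_density)

AREA = 1.0   # hypothetical area of one independent core, arbitrary units
D0 = 0.1     # hypothetical defect density per unit area

single = poisson_yield(AREA, D0)       # one independent core
quad = poisson_yield(4 * AREA, D0)     # four mutually dependent cores
```

In this model the four-fold block yields single**4, so the good fraction always drops when the dependent area grows.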

It is quite an achievement in marketing to squeeze 3-way deception into a single sentence. You must be a professional.

Finally, whatever one can say about Bitmain's chip that is ASIC-boost capable, at least it is somewhat honest in implementing switchable levels of ASIC-boost. One could actually measure the actual gains or losses from various levels of boosting and compare them with the table of theoretical values. It isn't as perfect an experiment as designing separate chips for each level of boosting, but it is a better scientific compromise.

All I can guess about Halong's chip is that its design was worked out as some sort of political compromise or attack/defense strategy. I'm definitely not up to speed on the factions currently involved in the Bitcoin internecine warfare.