How many cores are in the chip and how many clock cycles does it take to get a result?
Does each core have its own GPIO, or is there some serial bus that aggregates them?
Does your QFN48 7x7mm package have an exposed thermal pad, and is it on the top or the bottom?
1. 756 double-SHA256 cores. Each is a 61+4 kernel (61 clock cycles of computation plus 4 clock cycles to load).
2. There's an asynchronous 'match' signal, which is the only thing a core sends out, plus some buses to load data.
3. Wirebond. The die is laid normally in the cavity, i.e. it is not flip-chip and is not arranged to give heat into anything but the PCB.
It is actually not complex to dissipate 3W... Maybe even 5W with a metal-core PCB and proper cooling. That's what we'll see.
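For a rough feel of what the core count and cycle count in answer 1 imply, here is a small sketch. It assumes each core delivers one double-SHA256 result every 61+4 = 65 clock cycles, with no overlap between load and computation, and that all 756 cores run in parallel. The clock frequencies in the loop are hypothetical placeholders, not figures for this chip.

```python
# Figures taken from answer 1 above.
CORES = 756
CYCLES_PER_RESULT = 61 + 4  # 61 compute cycles + 4 load cycles


def hashes_per_clock(cores=CORES, cycles=CYCLES_PER_RESULT):
    """Average double-SHA256 results per clock cycle for the whole chip.

    Assumption: one result per core every `cycles` cycles, no overlap.
    """
    return cores / cycles


def hashes_per_second(clock_hz, cores=CORES, cycles=CYCLES_PER_RESULT):
    """Chip-level hash rate at a given clock frequency in Hz."""
    return cores * clock_hz / cycles


if __name__ == "__main__":
    print(f"{hashes_per_clock():.2f} hashes per clock cycle")
    # Hypothetical clock frequencies, for illustration only.
    for mhz in (100, 200, 300):
        ghs = hashes_per_second(mhz * 1e6) / 1e9
        print(f"{mhz} MHz -> {ghs:.2f} GH/s")
```

If the 4 load cycles actually overlap with computation, the divisor would be 61 rather than 65 and the numbers come out about 6.5% higher.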