Forget about that cn-gpu crap. It is much better on Nvidia, and it will be easy to do on FPGA/ASIC if there is any volume.
Yup, it's much better on Nvidia, but floating-point math is not efficient on FPGAs and ASICs.
Are you joking? I read about FPGAs doing scientific floating-point math years ago )) It is what they are designed for ))