Lost another S9 board. This one had a lot of HW errors before it went, so at least I knew it was next in line. Oddly enough, it's hashing in my failed-board miner at 0.4 TH/s with 27 ASICs visible. Weird failure. I'll keep it powered for a week or so to see if its condition changes, but it's not worth running as it is.
So far, since last September (9 months ago):
Lost 3 R4 boards (out of 12 total): one was resuscitated by reprogramming the PIC with a copy from a good board, one was repaired by BitmainWarranty, and one is left as failed and won't be repaired.
Lost 3 S9 boards (out of 12 total): one was repaired by BW, and two are left as failed and won't be repaired.
The low quality of their products does extend the ROI payback time quite a bit, so if your numbers are close to the line I'd suggest rethinking their products. A 25% failure rate in 9 months is far too high.
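For anyone who wants to check the 25% figure or plug in their own counts, here's a rough sketch of the arithmetic using the tallies above. The dictionary layout and names like `fleet` and `rate` are just my own illustration, and "recovered" counts boards that were either reprogrammed or repaired under warranty.

```python
# Board counts taken from the tallies above (9 months of running).
# "recovered" = boards brought back by PIC reprogramming or warranty repair.
fleet = {
    "R4": {"total": 12, "failed": 3, "recovered": 2},
    "S9": {"total": 12, "failed": 3, "recovered": 1},
}


def rate(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


total = sum(m["total"] for m in fleet.values())
failed = sum(m["failed"] for m in fleet.values())
# Boards that failed and were never brought back.
dead = sum(m["failed"] - m["recovered"] for m in fleet.values())

failure_pct = rate(failed, total)
dead_pct = rate(dead, total)

print(f"{failed} of {total} boards failed: {failure_pct:.1f}%")
print(f"{dead} of {total} boards permanently lost: {dead_pct:.1f}%")
```

The second number is the share of boards that stayed dead after repairs, which is the one that actually keeps eating into hashrate.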
In hindsight, they seem to put a "bad" board into each miner to balance out the good boards. Once you weed out the bad boards and consolidate your good ones, the remainder should do just fine.
Seems like Bitmain is happy to sell boards that a normal company would never let past any reasonable QC. By my estimate, 25% of your boards are broken from the date they ship and should have been rejected in QC testing. If you're willing to accept that, then go for it.