Post
Topic
Board Mining support
Re: T17/S17 malfunction: cases, solutions, remedies, RMA history
by
mikeywith
on 24/11/2020, 10:06:31 UTC
Quote
I wouldn't want to use any permanent adhesive. I'm looking to apply solder. What solder do you know of that I can use for this and that can be taken off and put back on?

Nothing is permanent. That Arctic adhesive won't hold against a hammer or a heat gun aimed directly at the heatsink, and the same applies to any solder you might use.


What matters in your kernel log is only these 4 lines:

Code:
1970-01-01 00:01:52 temperature.c:744:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 14, reg = 0

1970-01-01 00:01:54 temperature.c:744:get_temp_info: read temp sensor failed: chain = 0, sensor = 1, chip = 10, reg = 1

1970-01-01 00:01:54 temperature.c:744:get_temp_info: read temp sensor failed: chain = 0, sensor = 2, chip = 54, reg = 0

1970-01-01 00:01:55 temperature.c:744:get_temp_info: read temp sensor failed: chain = 0, sensor = 3, chip = 50, reg = 0
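
To pull these lines out of a longer kernel log, a minimal Python sketch could look like the one below. The regular expression is inferred only from the four lines above, and the name `find_failed_sensors` is made up for illustration; other firmware versions may word the message differently.

```python
import re

# Pattern inferred from the sample lines in this post; an assumption,
# not an official log format.
FAILED_READ = re.compile(
    r"read temp sensor failed: "
    r"chain = (\d+), sensor = (\d+), chip = (\d+), reg = (\d+)"
)

def find_failed_sensors(log_text):
    """Return one (chain, sensor, chip, reg) tuple per failed sensor read."""
    return [
        tuple(int(group) for group in match.groups())
        for match in FAILED_READ.finditer(log_text)
    ]

sample = """\
1970-01-01 00:01:52 temperature.c:744:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 14, reg = 0
1970-01-01 00:01:54 temperature.c:744:get_temp_info: read temp sensor failed: chain = 0, sensor = 1, chip = 10, reg = 1
"""

for chain, sensor, chip, reg in find_failed_sensors(sample):
    print(f"chain {chain}: sensor {sensor} (near chip {chip}) could not be read")
```

Collecting the chain numbers from the result shows at a glance whether one board or all of them are affected.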


Quote
I have this exact same error on over 10 boards. Also, the error states chips 50, 54, 10 and 14 as being bad chips, but I get this exact same error on many boards. They can't all have the exact same chips with the exact same error. What could this be?

The kernel log can be a bit confusing. It isn't saying that those 4 chips are bad; it's only trying to tell you that the temp sensors next to those 4 chips are bad. Each board has temp sensors located near the chips mentioned (10, 14, 50 and 54), something like this:




But even that isn't accurate, because it's unlikely that 4 temp sensors would die at once, so the actual cause must be one of two things:

1- If all temp sensors across all 3 hash boards (12 temp sensors in total) show "failed", then the problem is the PSU.
2- If only one hash board's temp sensors show "failed", then one or more heatsinks/chips aren't making 100% contact and need replacement; more often than not, the first chip (chip 0) is the bad one.

Note that the PSU theory still stands even if only 1 hash board is having a hard time reading its temp sensors. It's hard to explain, but take it as is.
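
The rule of thumb above can be written down as a rough triage function. This is only a sketch: `triage` is a hypothetical helper, the default of 3 hash boards comes from the numbers in this post, and the input is assumed to be the set of chain numbers that reported failed temp sensor reads. The case where some but not all boards fail is treated like the single-board case, which is my assumption.

```python
def triage(failing_chains, total_boards=3):
    """Map the chains reporting failed temp sensor reads to a likely cause.

    Hypothetical helper: it only encodes the rule of thumb from this post.
    """
    chains = sorted(set(failing_chains))
    if not chains:
        return "no failed temp sensor reads"
    if len(chains) >= total_boards:
        # Every hash board fails at once: the post points at the PSU.
        return "suspect the PSU"
    # Only some boards fail: poor heatsink/chip contact is the first suspect,
    # but the post notes that the PSU cannot be ruled out here either.
    boards = ", ".join(str(chain) for chain in chains)
    return (
        f"suspect heatsink/chip contact on chain {boards} "
        "(check chip 0 first); PSU still possible"
    )

print(triage({0, 1, 2}))  # all three boards report failures
print(triage({0}))        # only chain 0 reports failures
```

It won't replace swapping in a known-good PSU as a test, but it keeps the two cases apart when going through many logs.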