2 kerney666
Hi, my rig can't run longer than 1-2 hours on v0.4.5 and newer. One GPU hangs (this is from 0.5.1 after autoconfig):
[2019-06-10 21:22:55] GPU 0 [57C, fan 87%] cnr: 2.486kh/s, avg 2.477kh/s, pool 2.709kh/s a:46 r:0 hw:0
[2019-06-10 21:22:55] GPU 1 [58C, fan 85%] cnr: 2.490kh/s, avg 2.482kh/s, pool 1.261kh/s a:22 r:0 hw:0
[2019-06-10 21:22:55] GPU 2 [56C, fan 86%] cnr: 2.486kh/s, avg 2.478kh/s, pool 2.969kh/s a:51 r:0 hw:1
[2019-06-10 21:22:55] GPU 3 [63C, fan 87%] cnr: 2.480kh/s, avg 2.468kh/s, pool 2.377kh/s a:42 r:0 hw:0
[2019-06-10 21:22:55] GPU 4 [45C, fan 90%] cnr: 2.483kh/s, avg 2.471kh/s, pool 2.737kh/s a:47 r:0 hw:3
[2019-06-10 21:22:55] Total cnr: 12.42kh/s, avg 12.38kh/s, pool 12.05kh/s a:208 r:0 hw:4
[2019-06-10 21:23:05] GPU 4: detected DEAD (11:00.0), will execute restart script watchdog.sh
But v0.4.4 with exactly the same config can run for weeks:
[2019-06-10 19:40:27] Stats Uptime: 13 days, 12:58:15
[2019-06-10 19:40:27] GPU 0 [59C, fan 87%] cnr: 2.471kh/s, avg 2.470kh/s, pool 2.340kh/s a:7780 r:0 hw:17
[2019-06-10 19:40:27] GPU 1 [60C, fan 85%] cnr: 2.474kh/s, avg 2.473kh/s, pool 2.423kh/s a:8057 r:0 hw:39
[2019-06-10 19:40:27] GPU 2 [57C, fan 86%] cnr: 2.471kh/s, avg 2.471kh/s, pool 2.337kh/s a:7768 r:0 hw:106
[2019-06-10 19:40:27] GPU 3 [63C, fan 87%] cnr: 2.464kh/s, avg 2.464kh/s, pool 2.381kh/s a:7917 r:0 hw:73
[2019-06-10 19:40:27] GPU 4 [55C, fan 88%] cnr: 2.470kh/s, avg 2.468kh/s, pool 2.259kh/s a:7519 r:0 hw:321
[2019-06-10 19:40:27] Total cnr: 12.35kh/s, avg 12.35kh/s, pool 11.74kh/s a:39041 r:0 hw:556
[2019-06-10 19:40:39] Pool pool.supportxmr.com received new job. (job_id: +kOSIEF95a5dlkxX6slHR0EW+l34)
I know I'm pushing hard against the limit, but what changed in TR miner 0.4.5 that causes this instability? With 0.4.4 and older, this rig was rock stable. The OS is Linux with amd18.3 drivers.
Thank you for your answer,
Migo
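To compare the two runs above more directly, the per-GPU stats lines can be parsed and the hardware-error count set against the accepted shares. This is only a minimal Python sketch: the field meanings (a = accepted, r = rejected, hw = hardware errors) are assumed from the log format shown here, and the names `parse_gpu_stats` and `hw_per_accepted` are made up for illustration, not part of the miner.

```python
import re

# Matches per-GPU stats lines as they appear in the logs quoted above.
# Assumption: a = accepted shares, r = rejected shares, hw = hardware errors.
GPU_LINE = re.compile(
    r"GPU (?P<gpu>\d+) \[(?P<temp>\d+)C, fan (?P<fan>\d+)%\].*?"
    r"a:(?P<accepted>\d+) r:(?P<rejected>\d+) hw:(?P<hw>\d+)"
)


def parse_gpu_stats(log_text):
    """Return one dict of integer fields per per-GPU stats line."""
    stats = []
    for line in log_text.splitlines():
        match = GPU_LINE.search(line)
        if match:
            stats.append({key: int(value) for key, value in match.groupdict().items()})
    return stats


def hw_per_accepted(entry):
    """Hardware errors per accepted share (0.0 if nothing was accepted yet)."""
    if entry["accepted"] == 0:
        return 0.0
    return entry["hw"] / entry["accepted"]
```

Feeding it both log excerpts and comparing `hw_per_accepted` per GPU would show whether GPU 4 already stood out on 0.4.4, which might hint at a card that is close to its limit regardless of miner version.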
Hi!
Man, it's such a hard question to answer. The changes between 0.4.4 and 0.4.5 are really tiny, and nothing that "should" affect anything in terms of stability. For cn/r, absolutely nothing of interest was touched in the kernels, and nothing specific in the host-side code either. For every release, we get a few people telling us how stable things are with the new version, and then a few others who (like you) unfortunately have a harder time keeping things running smoothly.
Since you're running Linux, do you see anything interesting in your "dmesg" output from the kernel when a crash occurs?
-- K
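For anyone wanting to narrow down the dmesg output after a crash, a small filter can help. The sketch below is only an illustration: it assumes the interesting kernel messages mention the `amdgpu` driver, and the function names `filter_kernel_log` and `read_dmesg` are hypothetical helpers, not anything shipped with the miner. Reading dmesg may need root on some systems.

```python
import subprocess


def filter_kernel_log(text, keywords=("amdgpu",)):
    """Return the lines of `text` that contain any keyword (case-insensitive).

    The default keyword is an assumption; extend it with whatever
    looks relevant on your system.
    """
    lowered = [keyword.lower() for keyword in keywords]
    return [
        line
        for line in text.splitlines()
        if any(keyword in line.lower() for keyword in lowered)
    ]


def read_dmesg():
    """Capture the current kernel log by running the `dmesg` command."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout
```

Running `filter_kernel_log(read_dmesg())` right after the watchdog reports a dead GPU, and posting the result, would be one way to share just the relevant lines.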
Hi, thank you for your answer! I'm sorry, I haven't looked at dmesg yet. I'll stop 0.4.4 and run 0.5.1 again so I can check dmesg. Can I provide any more info after a crash?
0.4.4 has run nicely since the last 0.5.1 experiment yesterday:
[2019-06-11 18:16:28] Stats Uptime: 0 days, 20:38:06
[2019-06-11 18:16:28] GPU 0 [59C, fan 87%] cnr: 2.470kh/s, avg 2.470kh/s, pool 2.443kh/s a:512 r:0 hw:2
[2019-06-11 18:16:28] GPU 1 [60C, fan 84%] cnr: 2.470kh/s, avg 2.475kh/s, pool 2.537kh/s a:532 r:0 hw:0
[2019-06-11 18:16:28] GPU 2 [56C, fan 85%] cnr: 2.468kh/s, avg 2.471kh/s, pool 2.303kh/s a:482 r:0 hw:8
[2019-06-11 18:16:28] GPU 3 [63C, fan 87%] cnr: 2.461kh/s, avg 2.464kh/s, pool 2.401kh/s a:503 r:0 hw:7
[2019-06-11 18:16:28] GPU 4 [55C, fan 88%] cnr: 2.464kh/s, avg 2.468kh/s, pool 2.314kh/s a:485 r:0 hw:26
[2019-06-11 18:16:28] Total cnr: 12.33kh/s, avg 12.35kh/s, pool 12.00kh/s a:2514 r:0 hw:43
[2019-06-11 18:16:30] Pool pool.supportxmr.com received new job. (job_id: BI3HJirVchNMe6LpNGRZuX5bez1a)
Now I'm on 0.5.1 for debugging:
Team Red Miner version 0.5.1
[2019-06-11 18:18:19] Auto-detected AMD OpenCL platform 0
[2019-06-11 18:18:20] Initializing GPU 0.
[2019-06-11 18:18:21] Initializing GPU 1.
[2019-06-11 18:18:22] Initializing GPU 2.
[2019-06-11 18:18:23] Initializing GPU 3.
[2019-06-11 18:18:24] Initializing GPU 4.
[2019-06-11 18:18:25] Watchdog thread starting.
[2019-06-11 18:18:25] Runtime Command Keys: h - help, s - stats, e - enable gpu, d - disable gpu, t - tuning mode, q - quit
[2019-06-11 18:18:25] API initialized on 127.0.0.1:4028
[2019-06-11 18:18:25] Successfully initialized GPU 0: Vega with 64 CU (PCIe 03:00.0) (CN 16*14:CAA)
[2019-06-11 18:18:25] Successfully initialized GPU 1: Vega with 64 CU (PCIe 08:00.0) (CN 16*14:CAA)
[2019-06-11 18:18:25] Successfully initialized GPU 2: Vega with 64 CU (PCIe 0b:00.0) (CN 16*14:CAA)
[2019-06-11 18:18:25] Successfully initialized GPU 3: Vega with 64 CU (PCIe 0e:00.0) (CN 16*14:CAA)
[2019-06-11 18:18:25] Successfully initialized GPU 4: Vega with 64 CU (PCIe 11:00.0) (CN 16*14:CAA)
Thank you,
Migo