Board Bitcoin Technical Support
Re: Crash without error during IBD
by
kemist
on 13/07/2022, 18:05:27 UTC
Code:
$ sudo dmesg --level=warn
[sudo] password:
[    0.157708] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
[    0.157782]  #2
[    0.161743]  #3
[    0.166408] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[    0.243206] pnp 00:00: disabling [mem 0x000c0000-0x000c3fff] because it overlaps 0000:00:02.0 BAR 6 [mem 0x000c0000-0x000dffff]
[    0.243212] pnp 00:00: disabling [mem 0x000c4000-0x000c7fff] because it overlaps 0000:00:02.0 BAR 6 [mem 0x000c0000-0x000dffff]
[    0.243214] pnp 00:00: disabling [mem 0x000c8000-0x000cbfff] because it overlaps 0000:00:02.0 BAR 6 [mem 0x000c0000-0x000dffff]
[    0.243217] pnp 00:00: disabling [mem 0x000cc000-0x000cffff] because it overlaps 0000:00:02.0 BAR 6 [mem 0x000c0000-0x000dffff]
[    0.243219] pnp 00:00: disabling [mem 0x000d0000-0x000d3fff] because it overlaps 0000:00:02.0 BAR 6 [mem 0x000c0000-0x000dffff]
[    0.243221] pnp 00:00: disabling [mem 0x000d4000-0x000d7fff] because it overlaps 0000:00:02.0 BAR 6 [mem 0x000c0000-0x000dffff]
[    0.243223] pnp 00:00: disabling [mem 0x000d8000-0x000dbfff] because it overlaps 0000:00:02.0 BAR 6 [mem 0x000c0000-0x000dffff]
[    0.243225] pnp 00:00: disabling [mem 0x000dc000-0x000dffff] because it overlaps 0000:00:02.0 BAR 6 [mem 0x000c0000-0x000dffff]
[    0.384632] device-mapper: core: CONFIG_IMA_DISABLE_HTABLE is disabled. Duplicate IMA measurements will not be recorded in the IMA log.
[    0.384775] platform eisa.0: EISA: Cannot allocate resource for mainboard
[    0.384776] platform eisa.0: Cannot allocate resource for EISA slot 1
[    0.384778] platform eisa.0: Cannot allocate resource for EISA slot 2
[    0.384779] platform eisa.0: Cannot allocate resource for EISA slot 3
[    0.384781] platform eisa.0: Cannot allocate resource for EISA slot 4
[    0.384782] platform eisa.0: Cannot allocate resource for EISA slot 5
[    0.384784] platform eisa.0: Cannot allocate resource for EISA slot 6
[    0.384785] platform eisa.0: Cannot allocate resource for EISA slot 7
[    0.384787] platform eisa.0: Cannot allocate resource for EISA slot 8
[    1.257330] acpi PNP0C14:01: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[    1.259352] ACPI Warning: SystemIO range 0x0000000000000428-0x000000000000042F conflicts with OpRegion 0x0000000000000400-0x000000000000047F (\_SB.PCI0.LPC.PMIO) (20210730/utaddress-204)
[    1.259371] ACPI Warning: SystemIO range 0x0000000000000540-0x000000000000054F conflicts with OpRegion 0x0000000000000500-0x000000000000057F (\_SB.PCI0.LPC.LPIO) (20210730/utaddress-204)
[    1.259381] ACPI Warning: SystemIO range 0x0000000000000530-0x000000000000053F conflicts with OpRegion 0x0000000000000500-0x000000000000057F (\_SB.PCI0.LPC.LPIO) (20210730/utaddress-204)
[    1.259389] ACPI Warning: SystemIO range 0x0000000000000500-0x000000000000052F conflicts with OpRegion 0x0000000000000500-0x000000000000057F (\_SB.PCI0.LPC.LPIO) (20210730/utaddress-204)
[    1.259396] lpc_ich: Resource conflict(s) found affecting gpio_ich
[    1.659456] ata1.00: ATA Identify Device Log not supported
[    1.662039] ata1.00: ATA Identify Device Log not supported
[   18.959770] iwlwifi 0000:03:00.0: can't disable ASPM; OS doesn't have ASPM control
[   19.018011] at24 0-0050: supply vcc not found, using dummy regulator
[   46.572446] kauditd_printk_skb: 28 callbacks suppressed
[241385.357572] unchecked MSR access error: WRMSR to 0x48 (tried to write 0x0000000000000000) at rIP: 0xffffffffb9090b34 (native_write_msr+0x4/0x20)
[241385.357587] Call Trace:
[241385.357589]  <TASK>
[241385.357590]  ? __restore_processor_state.constprop.0+0x179/0x200
[241385.357598]  restore_processor_state+0x9/0x10
[241385.357602]  x86_acpi_suspend_lowlevel+0x133/0x1a0
[241385.357605]  acpi_suspend_enter+0x56/0x1c0
[241385.357609]  suspend_enter+0x28f/0x340
[241385.357613]  suspend_devices_and_enter+0x12b/0x240
[241385.357617]  enter_state+0x1d2/0x430
[241385.357620]  pm_suspend+0x4e/0xc0
[241385.357623]  state_store+0x81/0xe0
[241385.357628]  kobj_attr_store+0x12/0x20
[241385.357631]  sysfs_kf_write+0x3e/0x50
[241385.357635]  kernfs_fop_write_iter+0x137/0x1c0
[241385.357638]  new_sync_write+0x117/0x1a0
[241385.357644]  vfs_write+0x1cd/0x260
[241385.357647]  ksys_write+0x67/0xe0
[241385.357649]  __x64_sys_write+0x19/0x20
[241385.357651]  do_syscall_64+0x5c/0xc0
[241385.357654]  ? exit_to_user_mode_prepare+0x37/0xb0
[241385.357658]  ? syscall_exit_to_user_mode+0x27/0x50
[241385.357661]  ? __x64_sys_sendmsg+0x1d/0x20
[241385.357665]  ? do_syscall_64+0x69/0xc0
[241385.357667]  ? do_user_addr_fault+0x1e3/0x670
[241385.357670]  ? exit_to_user_mode_prepare+0x37/0xb0
[241385.357673]  ? syscall_exit_to_user_mode+0x27/0x50
[241385.357675]  ? __do_sys_gettid+0x1b/0x20
[241385.357678]  ? do_syscall_64+0x69/0xc0
[241385.357680]  ? asm_exc_page_fault+0x8/0x30
[241385.357684]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[241385.357688] RIP: 0033:0x7f2d9e2e0a37
[241385.357692] Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[241385.357694] RSP: 002b:00007fff8e6b6c28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[241385.357697] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f2d9e2e0a37
[241385.357699] RDX: 0000000000000004 RSI: 00007fff8e6b6ce0 RDI: 0000000000000004
[241385.357700] RBP: 00007fff8e6b6ce0 R08: 0000000000000004 R09: 000000007fffffff
[241385.357701] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[241385.357703] R13: 000055b1b723f2d0 R14: 00007f2d9e3e1a00 R15: 0000000000000004
[241385.357705]  </TASK>
[241386.469659] done.
[241387.735718] ata1.00: ATA Identify Device Log not supported
[241387.738884] ata1.00: ATA Identify Device Log not supported

Code:
$ sudo dmesg --level=err,crit,alert,emerg
[    0.148658] x86/cpu: VMX (outside TXT) disabled by BIOS
[    0.877661] ima: Error Communicating to TPM chip
[    0.885649] ima: Error Communicating to TPM chip
[    0.889733] ima: Error Communicating to TPM chip
[    0.897746] ima: Error Communicating to TPM chip
[    0.905752] ima: Error Communicating to TPM chip
[    0.913732] ima: Error Communicating to TPM chip
[    0.921730] ima: Error Communicating to TPM chip
[    0.929731] ima: Error Communicating to TPM chip
[   12.923495] mtd device must be supplied (device name is empty)
[   23.295269] mtd device must be supplied (device name is empty)
[   27.902828] mtd device must be supplied (device name is empty)
[241386.467654] ACPI: \_SB_.PCI0.LPC_.EC__.BAT1: Unable to dock!
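
Side note for anyone reading the output above: many of those lines are the same message repeated with a new timestamp. A rough sketch for collapsing them, assuming the default `[seconds.micros]` dmesg timestamp format and a sed that supports `-E` (`summarize_dmesg` is a made-up name, not an existing tool):

```shell
# Hypothetical helper: strips the leading "[  12.345678] " timestamp from
# each kernel log line read on stdin, then counts identical messages and
# lists the most frequent ones first.
summarize_dmesg() {
  sed -E 's/^\[ *[0-9]+\.[0-9]+\] //' | sort | uniq -c | sort -rn
}

# Usage (not run here):
#   sudo dmesg --level=err,crit,alert,emerg | summarize_dmesg
```

This only groups messages that are byte-for-byte identical after the timestamp, so lines that differ in an address or slot number still show up separately.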

Quote
While it might be a warning message that can be ignored, could you check the cropped part of the line "ata1.00: ATA Identify Devi"?

It's the same as what's in the dmesg (this is from yesterday's run, so different timestamp):

Code:
Jul 12 11:15:40 ThinkPad kernel: ata1.00: ATA Identify Device Log not supported
Jul 12 11:15:40 ThinkPad kernel: ata1.00: ATA Identify Device Log not supported


Here are the journalctl results.

That doesn't tell us much, but at least we know it didn't crash or get closed by the OOM killer. Could you check the output of these dmesg commands? I agree with the others that it might be a hardware failure.

Code:
sudo dmesg --level=warn
sudo dmesg --level=err,crit,alert,emerg
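
To make the two checks above (OOM killer vs. failing hardware) a bit more concrete, here is a minimal sketch of filters that could be run over the kernel log. The function names are hypothetical, and the patterns are only common examples of what the kernel prints in those situations, not an exhaustive list:

```shell
# Hypothetical helper: prints kernel log lines (read from stdin) that
# usually accompany an OOM kill, e.g. "Out of memory: Killed process ...".
oom_hits() {
  grep -Ei 'out of memory|oom-kill'
}

# Hypothetical helper: prints lines that often show up with a failing
# disk or cable. A clean result does not prove the hardware is healthy.
io_error_hits() {
  grep -Ei 'I/O error|failed command|exception Emask|medium error'
}

# Usage (not run here):
#   sudo dmesg | oom_hits
#   sudo dmesg | io_error_hits
```

Neither filter matches the "ATA Identify Device Log not supported" warning, which fits the suggestion in this thread that it may be a message that can be ignored.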

Note that I tried to re-run it last night with slightly different parameters:

Code:
$ bitcoind -prune=20000 -maxmempool=500 -reindex -daemon

FYI, the default -maxmempool is only 300 (i.e. 300 MB), and the mempool isn't used during sync/reindex.

Code:
Jul 12 11:15:40 ThinkPad kernel: ata1.00: ATA Identify Devi>
Jul 12 11:15:40 ThinkPad kernel: ata1.00: ATA Identify Devi>
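
The trailing ">" in the excerpt above is what journalctl shows when its pager cuts a line at the terminal width. A small sketch for getting the full lines, assuming a systemd journal and that the messages are from the current boot (`-k` only covers that); `kernel_lines_full` is a hypothetical name:

```shell
# Hypothetical helper: prints kernel messages from the current boot that
# contain the fixed string given as $1. --no-pager makes journalctl write
# whole lines instead of cropping them with ">".
kernel_lines_full() {
  journalctl -k --no-pager | grep -F -- "$1"
}

# Usage (not run here):
#   kernel_lines_full 'ata1.00'
```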