[Bug 1197422] New: [riscv64][Unmatched] Crash during boot
https://bugzilla.suse.com/show_bug.cgi?id=1197422

Bug ID: 1197422
Summary: [riscv64][Unmatched] Crash during boot
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: RISC-V
OS: openSUSE Tumbleweed
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-bugs@opensuse.org
Reporter: hare@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Booting openSUSE-Tumbleweed-RISC-V-KDE-hifiveunmatched.riscv64-2022.03.08-Build2.3 on HiFive Unmatched crashes the system:

Loading Linux 5.16.14-1-default ...
Loading initial ramdisk ...
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
EFI stub: Exiting boot services...
[ 6.285566][ T55] Unable to handle kernel paging request at virtual address 00000000200b2f00
[ 6.293429][ T55] Oops [#1]
[ 6.296377][ T55] Modules linked in:
[ 6.300114][ T55] CPU: 3 PID: 55 Comm: kworker/u8:1 Not tainted 5.16.14-1-default #1 openSUSE Tumbleweed f3df1d20fd39837388447c5d684dd4708bd93651
[ 6.313315][ T55] Hardware name: SiFive HiFive Unmatched A00 (DT)
[ 6.319570][ T55] Workqueue: efi_rts_wq efi_call_rts
[ 6.324693][ T55] epc : 0x2003332c
[ 6.328254][ T55] ra : 0x2003349e
[ 6.331815][ T55] epc : 000000002003332c ra : 000000002003349e sp : ffffffd0041fbd00
[ 6.339720][ T55] gp : ffffffff81771228 tp : ffffffe080185400 t0 : 0000000000000040
[ 6.347623][ T55] t1 : 0000000000000003 t2 : ffffffff80ce5b38 s0 : ffffffd00400bd80
[ 6.355527][ T55] s1 : 00000000200b3328 a0 : ffffffd00400bda8 a1 : ffffffff80ed49f0
[ 6.363431][ T55] a2 : 0000000000000000 a3 : ffffffd00400bd80 a4 : ffffffd00400bd78
[ 6.371335][ T55] a5 : 0000000000000000 a6 : ffffffff80ed49f0 a7 : 0000000000000000
[ 6.379239][ T55] s2 : 0000000000000000 s3 : 0000000000000000 s4 : 0000000000000000
[ 6.387144][ T55] s5 : ffffffd00400bd80 s6 : ffffffd00400bd78 s7 : 0000000200000022
[ 6.395048][ T55] s8 : ffffffe0800d2d34 s9 : ffffffff80a13600 s10: ffffffff80a13600
[ 6.402952][ T55] s11: ffffffe0800d2cc0 t3 : 0000000000000116 t4 : 0000000000000001
[ 6.410855][ T55] t5 : ffffffff81619118 t6 : ffffffff81619120
[ 6.416849][ T55] status: 0000000200000120 badaddr: 00000000200b2f00 cause: 000000000000000d
[ 6.425485][ T55] ---[ end trace 10f1eb5c3fa3d30c ]---

--
You are receiving this mail because:
You are on the CC list for the bug.
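
For readers unfamiliar with the last line of the oops, the `cause` field can be decoded with a few lines of Python. This is only a minimal sketch: the exception-code table is the memory-fault subset from the RISC-V privileged specification, and the names `SCAUSE_EXCEPTIONS` and `decode_status_line` are made up for this illustration, not taken from the kernel.

```python
import re

# Memory-fault subset of the RISC-V scause exception codes
# (valid when the interrupt bit, bit 63, is clear).
SCAUSE_EXCEPTIONS = {
    1: "instruction access fault",
    5: "load access fault",
    7: "store/AMO access fault",
    12: "instruction page fault",
    13: "load page fault",
    15: "store/AMO page fault",
}

# Matches the "status: ... badaddr: ... cause: ..." line of the oops.
STATUS_LINE = re.compile(
    r"status:\s*(?P<status>[0-9a-f]+)\s+"
    r"badaddr:\s*(?P<badaddr>[0-9a-f]+)\s+"
    r"cause:\s*(?P<cause>[0-9a-f]+)"
)


def decode_status_line(line):
    """Return the fault address and a readable cause for one oops line."""
    match = STATUS_LINE.search(line)
    if match is None:
        raise ValueError("no status/badaddr/cause fields found")
    cause = int(match.group("cause"), 16)
    is_interrupt = bool(cause >> 63)       # top bit distinguishes interrupts
    code = cause & ((1 << 63) - 1)         # remaining bits are the code
    if is_interrupt:
        description = f"interrupt {code}"
    else:
        description = SCAUSE_EXCEPTIONS.get(code, f"exception code {code}")
    return {
        "badaddr": int(match.group("badaddr"), 16),
        "cause": code,
        "description": description,
    }


if __name__ == "__main__":
    line = ("[ 6.416849][ T55] status: 0000000200000120 "
            "badaddr: 00000000200b2f00 cause: 000000000000000d")
    print(decode_status_line(line))
```

Applied to the log above, cause 0xd is exception code 13, a load page fault, at the address reported in `badaddr`.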

https://bugzilla.suse.com/show_bug.cgi?id=1197422

Hannes Reinecke <hare@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |afaerber@suse.com, hare@suse.com, schwab@suse.com

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c1

--- Comment #1 from Hannes Reinecke <hare@suse.com> ---
Note that I didn't have a battery installed, and the documentation says that the real-time clock won't work without a battery. Seeing that it crashed in efi_rts_wq, might it be that we crash due to a missing battery/RTC?

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c2

--- Comment #2 from Hannes Reinecke <hare@suse.com> ---
After removing the usual systemd intercepts, the full log is:

[ 2.354692][ T1] mousedev: PS/2 mouse device common for all mice
[ 2.360556][ T1] poweroff-gpio gpio-poweroff: gpio_poweroff_probe: pm_power_off function already registered
[ 2.370263][ T1] poweroff-gpio: probe of gpio-poweroff failed with error -16
[ 2.378116][ T1] EDAC DEVICE0: Giving out device to module Sifive ECC Manager controller sifive_edac.0: DEV sifive_edac.0 (INTERRUPT)
[ 2.390116][ T1] ledtrig-cpu: registered to indicate activity on CPUs
[ 2.396759][ T1] hid: raw HID events driver (C) Jiri Kosina
[ 2.403215][ T1] NET: Registered PF_INET6 protocol family
[ 6.185933][ T7] Freeing initrd memory: 131428K
[ 6.260679][ T1] Segment Routing with IPv6
[ 6.264276][ T1] RPL Segment Routing with IPv6
[ 6.269046][ T1] In-situ OAM (IOAM) with IPv6
[ 6.274485][ T1] registered taskstats version 1
[ 6.278568][ T1] Loading compiled-in X.509 certificates
[ 6.284118][ T1] Loaded X.509 cert 'openSUSE Secure Boot Signkey: 9ddf43d9f1a027273f52c6c0775908ee01671325'
[ 6.294760][ T1] zswap: loaded using pool lzo/zbud
[ 6.299945][ T1] Key type ._fscrypt registered
[ 6.303871][ T1] Key type .fscrypt registered
[ 6.308503][ T1] Key type fscrypt-provisioning registered
[ 6.342409][ T1] Key type encrypted registered
[ 6.346398][ T1] AppArmor: AppArmor sha1 policy hashing enabled
[ 6.352555][ T55] Unable to handle kernel paging request at virtual address 00000000200b2f00
[ 6.361138][ T55] Oops [#1]
[ 6.364065][ T55] Modules linked in:
[ 6.367801][ T55] CPU: 1 PID: 55 Comm: kworker/u8:1 Not tainted 5.16.14-1-default #1 openSUSE Tumbleweed f3df1d20fd39837388447c5d684dd4708bd93651
[ 6.381002][ T55] Hardware name: SiFive HiFive Unmatched A00 (DT)
[ 6.387257][ T55] Workqueue: efi_rts_wq efi_call_rts
[ 6.392380][ T55] epc : 0x2003332c
[ 6.395942][ T55] ra : 0x2003349e
[ 6.399503][ T55] epc : 000000002003332c ra : 000000002003349e sp : ffffffd0041fbd00
[ 6.407407][ T55] gp : ffffffff81771228 tp : ffffffe080668000 t0 : 000000000116b777
[ 6.415310][ T55] t1 : 0000000000000001 t2 : ffffffff80c086a8 s0 : ffffffd00400bd80
[ 6.423214][ T55] s1 : 00000000200b3328 a0 : ffffffd00400bda8 a1 : ffffffff80ed49f0
[ 6.431118][ T55] a2 : 0000000000000000 a3 : ffffffd00400bd80 a4 : ffffffd00400bd78
[ 6.439022][ T55] a5 : 0000000000000000 a6 : ffffffff80ed49f0 a7 : 0000000000000000
[ 6.446926][ T55] s2 : 0000000000000000 s3 : 0000000000000000 s4 : 0000000000000000
[ 6.454830][ T55] s5 : ffffffd00400bd80 s6 : ffffffd00400bd78 s7 : 0000000200000022
[ 6.462735][ T55] s8 : ffffffe0801aaa34 s9 : ffffffff80a13600 s10: ffffffff80a13600
[ 6.470639][ T55] s11: ffffffe0801aa9c0 t3 : 0000000000000006 t4 : 0000000000000001
[ 6.478543][ T55] t5 : 00000000e0ccdeeb t6 : 010e5092454757b1
[ 6.484536][ T55] status: 0000000200000120 badaddr: 00000000200b2f00 cause: 000000000000000d
[ 6.493169][ T55] ---[ end trace b57ffcbe59b03167 ]---
...

Not that it makes much of a difference. My suspicion here is that we're failing to parse the EFI variables provided by the pre-installed uboot.
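
One detail in the register dump supports the suspicion that the fault happens inside firmware code rather than in the kernel proper: `epc`, `ra` and `badaddr` are all low addresses, while `sp` is a kernel address. The sketch below makes that check explicit. It assumes the kernel runs with Sv39 paging (whose upper half begins at 0xffffffc000000000, consistent with the kernel pointers in the dump); `SV39_KERNEL_BASE` and `classify_registers` are names invented for this example.

```python
import re

# Assumption: Sv39 paging, where the kernel (upper) half of the
# address space starts at this sign-extended address.
SV39_KERNEL_BASE = 0xFFFFFFC000000000

# Only the registers of interest, in their 16-hex-digit form.
REGISTER = re.compile(r"\b(epc|ra|sp|badaddr)\s*:\s*([0-9a-f]{16})\b")


def classify_registers(lines):
    """Map register name -> (value, 'kernel half' or 'low address')."""
    result = {}
    for line in lines:
        for name, value in REGISTER.findall(line):
            address = int(value, 16)
            if address >= SV39_KERNEL_BASE:
                result[name] = (address, "kernel half")
            else:
                result[name] = (address, "low address")
    return result


if __name__ == "__main__":
    oops = [
        "[ 6.399503][ T55] epc : 000000002003332c ra : 000000002003349e "
        "sp : ffffffd0041fbd00",
        "[ 6.484536][ T55] status: 0000000200000120 "
        "badaddr: 00000000200b2f00 cause: 000000000000000d",
    ]
    for name, (address, kind) in classify_registers(oops).items():
        print(f"{name:8s} {address:#018x}  {kind}")
```

A program counter outside the kernel half, reached from the efi_rts_wq workqueue, is what one would expect if the crash occurs while executing an EFI runtime service, although the log alone does not prove it.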

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c3

--- Comment #3 from Hannes Reinecke <hare@suse.com> ---
Indeed; using 'efi=noruntime' makes the crash go away, but then I don't have an NVMe drive, either, so it's only of limited help as the system is installed on NVMe ...
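
To confirm that a given boot really ran with the workaround, the kernel command line can be inspected. The following is a minimal sketch, assuming the usual Linux convention that `efi=` takes a comma-separated option list; the helper names and the sample command line in the comment are hypothetical.

```python
def efi_options(cmdline):
    """Collect all options passed through efi=... on a kernel command line."""
    options = set()
    for token in cmdline.split():
        if token.startswith("efi="):
            # efi= accepts a comma-separated list, e.g. efi=noruntime,debug
            options.update(o for o in token[len("efi="):].split(",") if o)
    return options


def runtime_services_disabled(cmdline):
    """True if the command line asks the kernel to skip EFI runtime services."""
    return "noruntime" in efi_options(cmdline)


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        return handle.read().strip()


if __name__ == "__main__":
    # Hypothetical example line; on a real system use read_cmdline().
    sample = "root=/dev/mmcblk0p3 console=ttySIF0 efi=noruntime"
    print(runtime_services_disabled(sample))
```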

https://bugzilla.suse.com/show_bug.cgi?id=1197422

Andreas Färber <afaerber@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |chester.lin@suse.com, ddavis@suse.com, mbrugger@suse.com

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c4

--- Comment #4 from Andreas Schwab <schwab@suse.de> ---
Do you see the same crash with the image from <https://download.opensuse.org/repositories/home:/Andreas_Schwab:/riscv:/unmatched/images/>?

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c5

--- Comment #5 from Andreas Schwab <schwab@suse.de> ---
You should not use the preinstalled u-boot, but the one from the image.

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c6

--- Comment #6 from Hannes Reinecke <hare@suse.com> ---
(In reply to Andreas Schwab from comment #4)
> Do you see the same crash with the image from
> <https://download.opensuse.org/repositories/home:/Andreas_Schwab:/riscv:/unmatched/images/>?

Yes.

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c7

--- Comment #7 from Hannes Reinecke <hare@suse.com> ---
(In reply to Andreas Schwab from comment #5)
> You should not use the preinstalled u-boot, but the one from the image.

Did so; copied the JeOS image (2022.03.18-Build26.5) to an SD card. U-Boot was not detected and the machine didn't even start to boot (ie no output on the serial console, LEDs not changing, etc.)
Bit of a loss on what to do next.

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c8

--- Comment #8 from Andreas Schwab <schwab@suse.de> ---
What does "U-Boot was not detected" mean?

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c9

--- Comment #9 from Chester Lin <chester.lin@suse.com> ---
(In reply to Hannes Reinecke from comment #7)
> (In reply to Andreas Schwab from comment #5)
> > You should not use the preinstalled u-boot, but the one from the image.
>
> Did so; copied the JeOS image (2022.03.18-Build26.5) to an SD card. U-Boot
> was not detected and the machine didn't even start to boot (ie no output on
> the serial console, LEDs not changing, etc.)
> Bit of a loss on what to do next.

The JeOS image from Andreas' subproject works for me. I'm not sure whether you are hitting the same HW glitch, but the SD card slot of my board has a broken spring, so the card always gets ejected [I temporarily work around this with electrical tape]. I don't know whether it is a common issue, but there is a report here:

https://forums.sifive.com/t/sd-card-slot-broken/4754

Based on your description [LED not changing], your board does not seem to reach the U-Boot SPL.

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c10

--- Comment #10 from Hannes Reinecke <hare@suse.com> ---
Not working means: zero output in the serial console. And the SD-card holder itself seems to work, as the provided SD card with the (SiFive) factory image works. It's just using a different SD card with Andreas' Tumbleweed image which gives no output.

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c11

--- Comment #11 from Andreas Schwab <schwab@suse.de> ---
Are you sure you are booting from SD card?

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c12

--- Comment #12 from Hannes Reinecke <hare@suse.com> ---
Positive. The SD card is the only device in the system, and exchanging the failing SD card with the manufacturer-provided SD card makes the system boot.

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c13

--- Comment #13 from Andreas Schwab <schwab@suse.de> ---
That doesn't prove that you are booting from it.

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c14

--- Comment #14 from Hannes Reinecke <hare@suse.com> ---
True; it just proves that I can't boot. But in the absence of any other boot device _and_ given the fact that swapping the SD card for the manufacturer-provided SD card in the same slot makes the system boot, I would think that the system would have _tried_ to boot from it. Obviously that didn't work.

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c15

Hannes Reinecke <hare@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED

--- Comment #15 from Hannes Reinecke <hare@suse.com> ---
But in either case: I've bought brand-new SD cards of the same make as the manufacturer-provided one (SanDisk Ultra 32GB), wrote the most current JeOS image to one of them, and, voilà, the system booted. So either the old card was dodgy to begin with (always a possibility, given that it's one I found in one of my drawers), or the Unmatched firmware is very picky about which SD cards it accepts (and the documentation hints at something like that). Anyway. System boots now.
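
A dodgy card of the kind suspected here can sometimes be caught before it goes into the board by reading the image back and comparing checksums. The sketch below is one way to do that, not something used in this thread: it hashes the uncompressed image and the same number of bytes from the card. The function names are made up for the example, and the device path passed in is whatever the card shows up as on the writing host.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # read in 1 MiB blocks


def sha256_prefix(path, length):
    """SHA-256 of the first `length` bytes of a file or block device."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as handle:
        while remaining > 0:
            block = handle.read(min(CHUNK, remaining))
            if not block:
                break  # hit end of file early
            digest.update(block)
            remaining -= len(block)
    if remaining:
        raise ValueError(f"{path} is shorter than {length} bytes")
    return digest.hexdigest()


def image_matches_device(image_path, device_path):
    """Compare the image with the equally long prefix of the device."""
    size = os.path.getsize(image_path)
    return sha256_prefix(image_path, size) == sha256_prefix(device_path, size)
```

If the download is compressed, it has to be decompressed first so that the same bytes are compared. A readback done immediately after writing may be served from the page cache, so re-inserting the card before the check gives a more meaningful result.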

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c16

--- Comment #16 from Andreas Schwab <schwab@suse.de> ---
The board can also boot from flash, which makes it possible to remove the SD card from the boot process.

https://bugzilla.suse.com/show_bug.cgi?id=1197422#c17

Andreas Schwab <schwab@suse.de> changed:

What |Removed |Added
----------------------------------------------------------------------------
Resolution|FIXED |INVALID

--- Comment #17 from Andreas Schwab <schwab@suse.de> ---
Not a bug.