[Bug 1208173] New: Can't boot devel:RISCV:Factory:Contrib:StarFive:VisionFive2/JeOS-starfivevisionfive2, "sbi_trap_error: hart0: trap handler failed (error -2)"
http://bugzilla.opensuse.org/show_bug.cgi?id=1208173

            Bug ID: 1208173
           Summary: Can't boot devel:RISCV:Factory:Contrib:StarFive:VisionFive2/JeOS-starfivevisionfive2, "sbi_trap_error: hart0: trap handler failed (error -2)"
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: RISC-V
                OS: openSUSE Tumbleweed
            Status: NEW
          Severity: Minor
          Priority: P5 - None
         Component: Bootloader
          Assignee: screening-team-bugs@suse.de
          Reporter: aaronpuchert@alice-dsl.net
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

This is not technically Tumbleweed, so I open with lower severity.

This is on a VisionFive 2 V1.3B, with u-boot-spl.bin.normal.out and visionfive2_fw_payload.img from [1], and JeOS-starfivevisionfive2 on an SD card.

The initial boot produces the error mentioned in the Wiki [2]:

Loading Linux 6.2.0-rc7-12-default ...
Loading initial ramdisk ...
EFI stub: Booting Linux Kernel...
Unhandled exception: Store/AMO access fault
EPC: 00000000fff47a98 RA: 00000000fff8684a TVAL: 0000000040000000
EPC: 0000000040201a98 RA: 000000004024084a reloc adjusted
[...]
UEFI image [0x00000000fe460000:0x00000000fe716fff] '/efi\boot\bootriscv64.efi'
UEFI image [0x00000000cb23a000:0x00000000ccf42fff]

Ok, let's mark that region as reserved as the Wiki says:

StarFive # fdt addr ${fdtcontroladdr}; fdt rsvmem add 0x40000000 0x00001000
StarFive # boot
## Warning: defaulting to text format
## Error: "boot2" not defined
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
libfdt fdt_check_header(): FDT_ERR_BADMAGIC
Card did not respond to voltage select! : -110
** Unable to read file ubootefi.var **
Failed to load EFI variables
Found EFI removable media binary efi/boot/bootriscv64.efi
2846720 bytes read in 122 ms (22.3 MiB/s)
libfdt fdt_check_header(): FDT_ERR_BADMAGIC
Welcome to GRUB!
[...]
Loading Linux 6.2.0-rc7-12-default ...
Loading initial ramdisk ...
EFI stub: Booting Linux Kernel...
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services...
clk u5_dw_i2c_clk_core already disabled
clk u5_dw_i2c_clk_apb already disabled
sbi_trap_error: hart0: trap handler failed (error -2)
sbi_trap_error: hart0: mcause=0x0000000000000005 mtval=0x0000000040047060
sbi_trap_error: hart0: mepc=0x0000000040004cac mstatus=0x0000000200001800
sbi_trap_error: hart0: ra=0x0000000040009ee2 sp=0x0000000040046f10
sbi_trap_error: hart0: gp=0x0000000000000000 tp=0x0000000040047000
sbi_trap_error: hart0: s0=0x0000000040046f20 s1=0x0000000040047000
sbi_trap_error: hart0: a0=0x0000000040047060 a1=0x0000000000000002
sbi_trap_error: hart0: a2=0x0000000000000000 a3=0x0000000000000019
sbi_trap_error: hart0: a4=0x0000000000000001 a5=0x0000000040047060
sbi_trap_error: hart0: a6=0x00000000400470a8 a7=0x0000000000000004
sbi_trap_error: hart0: s2=0x00000000400241a8 s3=0x0000000000000000
sbi_trap_error: hart0: s4=0x0000000000000000 s5=0x0000000040028000
sbi_trap_error: hart0: s6=0x0000000040028020 s7=0x0000000000000000
sbi_trap_error: hart0: s8=0x000000000000001c s9=0x0000000040034ab0
sbi_trap_error: hart0: s10=0x0000000000000000 s11=0x0000000000000000
sbi_trap_error: hart0: t0=0x0000000000000000 t1=0x0000000000000000
sbi_trap_error: hart0: t2=0x0000000000000000 t3=0x0000000000002000
sbi_trap_error: hart0: t4=0x0000000000000000 t5=0x0000000000000000
sbi_trap_error: hart0: t6=0x0000000000000000

and then it hangs. Now OpenSBI tells me this:

Domain0 Region00 : 0x0000000002000000-0x000000000200ffff (I)
Domain0 Region01 : 0x0000000040000000-0x000000004007ffff ()
Domain0 Region02 : 0x0000000000000000-0xffffffffffffffff (R,W,X)

so maybe the region wasn't big enough? But a size of 0x80000 doesn't do it either, and the error is a different one.

Booting the SDK kernel (e.g. image.fit from [1]) with the instructions given in the repository README (i.e. via tftpboot) works fine.
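
For readers following along, here is a minimal Python sketch (not part of the original report) that decodes the mcause value from the trap dump and looks up which of the listed OpenSBI regions contains mtval. The exception names come from the RISC-V privileged specification; the helper names are made up for illustration, and the lookup assumes that the first listed region containing the address is the one that applies, in the way lower-numbered PMP entries take priority.

```python
# Synchronous exception codes 0-7 from the RISC-V privileged specification.
EXCEPTION_CODES = {
    0: "Instruction address misaligned",
    1: "Instruction access fault",
    2: "Illegal instruction",
    3: "Breakpoint",
    4: "Load address misaligned",
    5: "Load access fault",
    6: "Store/AMO address misaligned",
    7: "Store/AMO access fault",
}

# Domain0 regions exactly as printed by OpenSBI in the report:
# (name, first address, last address inclusive, permission flags).
REGIONS = [
    ("Region00", 0x0000000002000000, 0x000000000200FFFF, "I"),
    ("Region01", 0x0000000040000000, 0x000000004007FFFF, ""),
    ("Region02", 0x0000000000000000, 0xFFFFFFFFFFFFFFFF, "R,W,X"),
]


def decode_mcause(mcause: int) -> str:
    """Return a readable name for a 64-bit mcause value."""
    code = mcause & ((1 << 63) - 1)
    if mcause >> 63:  # top bit set means interrupt, not exception
        return f"interrupt {code}"
    return EXCEPTION_CODES.get(code, f"exception {code}")


def first_matching_region(addr: int):
    """Assumption: the first listed region containing addr wins."""
    for name, start, end, flags in REGIONS:
        if start <= addr <= end:
            return name, flags
    return None


def reservation_covers(base: int, size: int, addr: int) -> bool:
    """Would 'fdt rsvmem add <base> <size>' include addr?"""
    return base <= addr < base + size


if __name__ == "__main__":
    mtval = 0x0000000040047060
    print(decode_mcause(0x5))                               # Load access fault
    print(first_matching_region(mtval))                     # ('Region01', '')
    print(reservation_covers(0x40000000, 0x1000, mtval))    # False
    print(reservation_covers(0x40000000, 0x80000, mtval))   # True
```

Under these assumptions the faulting address lies in Region01, which has no permissions for the OS, and outside the 0x1000 bytes reserved by the Wiki command. That is consistent with the guess that the reservation was too small, although the report notes that 0x80000 did not help either.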
Out of curiosity, I tried to switch roles: put the SDK kernel on the SD card, and tried tftpboot with the openSUSE kernel. The SDK kernel on the SD card produces pretty much the same sbi_trap_error, whereas the openSUSE kernel via tftp can at least start itself and some services. It's then missing things, because it expects a disk around, but that's expected:

StarFive # setenv bootfile vmlinuz; setenv fileaddr a0000000; setenv fdtcontroladdr 0xffffffffffffffff
StarFive # setenv kernel_comp_addr_r 0xb0000000; setenv kernel_comp_size 0x10000000;
StarFive # tftpboot ${fdt_addr_r} jh7110-starfive-visionfive-2-vb.dtb
[...]
StarFive # tftpboot ${kernel_addr_r} Image-6.2.0-rc7-12-default
[...]
StarFive # tftpboot ${ramdisk_addr_r} initrd-6.2.0-rc7-12-default
[...]
StarFive # run chipa_set_linux
StarFive # booti ${kernel_addr_r} ${ramdisk_addr_r}:${filesize} ${fdt_addr_r}
## Flattened Device Tree blob at 46000000
Booting using the fdt blob at 0x46000000
Using Device Tree in place at 0000000046000000, end 0000000046008140

Starting kernel ...

[ 0.000000][ T0] Linux version 6.2.0-rc7-12-default (geeko@buildhost) (gcc (SUSE Linux) 12.2.1 20230124 [revision 193f7e62815b4089dfaed4c2bd34fd4f10209e27], GNU ld (GNU Binutils; openS)
[ 0.000000][ T0] OF: fdt: Ignoring memory range 0x40000000 - 0x40200000
[ 0.000000][ T0] Machine model: StarFive VisionFive 2 VB
[...]
[ 0.000000][ T0] CPU with hartid=0 is not available
[ 0.000000][ T0] CPU with hartid=0 is not available
[ 0.000000][ T0] CPU with hartid=0 is not available
[...]
[ 19.230279][ T711] dracut: FATAL: No root device found
[ 19.235549][ T711] dracut: Refusing to continue

But that's expected. So I don't think there is anything wrong with the kernel. It seems to be an issue either with the firmware or with the bootloader.

For what it's worth, I see that devel:RISCV:Factory:Contrib:StarFive:VisionFive2 has some kind of u-boot, but I can't find a counterpart for visionfive2_fw_payload.img.

Likely I'm just doing something wrong though and maybe the documentation could be improved a bit.

[1] https://github.com/starfive-tech/VisionFive2/releases/tag/VF2_v2.8.0
[2] https://en.opensuse.org/HCL:VisionFive2

--
You are receiving this mail because:
You are on the CC list for the bug.

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173

Aaron Puchert <aaronpuchert@alice-dsl.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chester.lin@suse.com,
                   |                            |schwab@suse.de

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c2

--- Comment #2 from Aaron Puchert <aaronpuchert@alice-dsl.net> ---
(In reply to Chester Lin from comment #1)
Could you add 'earlycon=sbi' and then remove the 'loglevel' limit from the kernel command line in grub2? It could help us to check what happened since the earlycon is disabled by default. Thanks.
Produces the exact same output, down to the addresses. Also adding ignore_loglevel to the command line and removing splash=silent, just to be safe, so I have

linux /boot/Image-6.2.0-rc7-12-default root=UUID=[...] earlycon=sbi ignore_loglevel systemd.show_status=1 console=ttyS0,115200n

changes nothing.

Adding "insmod progress" in GRUB gives me:

Loading Linux 6.2.0-rc7-12-default ...
Loading initial ramdisk ...
[ Image-6.2.0-rc7-12-d 28.02MiB 100% 11.00MiB/s ]
EFI stub: Booting Linux Kernel...
[ initrd-6.2.0-rc7-12- 94.82MiB 100% 12.67MiB/s ]
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
[...]

So kernel and initrd are completely loaded.

I don't think it matters, but a couple of seconds after the sbi_trap_error I get

i2c read: write daddr 36 to
i2c read: write daddr 36 to
i2c read: write daddr 36 to
i2c read: write daddr 36 to
i2c read: write daddr 36 to
i2c read: write daddr 36 to
i2c read: write daddr 36 to
i2c read: write daddr 36 to
i2c read: write daddr 36 to
i2c read: write daddr 36 to
cannot read pmic power register

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c3

--- Comment #3 from Aaron Puchert <aaronpuchert@alice-dsl.net> ---
(In reply to Aaron Puchert from comment #0)
[ 19.230279][ T711] dracut: FATAL: No root device found
[ 19.235549][ T711] dracut: Refusing to continue
But that's expected. So I don't think there is anything wrong with the kernel. It seems to be an issue either with the firmware or with the bootloader.
FWIW, with additionally

setenv bootargs 'root=UUID=4303d3ef-5a6d-42f9-ab6b-85bcc7ffff87 console=ttyS0,115200 debug rootwait earlycon=sbi'

I can fully boot the system with our 6.2.0-rc7-12-default via tftpboot. It doesn't find some devices, but I'll report that separately.

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c5

Daniel Ekman <knegge@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |knegge@gmail.com

--- Comment #5 from Daniel Ekman <knegge@gmail.com> ---
(In reply to Chester Lin from comment #4)
load mmc 1:3 ${fdt_addr_r} /boot/dtb/jh7110-starfive-visionfive-2-vb.dtb
load mmc 1:1 ${kernel_addr_r} /EFI/BOOT/bootriscv64.efi
fdt addr ${fdt_addr_r}
fdt rsvmem add 0x40000000 0x1000
bootefi ${kernel_addr_r} ${fdt_addr_r}

I can confirm this works on a VisionFive 2 V1.2A with the stock U-Boot and firmware image from SDK release 2.8.0.

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c7

--- Comment #7 from Daniel Ekman <knegge@gmail.com> ---
(In reply to Chester Lin from comment #6)
# load mmc 1:3 ${fdt_addr_r} /boot/dtb/<va-or-vb-dtb-file>
# fdt addr ${fdt_addr_r}; fdt rsvmem add 0x40000000 0x00001000
# boot

Confirmed, also works.

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c8

--- Comment #8 from Aaron Puchert <aaronpuchert@alice-dsl.net> ---
(In reply to Chester Lin from comment #4)
It seems that preload FDT [fdtcontroladdr] owned by VF2 firmware blobs does not match the reviewing DTs in upstream, which is adopted by the contrib project.
Would this be solved by using a different firmware image?
Could you try the following instructions?
---
StarFive # load mmc 1:3 ${fdt_addr_r} /boot/dtb/jh7110-starfive-visionfive-2-vb.dtb
20801 bytes read in 8 ms (2.5 MiB/s)
StarFive # load mmc 1:1 ${kernel_addr_r} /EFI/BOOT/bootriscv64.efi
2846720 bytes read in 123 ms (22.1 MiB/s)
StarFive # fdt addr ${fdt_addr_r}
StarFive # fdt rsvmem add 0x40000000 0x1000
StarFive # bootefi ${kernel_addr_r} ${fdt_addr_r}
---
Since the current image has not supported FW booting via SDIO, I will fix the manual installation in Wiki, thanks for your feedback!
This works, thanks!

(In reply to Chester Lin from comment #6)
Thanks! Since the DISTRO_DEFAULTS feature has been enabled by VF2 u-boot, the simplified u-boot instructions are:
# load mmc 1:3 ${fdt_addr_r} /boot/dtb/<va-or-vb-dtb-file>
# fdt addr ${fdt_addr_r}; fdt rsvmem add 0x40000000 0x00001000
# boot

Also works.

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c10

--- Comment #10 from Aaron Puchert <aaronpuchert@alice-dsl.net> ---
(In reply to Andreas Schwab from comment #9)
Please retry with the latest image. It now has an updated U-Boot included,
Used "zypper dup" instead of deploying a new image, but I suppose this should work as well? (Though not sure if you mean a firmware image, I'm still using the firmware from the SDK, now version 2.10.4.) Anyway, I can drop "fdt rsvmem add 0x40000000 0x00001000" now (presumably due to something like https://github.com/starfive-tech/u-boot/pull/41?), but still need to manually load the dtb at ${fdt_addr_r}.
so if you boot from SDIO you'll get something halfway booting.
Yeah, that describes it well for the new kernel (6.3.0-rc1-14-default). We're fine in the initrd, but after "Switch Root" I get a bunch of errors, then workqueue lockups in the kernel, then a recovery shell. But it's an rc1.

However, the 6.2.1-12-default that I have still around boots totally fine.

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c11

--- Comment #11 from Andreas Schwab <schwab@linux-m68k.org> ---
You need the latest u-boot and the kernel was missing the mutex fix in sifive_errata_patch_func. Please try again with kernel-default-6.3~rc1-18.1.

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c12

--- Comment #12 from Aaron Puchert <aaronpuchert@alice-dsl.net> ---
(In reply to Andreas Schwab from comment #11)
You need the latest u-boot
So I can see that there is a package u-boot-starfivevisionfive2 with

/boot/u-boot-spl.bin
/boot/u-boot.itb

Presumably I should flash the former (smaller) file in place of the SDK's u-boot-spl.bin.normal.out at 0x0 and the second in place of visionfive2_fw_payload.img at 0x100000? What is strange though is that the second file is a lot smaller than the SDK payload:

-rw-r--r-- 1 root root 124221 Mar 8 11:45 /boot/u-boot-spl.bin
-rw-r--r-- 1 root root 952985 Mar 8 11:45 /boot/u-boot.itb

versus (from https://github.com/starfive-tech/VisionFive2/releases/tag/VF2_v2.10.4)

-rw-r--r-- 1 aaron users 130688 Feb 28 07:37 u-boot-spl.bin.normal.out
-rw-r--r-- 1 aaron users 2800453 Feb 28 07:37 visionfive2_fw_payload.img

Though maybe there is just unneeded stuff in the SDK version, the file types seem to match ("file" says "Device Tree Blob version 17").
and the kernel was missing the mutex fix in sifive_errata_patch_func. Please try again with kernel-default-6.3~rc1-18.1.
That looks better, and also gives me network devices for the first time.

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c13

--- Comment #13 from Aaron Puchert <aaronpuchert@alice-dsl.net> ---
(In reply to Aaron Puchert from comment #12)
So I can see that there is a package u-boot-starfivevisionfive2 with
/boot/u-boot-spl.bin
/boot/u-boot.itb
Presumably I should flash the former (smaller) file in place of the SDK's u-boot-spl.bin.normal.out at 0x0 and the second in place of visionfive2_fw_payload.img at 0x100000?
Probably not, /usr/share/doc/packages/u-boot-starfivevisionfive2-doc/starfive/visionfive2.rst says
u-boot-spl.bin cannot be used directly on StarFive VisionFive2,we need to convert the u-boot-spl.bin to u-boot-spl.bin.normal.out with the below command:
./spl_tool -c -f $(Uboot_PATH)/spl/u-boot-spl.bin
Do we also ship this script somewhere? Or might we directly ship u-boot-spl.bin.normal.out? Later it suggests
sf probe
fatload mmc 1:3 $kernel_addr_r u-boot.itb
sf update $kernel_addr_r 0x100000 $filesize

fatload mmc 1:3 $kernel_addr_r u-boot-spl.bin.normal.out
sf update $kernel_addr_r 0x0 $filesize

So u-boot.itb can apparently replace visionfive2_fw_payload.img despite being a lot smaller?
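
As a rough sanity check on that flash layout, here is a small Python sketch (not from the thread) that uses the two "sf update" offsets quoted above and the file sizes listed in comment #12. It only checks that either SPL image would end before the payload offset and shows where each payload would end. The total size of the SPI flash is not stated in the thread, so it is not checked here, and the function names are illustrative only.

```python
# SPI flash offsets taken from the "sf update" commands in the quoted docs.
SPL_OFFSET = 0x0
PAYLOAD_OFFSET = 0x100000

# File sizes in bytes, copied from the "ls -l" listings in comment #12.
SIZES = {
    "u-boot-spl.bin": 124221,
    "u-boot.itb": 952985,
    "u-boot-spl.bin.normal.out": 130688,
    "visionfive2_fw_payload.img": 2800453,
}


def spl_fits(size: int) -> bool:
    """True if an SPL image of this size ends before the payload offset."""
    return SPL_OFFSET + size <= PAYLOAD_OFFSET


def payload_end(size: int) -> int:
    """First flash offset past a payload of this size."""
    return PAYLOAD_OFFSET + size


if __name__ == "__main__":
    for name in ("u-boot-spl.bin", "u-boot-spl.bin.normal.out"):
        print(f"{name}: fits below 0x{PAYLOAD_OFFSET:x}: {spl_fits(SIZES[name])}")
    for name in ("u-boot.itb", "visionfive2_fw_payload.img"):
        print(f"{name}: ends at 0x{payload_end(SIZES[name]):x}")
```

A smaller payload simply occupies less of the flash after 0x100000, so the size difference alone does not rule out u-boot.itb as a replacement.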

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c15

Aaron Puchert <aaronpuchert@alice-dsl.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #15 from Aaron Puchert <aaronpuchert@alice-dsl.net> ---
(In reply to Andreas Schwab from comment #14)
u-boot-spl.bin.normal.out is part of the visionfive2-firmware package.
That seems to do it, whereas u-boot-starfivevisionfive2 doesn't seem to be necessary. Here is what I did after installing visionfive2-firmware:

sf probe
load mmc 1:3 0xa0000000 /usr/share/visionfive2-firmware/u-boot-spl.bin.normal.out
sf update 0xa0000000 0x0 $filesize
load mmc 1:3 0xa0000000 /usr/share/visionfive2-firmware/u-boot.itb
sf update 0xa0000000 0x100000 $filesize

The firmware prints no banner, but then boots without additional changes. Thanks!

http://bugzilla.opensuse.org/show_bug.cgi?id=1208173#c16

--- Comment #16 from Aaron Puchert <aaronpuchert@alice-dsl.net> ---
Maybe this should be mentioned in https://en.opensuse.org/HCL:VisionFive2 though.