[Bug 1207562] New: U-Boot: Raspberry Pi 4 booting from USB stopped booting
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562

Bug ID: 1207562
Summary: U-Boot: Raspberry Pi 4 booting from USB stopped booting
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: aarch64
OS: openSUSE Tumbleweed
Status: NEW
Severity: Major
Priority: P5 - None
Component: Bootloader
Assignee: screening-team-bugs@suse.de
Reporter: boris@pruessmann.org
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

I have a set of Raspberry Pi 4 systems booting from USB sticks that all got stuck after the latest automatic MicroOS update. While U-Boot was reporting several timeouts, the thing that stuck out to me was

** Reading file would overwrite reserved memory **

Coincidentally, I had been looking into the exact same issue yesterday, trying to get a CM4 to boot from NVMe. Swapping in the u-boot I compiled yesterday for that setup, I was able to "resurrect" all those systems.

Root cause:

Apparently, u-boot in its current configuration - maybe related to some hardware or RPi firmware specifics that I am not aware of - is already reporting 8 reserved regions. While many configurations for other systems have CONFIG_LMB_MAX_REGIONS set to e.g. 64, the configurations for Raspberry Pis don't. The error message is wrong: loading the EFI binary would not overwrite any reserved memory region. It's just that there is no space left to reserve a new one.

Fix:

For me, increasing CONFIG_LMB_MAX_REGIONS to 64 solved the problem. However, there might be other things at play here since everything worked correctly before the latest update.

--
You are receiving this mail because:
You are on the CC list for the bug.
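The failure mode described under "Root cause" can be illustrated with a small model. This is a minimal Python sketch, not U-Boot code: RegionTable, reserve and load_file are hypothetical names, and the capacity of 8 is an assumption taken from the "8 reserved regions" in the report. It only shows how a caller that reports every failed reservation with one message ends up blaming an overlap when the table is simply full.

```python
# Illustrative model only; this is NOT U-Boot source code.
MAX_REGIONS = 8  # assumed capacity, taken from the "8 reserved regions" in the report


class RegionTable:
    """Fixed-capacity list of (base, size) reservations (hypothetical model)."""

    def __init__(self, capacity=MAX_REGIONS):
        self.capacity = capacity
        self.regions = []

    def overlaps(self, base, size):
        # True if [base, base + size) intersects any existing reservation.
        return any(base < b + s and b < base + size for b, s in self.regions)

    def reserve(self, base, size):
        # Returns "ok", "overlap", or "table full".
        if self.overlaps(base, size):
            return "overlap"
        if len(self.regions) >= self.capacity:
            return "table full"
        self.regions.append((base, size))
        return "ok"


def load_file(table, addr, size):
    # Collapsing every failure into one message hides the real cause.
    if table.reserve(addr, size) != "ok":
        return "** Reading file would overwrite reserved memory **"
    return "loaded"


# Fill all 8 slots with small, non-overlapping reservations.
full = RegionTable()
for i in range(MAX_REGIONS):
    full.reserve(i * 0x1000, 0x100)

# 0x80000 does not overlap anything, yet the load is refused.
message = load_file(full, 0x80000, 0x1000)
```

With the capacity raised to 64 in this model, the same request succeeds, which is consistent with the effect the reporter saw after increasing CONFIG_LMB_MAX_REGIONS.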
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c4

--- Comment #4 from Pruessmann <boris@pruessmann.org> ---
Would love to verify this, but I get a 404 "Resource is no longer available!" when trying to download u-boot-rpiarm64-2023.01-244.1.aarch64.rpm
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c5

Guillaume GARDET <guillaume.gardet@arm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|IN_PROGRESS                 |RESOLVED
         Resolution|---                         |FIXED
           Assignee|screening-team-bugs@suse.de |guillaume.gardet@arm.com
              Flags|needinfo?(boris@pruessmann. |
                   |org)                        |

--- Comment #5 from Guillaume GARDET <guillaume.gardet@arm.com> ---
A fix is on the way to the Factory project:
https://build.opensuse.org/request/show/1061487

In the meantime, you can use the RPM package from the hardware:boot:staging repo:
https://download.opensuse.org/repositories/hardware:/boot:/staging/openSUSE_...
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c6

--- Comment #6 from Pruessmann <boris@pruessmann.org> ---
Can also confirm this works.
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c8

--- Comment #8 from Guillaume GARDET <guillaume.gardet@arm.com> ---
FTR, upstream has a different fix:
https://patchwork.ozlabs.org/project/uboot/patch/20230125230823.1567778-1-tr...
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c9

--- Comment #9 from Guillaume GARDET <guillaume.gardet@arm.com> ---
New SR using upstream solution:
https://build.opensuse.org/request/show/1061570

This will fix other platforms as well, not only RPi4.
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c16

--- Comment #16 from Pruessmann <boris@pruessmann.org> ---
> It seemed that somehow the load address [should be 0x00080000 = kernel_addr_r]
> had been *reserved* so that the 'boot_efi_binary' script failed to load the grub2 from the USB disk.

Given that "** Reading file would overwrite reserved memory **" is printed for all kinds of different reasons (see initial comment), wouldn't it make sense to capture the output of "bdinfo" to get a dump of the reserved regions and their count?
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c17

--- Comment #17 from Stefan Seyfried <seife@novell.slipkontur.de> ---
I can try different things, but I'd need instructions on what to do, where, and when. My U-Boot knowledge is from embedded systems that do not use UEFI stuff at all ;-) and from a time when there were no such things as "device trees" but instead separate kernels were built for different hardware configurations :-)

So what I wanted to say... I don't even know what a "runtime FDT" is, much less how I would check if it declares reserved memory ;-)

Would the bdinfo need to come from the "broken" U-Boot, or can I use a "good", working version? The problem with the broken one is that it is already in the TFTP BOOTP stage once the HDMI output stabilizes, and once it is there, no keyboard input seems to do anything at all. So I'd need to set up a serial console etc., and while this would be possible, it's quite some effort here and I'd leave that experiment to someone else experiencing this issue who might already have everything set up.
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c21

--- Comment #21 from Pruessmann <boris@pruessmann.org> ---
Not sure if this helps, but I just realized that I booted with another RPi 4 for recovery when I encountered the problem initially. Tried again with the latest U-Boot and it does boot with no problems. However, this one was running a rather old RPi firmware:

    raspberrypi:~$ sudo rpi-eeprom-update
    *** UPDATE AVAILABLE ***
    BOOTLOADER: update available
       CURRENT: Thu Sep 3 12:11:43 UTC 2020 (1599135103)
        LATEST: Tue Apr 26 10:24:28 UTC 2022 (1650968668)

After updating, the boot process hangs as reported here in the ticket. Let me know if you need more information.
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c22

--- Comment #22 from Pruessmann <boris@pruessmann.org> ---
Okay, this problem is still related to the number of reserved regions available, as far as I can tell. While setting CONFIG_LMB_MAX_REGIONS to 64 does not fix this anymore - maybe related to the introduction of the upstream fix - setting CONFIG_LMB_RESERVED_REGIONS to 16 does get this to boot again.
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c23

--- Comment #23 from Pruessmann <boris@pruessmann.org> ---
FWIW, the version available at https://build.opensuse.org/package/show/home:docbobo/u-boot works for me (though it does include some additional NVMe-related patches I need for something else).
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c24

--- Comment #24 from Trompert <suse@ivotrompert.nl> ---
I can verify that the new u-boot.bin from @Pruessmann is working.
Blake Burkhart <bburky@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bburky@gmail.com
Chris Ellis <chris@intrbiz.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chris@intrbiz.com
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c25

--- Comment #25 from Chris Ellis <chris@intrbiz.com> ---
I've run into the same issue on my RPi CM4 cluster, with all my 8GB models now failing to boot.

Running bdinfo on the 4GB and 8GB variants shows no space in the 8GB LMBs:

8GB:

    lmb_dump_all:
     memory.cnt = 0x3
     memory[0]   [0x0-0x3dffffff], 0x3e000000 bytes flags: 0
     memory[1]   [0x40000000-0xfbffffff], 0xbc000000 bytes flags: 0
     memory[2]   [0x100000000-0x1ffffffff], 0x100000000 bytes flags: 0
     reserved.cnt = 0x6
     reserved[0] [0x0-0x7ffff], 0x00080000 bytes flags: 4
     reserved[1] [0x3cb20000-0x3dffffff], 0x014e0000 bytes flags: 0
     reserved[2] [0x3db26030-0x3dffffff], 0x004d9fd0 bytes flags: 0
     reserved[3] [0x3ef62aa0-0x3ef62bd0], 0x00000131 bytes flags: 4
     reserved[4] [0x40000000-0xfbffffff], 0xbc000000 bytes flags: 0
     reserved[5] [0x100000000-0x1ffffffff], 0x100000000 bytes flags: 0

4GB:

    lmb_dump_all:
     memory.cnt = 0x2
     memory[0]   [0x0-0x3dffffff], 0x3e000000 bytes flags: 0
     memory[1]   [0x40000000-0xfbffffff], 0xbc000000 bytes flags: 0
     reserved.cnt = 0x5
     reserved[0] [0x0-0x7ffff], 0x00080000 bytes flags: 4
     reserved[1] [0x3cb20000-0x3dffffff], 0x014e0000 bytes flags: 0
     reserved[2] [0x3db26050-0x3dffffff], 0x004d9fb0 bytes flags: 0
     reserved[3] [0x3ef62aa0-0x3ef62af5], 0x00000056 bytes flags: 4
     reserved[4] [0x40000000-0xfbffffff], 0xbc000000 bytes flags: 0

It seems that request https://build.opensuse.org/request/show/1061570 introduced these issues into Tumbleweed / MicroOS.

It looks like:
https://github.com/openSUSE/u-boot/commit/f4dbc6d532d05bdb3070c6492d19e26884...
backported some issues from upstream.

Looking at upstream, that commit has been reverted:
https://github.com/u-boot/u-boot/commit/948d3999bfcdf91ef7a10c3eee9c763ed132...

And there have been a number of commits which look like we need to backport:
https://github.com/u-boot/u-boot/commit/1975a3b1f66e27ec9004213bb3d8c5f2f67b...
https://github.com/u-boot/u-boot/commit/2dc16a2c1f924985216b3f1d6710f96d6c4f...
https://github.com/u-boot/u-boot/commit/d1f5dbe6645ad51e318dd322033fe6a08bce...

It would be good if we could revert https://build.opensuse.org/request/show/1061570 in the short term.
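The dumps in comment #25 can be checked mechanically rather than by eye. What follows is a minimal Python sketch that assumes the lmb_dump_all text format quoted in that comment; parse_lmb_dump and free_ranges are hypothetical helper names, not part of U-Boot or any existing tool. It subtracts the reserved ranges from each memory bank and lists what is left.

```python
import re

# Matches entries such as "memory[0] [0x0-0x3dffffff]" or
# "reserved[1] [0x3cb20000-0x3dffffff]"; the ".cnt" lines do not match.
ENTRY = re.compile(r"(memory|reserved)\[\d+\]\s+\[(0x[0-9a-f]+)-(0x[0-9a-f]+)\]")


def parse_lmb_dump(text):
    """Return {"memory": [(start, end), ...], "reserved": [...]}, ends inclusive."""
    regions = {"memory": [], "reserved": []}
    for kind, start, end in ENTRY.findall(text):
        regions[kind].append((int(start, 16), int(end, 16)))
    return regions


def free_ranges(regions):
    """Subtract every reserved range from every memory bank."""
    free = []
    for mem_start, mem_end in regions["memory"]:
        cursor = mem_start  # first address of this bank not yet accounted for
        for res_start, res_end in sorted(regions["reserved"]):
            if res_end < cursor or res_start > mem_end:
                continue  # reservation lies outside what is left of this bank
            if res_start > cursor:
                free.append((cursor, res_start - 1))
            cursor = max(cursor, res_end + 1)
        if cursor <= mem_end:
            free.append((cursor, mem_end))
    return free


# The 8GB dump as quoted in comment #25.
DUMP_8GB = """
memory[0] [0x0-0x3dffffff], 0x3e000000 bytes flags: 0
memory[1] [0x40000000-0xfbffffff], 0xbc000000 bytes flags: 0
memory[2] [0x100000000-0x1ffffffff], 0x100000000 bytes flags: 0
reserved[0] [0x0-0x7ffff], 0x00080000 bytes flags: 4
reserved[1] [0x3cb20000-0x3dffffff], 0x014e0000 bytes flags: 0
reserved[2] [0x3db26030-0x3dffffff], 0x004d9fd0 bytes flags: 0
reserved[3] [0x3ef62aa0-0x3ef62bd0], 0x00000131 bytes flags: 4
reserved[4] [0x40000000-0xfbffffff], 0xbc000000 bytes flags: 0
reserved[5] [0x100000000-0x1ffffffff], 0x100000000 bytes flags: 0
"""

regions = parse_lmb_dump(DUMP_8GB)
for start, end in free_ranges(regions):
    print(f"free: {start:#x}-{end:#x}")
```

In the quoted 8GB dump, reserved[4] and reserved[5] have exactly the same bounds as memory[1] and memory[2], so under this model nothing above 0x40000000 remains free. The same check can be run on the 4GB dump by pasting its lines into a second string.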
Aaron Rumpler <aaron@aaronrumpler.nz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aaron@aaronrumpler.nz
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c39

David Walker <David@WalkerStreet.info> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |David@WalkerStreet.info

--- Comment #39 from David Walker <David@WalkerStreet.info> ---
Build 251 works for me, too, on one of my three 8GB Rpi4b's. I'll try the other two tomorrow (Pacific time zone).
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c40

--- Comment #40 from David Walker <David@WalkerStreet.info> ---
I can confirm that build 251 works on all three of my 8GB Rpi4b's.
http://bugzilla.opensuse.org/show_bug.cgi?id=1207562#c46

Guillaume GARDET <guillaume.gardet@arm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |IN_PROGRESS
           Assignee|ivan.ivanov@suse.com        |guillaume.gardet@arm.com

--- Comment #46 from Guillaume GARDET <guillaume.gardet@arm.com> ---
Here is the new update to Factory:
https://build.opensuse.org/request/show/1066736
using the 2 new patches from upstream:

* lmb: Treat a region which is a subset as equal
  https://source.denx.de/u-boot/u-boot/-/commit/0d91c88230fe8bd9f8d39ca2ab69cd...
* Bump LMB_MAX_REGIONS default to 16
  https://source.denx.de/u-boot/u-boot/-/commit/2dc16a2c1f924985216b3f1d6710f9...

I hope those 2 patches will finally be enough.
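Going by their titles alone, the two patches in comment #46 reduce the number of slots consumed and raise the number of slots available. A minimal Python sketch of that idea follows, assuming regions are stored as (base, size) pairs. add_reservation is a hypothetical helper, region flags are ignored for brevity, and the upstream C change may differ in detail. The addresses are reserved[1] and reserved[2] from the 8GB dump in comment #25.

```python
def add_reservation(regions, base, size, capacity=16):
    """Reserve [base, base + size) in a fixed-capacity table.

    Hypothetical model: a request already contained in an existing
    region succeeds without consuming a new slot. The capacity of 16
    is an assumption taken from the second patch title.
    """
    for rbase, rsize in regions:
        if rbase <= base and base + size <= rbase + rsize:
            return True  # already covered, nothing to add
    if len(regions) >= capacity:
        return False  # no slot left for a genuinely new region
    regions.append((base, size))
    return True


# reserved[1] from the 8GB dump: 0x3cb20000-0x3dffffff
table = [(0x3CB20000, 0x014E0000)]

# reserved[2] from the same dump lies entirely inside reserved[1].
accepted = add_reservation(table, 0x3DB26030, 0x004D9FD0)
slots_used = len(table)
```

In this model the second request is accepted while the table still holds a single entry, so nested reservations no longer use up slots.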