Bug ID 1209736
Summary MicroOS only boots in recoverymode after adding device to btrfs filesystem and rebuilding initramfs
Classification openSUSE
Product openSUSE Tumbleweed
Version Current
Hardware Other
OS Other
Status NEW
Severity Normal
Priority P5 - None
Component MicroOS
Assignee kubic-bugs@opensuse.org
Reporter vortex@z-ray.de
QA Contact qa-bugs@suse.de
Found By ---
Blocker ---

Created attachment 865838 [details]
journal after 26th March 2023

Hello there,
this is kind of a difficult issue to narrow down but easy to reproduce.

Steps to reproduce:
- run "sudo btrfs device add /dev/sdxn /home"
- wait for an update to run dracut and rebuild the initramfs, or rebuild it
manually with "dracut -f" in a transactional-update shell
- Reboot

Observed behaviour:
- GRUB2 complains about: error: ../../grub-core/commands/loadenv.c:113: invalid
environment block
- Boot gets stuck at a black screen
- No logs during the failing boot attempts
- Recovery mode boots normally without the GRUB message and produces logs
- The snapshot from before the initramfs rebuild still boots fine to this day,
even without recovery mode, but it now also shows the GRUB error.
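
To rule out a damaged environment block as the source of the GRUB message, the
file itself can be inspected from a snapshot that still boots. The following is
only a minimal sketch, assuming the block lives at /boot/grub2/grubenv (the
usual openSUSE location, not confirmed for this system) and follows the default
layout written by grub2-editenv: a first line reading "# GRUB Environment Block"
and a total size of 1024 bytes. check_grubenv is a hypothetical helper name.

```shell
# Hypothetical helper: sanity-check a GRUB environment block file.
# Assumes the default layout (signature line + 1024 bytes in total).
check_grubenv() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    # The first line must be the GRUB signature.
    first=$(head -n 1 "$file")
    if [ "$first" != "# GRUB Environment Block" ]; then
        echo "bad signature"
        return 1
    fi
    # Arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -ne 1024 ]; then
        echo "unexpected size: $size"
        return 1
    fi
    echo "ok"
}

# Usage (path is an assumption):
#   check_grubenv /boot/grub2/grubenv
```

If the file passes this check, the message more likely comes from GRUB failing
to read the file at boot than from the file's content, but that is a guess and
not something this report confirms.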

Here is the situation:

After adding a device to the btrfs filesystem as described above and then
rebuilding the initramfs, the system is unable to boot unless you select
recovery mode from GRUB.

What makes it really hard to narrow down this issue is that there are no logs
generated during the normal boot:

sudo journalctl --list-boots 
IDX BOOT ID                          FIRST ENTRY                  LAST ENTRY    
-17 3bbd492bb2054e658544344ba559f2fc Wed 2023-03-22 12:50:09 CET  Wed 2023-03-22 13:07:49 CET
-16 68a82d45e1f946959b8a7511cff13fd8 Wed 2023-03-22 13:08:23 CET  Wed 2023-03-22 13:12:15 CET
-15 bea204d500f642628b9914a41af22b55 Wed 2023-03-22 13:12:47 CET  Wed 2023-03-22 17:22:41 CET
-14 7af1b8a760c045beb852a9ffa62313d2 Wed 2023-03-22 17:23:14 CET  Wed 2023-03-22 17:39:27 CET
-13 25064359d7dc496f936a55d6259c94ed Wed 2023-03-22 17:39:50 CET  Thu 2023-03-23 01:00:54 CET
-12 f3b74bb09e4f42cd99a6e24101698005 Thu 2023-03-23 01:01:24 CET  Thu 2023-03-23 01:04:04 CET
-11 b8bb60cedaa347f3ba8b8aa961ce99d9 Thu 2023-03-23 10:49:14 CET  Thu 2023-03-23 13:40:00 CET
-10 b52ee0b220f04920aac861d77ef155e1 Thu 2023-03-23 13:40:26 CET  Thu 2023-03-23 15:50:44 CET
 -9 8bf5f33fe1104124a625a2f05a48b0b7 Thu 2023-03-23 16:52:27 CET  Fri 2023-03-24 00:54:55 CET
 -8 cce1655614dc4a53bc711144b1eb94d8 Fri 2023-03-24 10:38:33 CET  Fri 2023-03-24 17:27:27 CET
 -7 07a857748896466a84f5b9663ad7ee27 Fri 2023-03-24 20:56:35 CET  Fri 2023-03-24 21:10:30 CET
 -6 1701c711b38849afa2727d72648027f3 Fri 2023-03-24 21:43:41 CET  Sat 2023-03-25 00:32:37 CET
 -5 3a9341fe48014659b8828fe7e2b72815 Sat 2023-03-25 12:27:56 CET  Sat 2023-03-25 20:49:47 CET
 -4 077ba7b889804822a7e553c8130cf178 Sat 2023-03-25 20:56:58 CET  Sat 2023-03-25 21:02:25 CET
 -3 a057d1da74bb432fa8d00cf13fd471fa Sat 2023-03-25 21:09:28 CET  Sat 2023-03-25 23:42:16 CET
 -2 5312d2840a754a00b0bd854ebe99321c Sat 2023-03-25 23:45:28 CET  Sat 2023-03-25 23:56:49 CET
 -1 7100209c9fcd4e27b52e601c54c84854 Sun 2023-03-26 11:57:58 CEST Sun 2023-03-26 12:39:35 CEST
  0 f9a79721d7bd433b8af5028500b32422 Sun 2023-03-26 12:43:50 CEST Sun 2023-03-26 12:45:59 CEST

I attempted normal boots at 11:50, 12:40 and 12:42 on March 26 to get logs,
but as you can see in the output above, only my boots in recovery mode at
11:57 and 12:43 got logged.
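
To make the gap easier to see, the boot list can be filtered down to a single
day. This is only a rough sketch: boots_on_day is a hypothetical helper, and it
assumes the column layout shown above (index, boot ID, then weekday, date, time
and time zone for the first and last entry), which may differ between systemd
versions.

```shell
# Hypothetical helper: read "journalctl --list-boots" output on stdin and
# print index, first-entry time and last-entry time for boots that started
# on the given day (YYYY-MM-DD).
boots_on_day() {
    awk -v day="$1" '$4 == day { print $1, $5, $9 }'
}

# Usage:
#   sudo journalctl --list-boots | boots_on_day 2023-03-26
```

Any boot attempt whose time does not appear in the filtered list left no
journal behind.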

Looking at the logs from the last transactional update, which partially broke
the boot process, I couldn't spot anything wrong, but I'll add them in case
they still hold valuable information.
I'll also add my entire logs since 26th March 2023 (today) in case the
recovery mode logs reveal anything useful.

Additional changes I made to the default MicroOS image, which might or might
not be related:
- Installed proprietary nvidia drivers
- Added the following to my modprobe configuration to get Wayland on nvidia:
  - options nvidia_drm modeset=1
  - options nvidia NVreg_PreserveVideoMemoryAllocations=1
- Added sane-backends and simple-scan
- Set the following SELinux booleans:
  - sudo setsebool -P selinuxuser_execmod 1
  - sudo setsebool -P selinuxuser_execstack 1
Note: All those changes were made before adding additional devices to the
btrfs FS and had worked fine since 9th November 2022.
I made them again after a re-installation of the OS on 22nd March 2023, also
before adding the devices to the FS.

My current btrfs filesystem layout:

btrfs filesystem show
Label: none  uuid: 93f66fac-795c-4522-bb9c-7e3b99c54e3d
    Total devices 4 FS bytes used 3.00TiB
    devid    1 size 931.01GiB used 709.01GiB path /dev/nvme0n1p2
    devid    2 size 931.51GiB used 710.00GiB path /dev/sda1
    devid    3 size 931.51GiB used 709.00GiB path /dev/sdb1
    devid    5 size 3.64TiB used 950.06GiB path /dev/sdc1

Device 4 was removed while I was trying to narrow down the issue by restoring
the single-device btrfs filesystem structure, but I stopped for now to
properly write this bug report.

Additionally, I'd like to mention that I added the additional devices after
the initial installation of MicroOS.
What also puzzles me is that "btrfs device usage /home" and "btrfs device
usage /" both show the same devices, even though I only added the additional
drives to /home:

btrfs device usage /home
/dev/nvme0n1p2, ID: 1
   Device size:           931.01GiB
   Device slack:            3.50KiB
   Data,single:           707.01GiB
   Metadata,DUP:            2.00GiB
   Unallocated:           222.00GiB

/dev/sda1, ID: 2
   Device size:           931.51GiB
   Device slack:            3.50KiB
   Data,single:           708.00GiB
   Metadata,DUP:            2.00GiB
   Unallocated:           221.51GiB

/dev/sdb1, ID: 3
   Device size:           931.51GiB
   Device slack:              0.00B
   Data,single:           705.00GiB
   Metadata,DUP:            4.00GiB
   Unallocated:           222.51GiB

/dev/sdc1, ID: 5
   Device size:             3.64TiB
   Device slack:            3.50KiB
   Data,single:           946.00GiB
   Metadata,DUP:            4.00GiB
   System,DUP:             64.00MiB
   Unallocated:             2.71TiB

####################################

btrfs device usage /
/dev/nvme0n1p2, ID: 1
   Device size:           931.01GiB
   Device slack:            3.50KiB
   Data,single:           707.01GiB
   Metadata,DUP:            2.00GiB
   Unallocated:           222.00GiB

/dev/sda1, ID: 2
   Device size:           931.51GiB
   Device slack:            3.50KiB
   Data,single:           708.00GiB
   Metadata,DUP:            2.00GiB
   Unallocated:           221.51GiB

/dev/sdb1, ID: 3
   Device size:           931.51GiB
   Device slack:              0.00B
   Data,single:           705.00GiB
   Metadata,DUP:            4.00GiB
   Unallocated:           222.51GiB

/dev/sdc1, ID: 5
   Device size:             3.64TiB
   Device slack:            3.50KiB
   Data,single:           946.00GiB
   Metadata,DUP:            4.00GiB
   System,DUP:             64.00MiB
   Unallocated:             2.71TiB
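
A possible explanation (an assumption, not confirmed in this report) is that /
and /home are subvolumes of one and the same btrfs filesystem. If so, a device
added through either mount point joins the whole filesystem, which would match
the single UUID in the "btrfs filesystem show" output above. The sketch below
reduces the two listings to their device/ID pairs so they can be compared
directly; device_ids is a hypothetical helper name.

```shell
# Hypothetical helper: read "btrfs device usage" output on stdin and print
# one "device id" pair per line, e.g. "/dev/sda1 2".
device_ids() {
    awk '/^\/dev\// { sub(/,$/, "", $1); print $1, $3 }'
}

# Usage:
#   if [ "$(sudo btrfs device usage /     | device_ids)" = \
#        "$(sudo btrfs device usage /home | device_ids)" ]; then
#       echo "both mount points report the same device set"
#   fi
```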

Anyhow:
What does recovery mode actually do differently from a normal boot, such that
it advances to the desktop and shows me text output during boot, while a
normal boot fails no matter whether I target runlevel 5 or 3?
Why does the recovery mode produce logs but the normal boot doesn't?
Or did I just do something incredibly stupid?
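
One way to start answering the first question is to compare the kernel command
lines of the two GRUB entries. This is a minimal sketch under assumptions: the
two strings have to be copied by hand (for example from the GRUB editor, or
from /proc/cmdline after a recovery boot), and cmdline_only_in_first is a
hypothetical helper name. It does not know what any parameter means; it only
lists the differences.

```shell
# Hypothetical helper: print every space-separated token of the first
# command line that does not occur in the second one.
cmdline_only_in_first() {
    # $1 is left unquoted on purpose so the shell splits it into tokens.
    for tok in $1; do
        case " $2 " in
            *" $tok "*) ;;                 # token is present in both
            *) printf '%s\n' "$tok" ;;     # token is only in the first
        esac
    done
}

# Usage (both strings are placeholders to fill in by hand):
#   cmdline_only_in_first "$normal_cmdline" "$recovery_cmdline"
#   cmdline_only_in_first "$recovery_cmdline" "$normal_cmdline"
```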

Kind regards,
Imo

