[Bug 966255] New: System does not boot
http://bugzilla.opensuse.org/show_bug.cgi?id=966255

Bug ID: 966255
Summary: System does not boot
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: 2015*
Hardware: Other
OS: Other
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Bootloader
Assignee: jsrain@suse.com
Reporter: rjschwei@suse.com
QA Contact: jsrain@suse.com
Found By: ---
Blocker: ---

kernel 4.4.0.3
Has encrypted user partition.
Hangs at

Ignoring BGRT: Invalid version 0 (expected 1)

This appears to happen right before the point where I would expect to enter my passphrase.

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c1
Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c2
Robert Schweikert
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c3
--- Comment #3 from Robert Schweikert
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c4
Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c5
Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c6
--- Comment #6 from Robert Schweikert
OK, so I suppose that the BGRT message is just a red herring, and it appeared in the past, too? In any case, it'd be helpful to know whether this is caused by the kernel update or by something else...
I just added the message to provide an indication where in the boot process things are going wrong. I did not think it was an indicator for the problem. Sorry if that was not clear.
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c7
--- Comment #7 from Robert Schweikert
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c8
Robert Schweikert
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c9
--- Comment #9 from Takashi Iwai
Experiencing the same symptom with kernel-default-4.3.3-1.1.gda39cbd.x86_64.rpm
Please don't make me go back in time through more kernel versions.
If you can conclude with certainty that this is not a kernel regression, I'm happy to hear it. If not, further tests would be helpful; there are other kernel versions in my OBS home:tiwai:kernel:$VERSION repos :) (4.1 is found in Leap.)
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c10
--- Comment #10 from Robert Schweikert
From the log:
Feb 12 07:53:31 rush systemd[1]: Started Forward Password Requests to Plymouth.
Feb 12 07:53:31 rush audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-journal-flush comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

After this the log continues to accumulate messages and then eventually...

Feb 12 07:53:47 rush systemd[1]: Received SIGINT.

Which is basically when I hit Ctrl-Alt-Delete to restart the system, as I simply cannot enter the passphrase for the encrypted partition.

I would say it's a plymouth problem.
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
PatrickD Garvey
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c11
Fabian Vogt
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c12
--- Comment #12 from Ismail Donmez
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c13
--- Comment #13 from Fabian Vogt
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c14
--- Comment #14 from Ismail Donmez
I tried to debug it, but the issue is that plymouth works fine if started manually, even in the initrd. As a better workaround, I added plymouth to the list of modules to omit from the initrd, so that it uses the plymouth on / and not the one inside the initrd, and it works fine. I guess it's some kind of timing issue...
Do we really need plymouth in the initrd? It really makes debugging harder.
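The workaround quoted above (omitting plymouth from the initrd) could be sketched roughly as follows. This is only an illustration, not necessarily the exact steps the commenter used: it assumes the initrd is built with dracut, whose configuration accepts an `omit_dracutmodules` setting, and the drop-in file name `99-omit-plymouth.conf` and the helper function are hypothetical.

```shell
#!/bin/sh
# Sketch: write a dracut drop-in that omits the plymouth module from the initrd.
# Assumption: the system builds its initrd with dracut.
# The file name 99-omit-plymouth.conf is a hypothetical choice for illustration.
write_omit_plymouth() {
    conf_dir=$1
    mkdir -p "$conf_dir" || return 1
    # The spaces inside the quotes keep the module name separated from any
    # other entries appended to the same list by other drop-ins.
    printf 'omit_dracutmodules+=" plymouth "\n' > "$conf_dir/99-omit-plymouth.conf"
}

# On a real system (as root) this would be followed by an initrd rebuild:
#   write_omit_plymouth /etc/dracut.conf.d
#   dracut -f
```

Writing to a scratch directory first (for example `write_omit_plymouth /tmp/dracut-test`) makes it possible to inspect the generated drop-in before touching /etc/dracut.conf.d.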
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c15
--- Comment #15 from Fabian Vogt
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
Robert Kaiser
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c18
Daniel Zeleny
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
James Mason
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c24
Jacob W
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c25
--- Comment #25 from Jacob W
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c26
--- Comment #26 from Fabian Vogt
Created attachment 667207 [details] cryptsetup boot password prompt goes into background as a job
That's not the plymouth prompt, so a totally different issue. It didn't even try to use plymouth for password asking.
Password prompt is not shown and plymouth theme never shows up, instead it freezes the system where VT switching does not work.
If even the cursor stops blinking, the kernel either froze or panicked -> kernel bug. To confirm, run on a booted system as root on a TTY:
rcxdm stop
plymouthd --no-daemon --no-boot-log --tty=/dev/tty0 --debug --mode=boot
and on a different tty also as root
plymouth show-splash
and you should see the splash screen (on TTY7).
Forcing the system to boot without plymouth (press e on grub's kernel list) shows what looks like cryptsetup trying to prompt for the password, but instead instantly being backgrounded as a job. After more processes go through their boot procedure, the boot process stops and looks frozen. If I press a button on the keyboard, it forces the password prompt to re-appear. Now I can enter my password and everything works from there.
It's not backgrounded, it's the current TTY so systemd uses it, although it's also being used by something else... -> systemd bug(?)
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c27
--- Comment #27 from Jacob W
(In reply to Jacob W from comment #25)
Created attachment 667207 [details] cryptsetup boot password prompt goes into background as a job
That's not the plymouth prompt, so a totally different issue. It didn't even try to use plymouth for password asking.
This is when booting without kernel options "splash=silent" and "quiet". So of course plymouth is not started, because it's not supposed to be. The point of this screenshot is to show that even without plymouth, the password prompting is incorrect. If I do have the above kernel options enabled, I do not get a splash screen. All I get is a blank black screen. VT switching does not work. This blank black screen shows up where previously, prior to this bug, I would get the plymouth password prompt.
Password prompt is not shown and plymouth theme never shows up, instead it freezes the system where VT switching does not work.
If even the cursor stops blinking, the kernel either froze or panicked -> kernel bug.
Logs show no mention of a kernel panic. The kernel loads fine from what I can tell. I have not had a kernel panic since I started having this cryptsetup issue. I've checked both /var/log/messages and the systemd journal.
To confirm, run on a booted system as root on a TTY:
rcxdm stop
plymouthd --no-daemon --no-boot-log --tty=/dev/tty0 --debug --mode=boot
and on a different tty also as root
plymouth show-splash
and you should see the splash screen (on TTY7).
I do not see the splash screen. All I see is the regular output of booting: [ OK ] .... [ OK ] .... No errors, nothing unusual, no interruptions, etc. None of the above commands showed errors from what I could tell.
Forcing the system to boot without plymouth (press e on grub's kernel list) shows what looks like cryptsetup trying to prompt for the password, but instead instantly being backgrounded as a job. After more processes go through their boot procedure, the boot process stops and looks frozen. If I press a button on the keyboard, it forces the password prompt to re-appear. Now I can enter my password and everything works from there.
It's not backgrounded, it's the current TTY so systemd uses it, although it's also being used by something else... -> systemd bug(?)
Exactly! The password prompt should halt all further processes until the correct password is entered, but this is not the case. The password prompt is suppressed and other things just keep loading. I call this backgrounding, because that's exactly what it looks like: the job (it's literally called that) of the cryptsetup password prompt is being suppressed. Hitting a button later "reveals" the job (to me, that's like foregrounding it). Maybe that's bad use of terminology.

Yes, this looks like a systemd bug. I've been trying to say that from the beginning. Isn't this bug report a systemd issue? I'm confused. Like I mentioned before, I do not think plymouth is to blame, but systemd, because even without plymouth there is an issue with the password prompt (see previously attached screenshot). See also (like you already have) my comments on bug 942940.

To recap:
1) without kernel options "splash=silent" and "quiet", it's clear the cryptsetup password prompt is not being correctly initiated (see previously attached screenshot).
2) with kernel option "quiet" but NOT "splash=silent" (or with "splash=verbose"), the password prompt is correctly shown and halts everything until a correct password is provided. Once provided, everything continues to load. This is how it's supposed to work.
3) with both "splash=silent" and "quiet", the case where plymouth is supposed to show the password prompt, what I assume is the graphical system freezes and all I get is a blank black screen. I cannot switch VT.

From the above, I am guessing that this is a systemd issue. When using plymouth, it is getting two conflicting "signals": 1) the password prompt is needed and 2) don't show the password prompt. This is exactly what happens without kernel options "splash=silent" and "quiet" (see previously attached screenshot): 1) cryptsetup is asked to show the password prompt and 2) the password prompt is not shown / hidden (I call this the job being backgrounded). But because there is no plymouth being loaded, the system does not freeze and you wait until other things load, press a button on the keyboard, and the password prompt is then foregrounded.
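For anyone trying to reproduce this, the three cases in the recap above can be told apart from the kernel command line. The following is a minimal sketch under stated assumptions: the function name `classify_cmdline` is hypothetical, the case numbers refer only to the recap in this comment, and case 2 is read as requiring "quiet" (the recap's "or with splash=verbose" is ambiguous on that point).

```shell
#!/bin/sh
# Sketch: map a kernel command line onto the three cases from the recap.
# classify_cmdline is a hypothetical helper, not part of any existing tool.
classify_cmdline() {
    # Pad with spaces so each option can be matched as a whole word.
    cmdline=" $1 "
    quiet=no
    silent=no
    case "$cmdline" in *" quiet "*) quiet=yes ;; esac
    case "$cmdline" in *" splash=silent "*) silent=yes ;; esac

    if [ "$quiet" = yes ] && [ "$silent" = yes ]; then
        echo "case 3"      # plymouth prompt expected; reported as blank screen
    elif [ "$quiet" = yes ]; then
        echo "case 2"      # text prompt; reported as working
    elif [ "$silent" = no ]; then
        echo "case 1"      # no plymouth; prompt reported as hidden until a keypress
    else
        echo "unlisted"    # splash=silent without quiet is not covered by the recap
    fi
}
```

On a running system the current boot can be checked with `classify_cmdline "$(cat /proc/cmdline)"`.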
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c28
--- Comment #28 from Fabian Vogt
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
Antoine Belvire
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c29
shashank gaur
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
Ismail Donmez
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c30
John Chufar
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c31
--- Comment #31 from John Chufar
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c32
Jacob W
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c33
Fabian Vogt
I'm glad that people have found my workaround helpful.
Just like Shashank and John, this bug still affects me after all this time and I still use the workaround I gave 1.5 years ago myself.
It's not easy to debug as it only happens reproducibly on very specific setups. For instance, it no longer shows up on the system where I originally debugged the issue for a while. I've also removed plymouth from all systems I have access to in the meantime. I recommend everyone do the same; it's an extremely annoying and unreliable piece of software. There's currently work ongoing to drop plymouth from openSUSE and replace it with a different kernel-based bootsplash that is free from race conditions. Still, I adjusted the title and reassigned it to the plymouth maintainer.
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
http://bugzilla.opensuse.org/show_bug.cgi?id=966255#c34
--- Comment #34 from Jacob W
I adjusted the title and reassigned it to the plymouth maintainer.
Great, I'm sure Zhao Qiang will do a great job fixing this critical bug!
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
Michal Suchanek
http://bugzilla.opensuse.org/show_bug.cgi?id=966255
Eberhard Harbrink