[Bug 1013200] New: sddm-greeter dumped core
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200

Bug ID: 1013200
Summary: sddm-greeter dumped core
Classification: openSUSE
Product: openSUSE Distribution
Version: Leap 42.2
Hardware: x86-64
OS: Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: X.Org
Assignee: xorg-maintainer-bugs@forge.provo.novell.com
Reporter: patrick.schaaf@yalwa.com
QA Contact: xorg-maintainer-bugs@forge.provo.novell.com
Found By: ---
Blocker: ---

On a recent Leap 42.2 install, after applying the last round of updates and rebooting the system, I get a black screen instead of the SDDM greeter. System logs show:

2016-12-02T08:49:52.937290+01:00 linux systemd-coredump[1297]: Process 460 (plymouthd) of user 0 dumped core.
2016-12-02T08:50:09.705204+01:00 rofl systemd-coredump[2023]: Process 1974 (sddm-greeter) of user 481 dumped core.

Running "systemctl restart display-manager" on VT1 as root, once, brings up the greeter, and everything seems to be fine then. But the issue reoccurs when I reboot a second time.

I used the system with an uptime of >3 days before, and did not encounter this issue previously.

Among the updates applied, was a new xorg-x11-server package, and kernel 4.8.11 (from the "official" http://download.opensuse.org/repositories/Kernel:/stable/standard/ repository) - that kernel repository was in use before with a slightly older 4.8.x, so I don't think that's the culprit.

I'll attach the two /var/lib/systemd/coredump/ files created.

--
You are receiving this mail because:
You are on the CC list for the bug.
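The two systemd-coredump lines in the report follow a fixed pattern, so the crashing processes can be pulled out of a saved log mechanically. A minimal sketch, assuming the log text arrives on standard input; the function name crashed_processes is made up for illustration and is not part of systemd:

```sh
# Print "PID NAME UID" for every systemd-coredump "dumped core" line on stdin.
# crashed_processes is a hypothetical helper name used only in this sketch.
crashed_processes() {
    sed -n 's/.*systemd-coredump\[[0-9]*\]: Process \([0-9][0-9]*\) (\([^)]*\)) of user \([0-9][0-9]*\) dumped core\..*/\1 \2 \3/p'
}
```

It could be fed from something like "journalctl -b | crashed_processes"; lines that do not match the pattern are dropped silently.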
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c1
--- Comment #1 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c2
--- Comment #2 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c3
--- Comment #3 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c4
--- Comment #4 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c6
--- Comment #6 from Stefan Dirsch
> (from the "official" http://download.opensuse.org/repositories/Kernel:/stable/standard/ repository)

For sure this is not the "official" kernel. This is a bleeding edge "kernel-of-the-day". Therefore I suggest to go back to our latest Leap 42.2 kernel (version 4.4!) and test again.

For me this looks like a race between plymouth and X. Honestly if you can live without a splash screen for boot I recommend to disable plymouth by adding "plymouth.enable=0" to the kernel boot options. ;-)
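Whether the option actually reached the running kernel can be checked against /proc/cmdline after the reboot. A minimal sketch, assuming the option is spelled exactly as above; plymouth_disabled is a hypothetical helper name:

```sh
# Succeed if the kernel command line passed as $1 contains plymouth.enable=0
# as a separate, space-delimited argument.
# plymouth_disabled is a hypothetical helper name used only in this sketch.
plymouth_disabled() {
    case " $1 " in
        *" plymouth.enable=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: plymouth_disabled "$(cat /proc/cmdline)" && echo "splash disabled for this boot"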
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c7
--- Comment #7 from Patrick Schaaf
> Please provide backtraces of plymouth and sddm-greeter.
80 debuginfo installs later.....
Small side note, following all the gdb "zypper install this or that debuginfo",
for sddm, two fail:
zypper install Mesa-libGL1-debuginfo-11.2.2-158.1.x86_64
Mesa-libglapi0-debuginfo-11.2.2-158.1.x86_64
Package 'Mesa-libGL1-debuginfo-11.2.2-158.1.x86_64' not found.
Package 'Mesa-libglapi0-debuginfo-11.2.2-158.1.x86_64' not found.
Here are the backtraces I could generate, relative to the coredumps attached
previously:
==== plymouthd ================================================
Reading symbols from /usr/sbin/plymouthd...Reading symbols from
/usr/lib/debug/usr/sbin/plymouthd.debug...done.
done.
[New LWP 460]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `@usr/sbin/plymouthd --mode=boot
--pid-file=/run/plymouth/pid --attach-to-sessio'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 ply_event_loop_get_destination_from_fd_watch (loop=0x7974742f7665642f,
watch=0x21) at ply-event-loop.c:449
449 ply-event-loop.c: No such file or directory.
(gdb) bt
#0 ply_event_loop_get_destination_from_fd_watch (loop=0x7974742f7665642f,
watch=0x21) at ply-event-loop.c:449
#1 0x00007f2765b9e1db in ply_event_loop_stop_watching_fd (
loop=0x7974742f7665642f, watch=0x21) at ply-event-loop.c:750
#2 0x00007f276598a84c in ply_terminal_close (terminal=0xab7cc0)
at ply-terminal.c:686
#3 0x00007f2765ba1c33 in ply_hashtable_foreach (hashtable=0xab78e0,
func=0x7f276598a7c0
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c8
Patrick Schaaf
> Hmm. The only difference in xorg-x11-server between 42.2 release and update I can find is
> U_modesetting-set-driverPrivate-to-Null-after-closing-fd.patch Prevent crash when unplugging displaylink device. (bnc#1011570)
> I don't think this is the culprit and it only affects modesetting driver users, which you probably are not with your AMD card (check /var/log/Xorg.0.log or wherever your displaymanager writes the X log into).

I'll attach my current Xorg.0.log, not sure exactly what to look for there. There is a modesetting_drv loaded and notes about using "kms".

> > (from the "official" http://download.opensuse.org/repositories/Kernel:/stable/standard/ repository)
> For sure this is not the "official" kernel. This is a bleeding edge "kernel-of-the-day".

Yeah! But it's a nice one actually - never failed me for the last 6 months. Anyway....

> Therefore I suggest to go back to our latest Leap 42.2 kernel (version 4.4!) and test again.
> For me this looks like a race between plymouth and X. Honestly if you can live without a splash screen for boot I recommend to disable plymouth by adding "plymouth.enable=0" to the kernel boot options. ;-)

Did both - back to the normal 42.2 kernel (kernel-default-4.4.27-2.1.x86_64), the 4.8.x kernels removed. Tried with and without plymouth. NO CHANGE regarding the issue, sddm-greeter even coredumps when plymouth is both disabled and uninstalled (packages, initrd rebuild, though that I tested only with the 4.8.11 kernels)
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c9
--- Comment #9 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c10
--- Comment #10 from Fabian Vogt
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c11
--- Comment #11 from Stefan Dirsch
> Created attachment 704617 [details] Xorg log

modesetting is loaded (among other drivers), but the radeon driver is being used. So the xorg-x11-server update can't be the culprit here.
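Which modules stay in use can be read off the LoadModule and UnloadModule lines of the X log. A rough sketch, assuming (as is usual for Xorg logs, but not checked against the attached one) that a probed driver which is not used gets an UnloadModule line later; active_x_modules is a hypothetical helper name:

```sh
# Read an Xorg log on stdin and print the modules that were loaded and
# never unloaded, one per line, sorted.
# active_x_modules is a hypothetical helper name used only in this sketch.
active_x_modules() {
    awk -F'"' '
        /\) LoadModule: "/   { loaded[$2] = 1 }     # e.g. (II) LoadModule: "radeon"
        /\) UnloadModule: "/ { delete loaded[$2] }  # e.g. (II) UnloadModule: "modesetting"
        END { for (m in loaded) print m }
    ' | sort
}
```

For example: active_x_modules < /var/log/Xorg.0.log. The list also contains non-driver modules such as extensions, so it narrows things down rather than naming the driver outright.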
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c12
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c13
Egbert Eich
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c14
--- Comment #14 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c15
--- Comment #15 from Patrick Schaaf
> Ok. Apparently Xserver couldn't be started. Please try this with original update kernel 4.4 for Leap 42.2 in place and plymouth uninstalled/disabled.
> X -retro -verbose 7 -logverbose 7 :99 & sleep 3; DISPLAY=:99 xterm &
> and attach /var/log/Xorg.99.log

I did that, after "systemctl set-default multi-user" and reboot, so no plymouth or display manager autostart. It worked - X started, I got the xterm, and could terminate. Do you still want to see the Xorg.99.log?
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c16
--- Comment #16 from Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c17
Max Staudt
> It could be that the crashing plymouthd left the vt hogged. Maybe booting without plymouth (plymouth.enable=0 on the command line) helps and the sddm greeter screen is shown.
> AFAIR, Max knows all the ins and outs of plymouth hogging VTs.

This is clearly not the issue here. Patrick did the right thing: He uninstalled Plymouth *and* rebuilt the initrd. After this step, Plymouth can no longer hog the VT.
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c18
--- Comment #18 from Max Staudt
> I did that, after "systemctl set-default multi-user" and reboot, so no plymouth or display manager autostart.

Not sure if that keeps Plymouth from running. Also, plymouth.enable=0 is a parameter that is evaluated by Plymouth itself.

For what it's worth, can you please keep Plymouth uninstalled (and possibly locked with zypper addlock) until we have resolved this issue? That way we make sure we don't run into something Plymouth related as well.

Of course, don't forget to rebuild the initrd after removing Plymouth :)
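One way to confirm that the rebuilt initrd really is Plymouth-free is to scan its file listing. A minimal sketch, assuming the listing (for instance from dracut's lsinitrd) is piped in on standard input; initrd_mentions_plymouth is a hypothetical helper name:

```sh
# Succeed if the initrd file listing on stdin mentions plymouth anywhere.
# initrd_mentions_plymouth is a hypothetical helper name used only in this sketch.
initrd_mentions_plymouth() {
    grep -q plymouth
}
```

For example: lsinitrd | initrd_mentions_plymouth && echo "plymouth is still in the initrd"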
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c19
--- Comment #19 from Patrick Schaaf
> (In reply to Patrick Schaaf from comment #15)
> > I did that, after "systemctl set-default multi-user" and reboot, so no plymouth or display manager autostart.
> Not sure if that keeps Plymouth from running. Also, plymouth.enable=0 is a parameter that is evaluated by Plymouth itself.
> For what it's worth, can you please keep Plymouth uninstalled (and possibly locked with zypper addlock) until we have resolved this issue? That way we make sure we don't run into something Plymouth related as well.
> Of course, don't forget to rebuild the initrd after removing Plymouth :)

plymouth was both disabled and uninstalled from the beginning of my comment #14.

I'm reinstalling the system from zero now, back in 45 minutes...
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c20
--- Comment #20 from Egbert Eich
(In reply to Max Staudt from comment #17)
> This is clearly not the issue here. Patrick did the right thing: He
> uninstalled Plymouth *and* rebuilt the initrd. After this step, Plymouth can
> no longer hog the VT.

Ok, but the sddm greeter is calling abort() because it cannot connect to the Xserver. This is either because the Xserver isn't ready for connections yet, or sddm is using the wrong credentials - or something is grabbing it (unlikely).

When the Xserver is running but just showing a black screen one could try to obtain the credentials from the Xserver command line and then try to log in:

From a root shell (remote or after a VT switch) do:
1. # ps aux | grep Xorg
   get the argument after '-auth'
2. # export XAUTHORITY=
3. # export DISPLAY=:0.0
4. # xterm

Switching the DM would be something to try out as well...
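The '-auth' lookup in step 1 can be done without copying the path by hand. A minimal sketch, assuming the auth file path contains no spaces; xauth_file_from_cmdline is a hypothetical helper name:

```sh
# Print the argument that follows "-auth" in the X server command line
# passed as $1; print nothing if there is no -auth option.
# xauth_file_from_cmdline is a hypothetical helper name used only in this sketch.
xauth_file_from_cmdline() {
    printf '%s\n' "$1" | sed -n 's/.* -auth \([^ ]*\).*/\1/p'
}
```

For example: export XAUTHORITY="$(xauth_file_from_cmdline "$(ps -o args= -C X)")", assuming the server runs under the process name X. Quoting the result keeps the path intact even though it contains braces.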
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c21
--- Comment #21 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c22
--- Comment #22 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c23
--- Comment #23 from Patrick Schaaf
> When the Xserver is running but just showing a black screen one could try to obtain the credentials from the Xserver command line and then try to log in:
> From a root shell (remote or after a VT switch) do:
> 1. # ps aux | grep Xorg
>    get the argument after '-auth'
> 2. # export XAUTHORITY=
> 3. # export DISPLAY=:0.0
> 4. # xterm

That doesn't work... See transcript below. I also tried it as user, with /run/sddm/{...} chmodded 644, still no joy.

rofl:~ # systemctl status display-manager
display-manager.service - X Display Manager
   Loaded: loaded (/usr/lib/systemd/system/display-manager.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2016-12-02 16:10:35 CET; 2min 1s ago
  Process: 1300 ExecStart=/usr/lib/X11/display-manager start (code=exited, status=0/SUCCESS)
 Main PID: 1390 (sddm)
    Tasks: 4 (limit: 512)
   CGroup: /system.slice/display-manager.service
           ├─1390 /usr/bin/sddm
           └─1402 /usr/bin/X -nolisten tcp -auth /run/sddm/{b8afb90f-46c8-4438-9a97-1bf82d128e7b} -background none -noreset -displayfd 18 vt7

Dec 02 16:10:36 linux sddm[1390]: Socket server started.
Dec 02 16:10:36 linux sddm[1390]: Greeter starting...
Dec 02 16:10:36 linux sddm[1390]: Adding cookie to "/run/sddm/{b8afb90f-46c8-4438-9a97-1bf82d128e7b}"
Dec 02 16:10:36 linux sddm-helper[1671]: [PAM] Starting...
Dec 02 16:10:36 linux sddm-helper[1671]: [PAM] Authenticating...
Dec 02 16:10:36 linux sddm-helper[1671]: [PAM] returning.
Dec 02 16:10:36 linux sddm-helper[1671]: pam_unix(sddm-greeter:session): session opened for user sddm by (uid=0)
Dec 02 16:10:53 rofl sddm[1390]: Greeter session started successfully
Dec 02 16:10:53 rofl sddm[1390]: Auth: sddm-helper exited with 6
Dec 02 16:10:53 rofl sddm[1390]: Greeter stopped.

rofl:~ # coredumpctl list
TIME PID UID GID SIG PRESENT EXE
[... stuff from previous boots ...]
Fri 2016-12-02 16:10:54 CET 2045 481 479 6 * /usr/bin/sddm-greeter

rofl:~ # export XAUTHORITY=/run/sddm/{b8afb90f-46c8-4438-9a97-1bf82d128e7b\}
rofl:~ # export DISPLAY=:0.0
rofl:~ # xterm
No protocol specified
Warning: This program is an suid-root program or is being run by the root user.
The full text of the error or warning message cannot be safely formatted
in this environment. You may get a more descriptive message by running the
program as a non-root user or by removing the suid bit on the executable.
xterm: Xt error: Can't open display: %s
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c24
Patrick Schaaf
> Switching the DM would be something to try out as well...

Switched to kdm - NO ISSUE.

(accidentally first installed gdm, that drew in over 200 packages, of which "zypper remove -u gdm" only could remove 97 again. just noting...)

I'll give up for today. But if there's other things to try, I can do that during the weekend.
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c25
--- Comment #25 from Egbert Eich
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c26
--- Comment #26 from Patrick Schaaf
> Did you check the existence of this file: /run/sddm/{b8afb90f-46c8-4438-9a97-1bf82d128e7b}?

Yes. It does exist.

> If it does exist,
> # export XAUTHORITY=/run/sddm/{b8afb90f-46c8-4438-9a97-1bf82d128e7b}
> # xauth list
> should give some output.

It does. Something like:

rofl:~ # xauth info
Authority file: /run/sddm/{e38ebf58-e9d4-453d-8562-d357d7ab5238}
File new: no
File locked: no
Number of entries: 1
Changes honored: yes
Changes made: no
Current input: (argv):1

All fine, I think.
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c27
--- Comment #27 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c28
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1013200#c29
--- Comment #29 from Patrick Schaaf