[Bug 1090122] New: Leap 15 b206.1 - Fallback from Wayland to Xorg fail
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122

Bug ID: 1090122
Summary: Leap 15 b206.1 - Fallback from Wayland to Xorg fail
Classification: openSUSE
Product: openSUSE Distribution
Version: Leap 15.0
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: X.Org
Assignee: xorg-maintainer-bugs@forge.provo.novell.com
Reporter: tutux84@onenetbeyond.org
QA Contact: xorg-maintainer-bugs@forge.provo.novell.com
Found By: ---
Blocker: ---

Hi,

In /etc/gdm/custom.conf, if WaylandEnable is set to false, after rebooting and typing login and password into the GNOME login screen, the system hangs. I suspect a kernel panic, but I don't know what log I can provide to prove it, since it seems impossible to switch to a console TTY: Ctrl+Alt+F1 to F12 don't respond once the system is frozen.

The problem was also present in the previous build (197.1, I believe).

Reproducibility: always

For your info, I need to fall back to Xorg because Wayland doesn't detect my external screen (I haven't had time to look for troubleshooting tips yet, but I may open a bug report in a few days).

===========================

Some info about my system follows. In a nutshell: a 2017 Optimus laptop without any drivers installed apart from those provided by the installation process. I also use an encrypted /home.
uname -a:
Linux linux-5udt 4.12.14-lp150.8-default #1 SMP Sat Apr 7 05:12:52 UTC 2018 (8719fc4) x86_64 x86_64 x86_64 GNU/Linux

inxi -F:
System: Host: linux-5udt Kernel: 4.12.14-lp150.8-default x86_64 bits: 64 Desktop: Gnome 3.26.2 Distro: openSUSE Leap 15.0 Beta
Machine: Device: laptop System: GIGABYTE product: P64V7 serial: HH9006711A0002 Mobo: GIGABYTE model: P64V7 serial: N/A UEFI: American Megatrends v: FB09 date: 07/28/2017
Battery BAT1: charge: 78.4 Wh 83.2% condition: 94.2/94.2 Wh (100%)
CPU: Quad core Intel Core i7-7700HQ (-HT-MCP-) cache: 6144 KB clock speeds: max: 3800 MHz 1: 2800 MHz 2: 2800 MHz 3: 2800 MHz 4: 2800 MHz 5: 2800 MHz 6: 2800 MHz 7: 2800 MHz 8: 2800 MHz
Graphics: Card-1: Intel Device 591b Card-2: NVIDIA GP106M [GeForce GTX 1060 Mobile] Display Server: wayland (X.org 1.19.6 ) driver: i915 tty size: 80x24 Advanced Data: N/A for root
Audio: Card Intel CM238 HD Audio Controller driver: snd_hda_intel Sound: ALSA v: k4.12.14-lp150.8-default
Network: Card-1: Intel Wireless 8260 driver: iwlwifi IF: wlan1 state: down mac: 9a:0c:b4:38:1c:b1
         Card-2: Realtek RTL8153 Gigabit Ethernet Adapter driver: r8152 IF: eth0 state: N/A speed: N/A duplex: N/A mac: N/A
Drives: HDD Total Size: 525.1GB (1.6% used) ID-1: /dev/sda model: Crucial_CT525MX3 size: 525.1GB
Partition: ID-1: / size: 18G used: 6.0G (35%) fs: btrfs dev: /dev/sda4
           ID-2: /var size: 18G used: 6.0G (35%) fs: btrfs dev: /dev/sda4
           ID-3: /opt size: 18G used: 6.0G (35%) fs: btrfs dev: /dev/sda4
           ID-4: /tmp size: 18G used: 6.0G (35%) fs: btrfs dev: /dev/sda4
           ID-5: /home size: 3.0G used: 118M (4%) fs: xfs dev: /dev/dm-0
           ID-6: swap-1 size: 2.15GB used: 0.00GB (0%) fs: swap dev: /dev/sda6
Sensors: None detected - is lm-sensors installed and configured?
Info: Processes: 321 Uptime: 0:24 Memory: 1449.3/15918.5MB Init: systemd runlevel: 5 Client: Shell (bash) inxi: 2.3.40

--
You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c1
Max Staudt
In /etc/gdm/custom.conf, if WaylandEnable is set to false, after rebooting and typing login and password into the GNOME login screen, the system hangs. I suspect a kernel panic, but I don't know what log I can provide to prove it, since it seems impossible to switch to a console TTY: Ctrl+Alt+F1 to F12 don't respond once the system is frozen.
Anything to be found in the system journal? Maybe the system had time to write an error message to disk - please use journalctl to have a look. On the other hand, maybe the X server hung, but the system itself is still alive, and VT switching is dead because X and GDM (or maybe earlier: Plymouth and GDM) are fighting for VT_SETMODE and produce a deadlock in the VT subsystem. Been there before.
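As a minimal sketch of the journal check suggested above: the helper below lists error-level messages from the boot before the current one. The function name previous_boot_errors is made up for illustration, and the sketch assumes that journald keeps its logs across reboots (persistent storage); if it does not, the previous boot will not be available.

```shell
#!/bin/sh
# Hypothetical helper: show error-level journal entries of the previous boot.
# Assumption: journald stores logs persistently, so boot -1 still exists.
previous_boot_errors() {
    # --boot=-1      selects the boot before the current one
    # --priority=err limits output to priority "err" and more severe
    # Extra arguments (for example -k for kernel messages) are passed through.
    journalctl --boot=-1 --priority=err --no-pager "$@"
}
```

Calling previous_boot_errors -k after a forced reboot would restrict the output to kernel messages, which is where a panic or a GPU driver error would be expected to show up if it reached the disk.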
For your info, I need to fall back to Xorg because Wayland doesn't detect my external screen (I haven't had time to look for troubleshooting tips yet, but I may open a bug report in a few days).
Sigh. That's something for the desktop team, I guess.
CPU: Quad core Intel Core i7-7700HQ (-HT-MCP-) cache: 6144 KB
Kaby Lake - that's pretty darn new. I suspect all display outputs are connected to the Intel GPU. Once the Nvidia card is blocked, things should "just work".
Graphics: Card-1: Intel Device 591b Card-2: NVIDIA GP106M [GeForce GTX 1060 Mobile]
Whoops. By default, the nouveau kernel driver is in use, which is known to do funky things with some cards. Can you please blacklist the nouveau kernel module, rebuild the initramfs (by calling mkinitrd) and then reboot? Maybe that'll fix it... You can use lsinitrd to check that nouveau.ko is not contained in the resulting initramfs. Thanks!
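The blacklist-and-verify steps could be sketched roughly as follows. Both function names and the default file name 50-blacklist-nouveau.conf are hypothetical; only the "blacklist nouveau" line, mkinitrd and lsinitrd come from the comment above.

```shell
#!/bin/sh
# Write a modprobe blacklist entry for nouveau.
# $1: target file; the default name below is an assumption, any *.conf file
#     under /etc/modprobe.d should do.
blacklist_nouveau() {
    conf=${1:-/etc/modprobe.d/50-blacklist-nouveau.conf}
    printf 'blacklist nouveau\n' > "$conf"
}

# Succeed if the initramfs listing mentions nouveau.ko.
# Arguments (for example an image path) are handed to lsinitrd unchanged.
initrd_contains_nouveau() {
    lsinitrd "$@" | grep -q 'nouveau\.ko'
}
```

A possible sequence, run as root, would be blacklist_nouveau, then mkinitrd, then a reboot, and finally initrd_contains_nouveau to confirm that the module is gone from the image.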
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c2
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c3
--- Comment #3 from Vladimir FROMENT
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c4
--- Comment #4 from Vladimir FROMENT
Ok. Please attach /var/log/gdm/greeter.log and /home/<user>/.local/share/xorg/Xorg.1.log first. You may end up disabling one of your two GPUs in order to get rid of these issues though.
/var/log/gdm is empty on my system, and there is no ~/.local/share/xorg folder in my case. I will try to disable the nouveau module right now.
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c5
--- Comment #5 from Vladimir FROMENT
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c6
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c8
--- Comment #8 from Vladimir FROMENT
Another option would be to disable Intel graphics (in Firmware) - if possible and then run NVIDIA's proprietary driver. But I'm not sure, whether the hardware supports this (for all needed outputs).
Do you mean disabling the Intel GPU in the BIOS? It is not possible with this laptop. Eventually, by following [1] and [2], I could fix the fallback issue by setting the kernel parameter "i915.enable_guc=1". This option apparently enables advanced drivers for recent Intel chipsets. Following the advice in [2], I also added enable_rc6=1, enable_fbc=1, enable_psr=1, disable_power_well=0 and semaphores=1. That seems not to have introduced any regression in my use cases, under either Wayland or Xorg. So the bug report can be considered fixed from my point of view (although my external screen is still not detected, which is odd because Ubuntu 17.10 does it, but that's another story). Unless you need more info/logs from me? [1] https://wiki.archlinux.org/index.php/intel_graphics [2] https://gist.github.com/Brainiarc7/aa43570f512906e882ad6cdd835efe57
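To double-check that such a parameter really reached the running kernel, something like the following sketch could be used. The function name cmdline_has is hypothetical; the check simply looks for an exact word on the kernel command line, so the module prefix has to be included as in "i915.enable_guc=1".

```shell
#!/bin/sh
# Succeed if the kernel command line contains exactly the given word.
# $1: parameter to look for, e.g. i915.enable_guc=1
# $2: file to read; defaults to /proc/cmdline (override is for testing only)
cmdline_has() {
    # Put every space-separated word on its own line, then require a
    # whole-line, fixed-string match so that substrings do not count.
    tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example, cmdline_has i915.enable_guc=1 && echo "parameter is active" would print the message only if the bootloader actually passed the option.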
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c9
Stefan Dirsch
(In reply to Stefan Dirsch from comment #7)
Another option would be to disable Intel graphics (in Firmware) - if possible and then run NVIDIA's proprietary driver. But I'm not sure, whether the hardware supports this (for all needed outputs).
Do you mean disabling the Intel GPU in the BIOS? It is not possible with this laptop.
That's why I wrote *if possible*. ;-) Obviously, this is not an option on your system then.
Eventually, by following [1] and [2], I could fix the fallback issue by setting the kernel parameter "i915.enable_guc=1". This option apparently enables advanced drivers for recent Intel chipsets. Following the advice in [2], I also added enable_rc6=1, enable_fbc=1, enable_psr=1, disable_power_well=0 and semaphores=1. That seems not to have introduced any regression in my use cases, under either Wayland or Xorg.
So the bug report can be considered fixed from my point of view (although my external screen is still not detected, which is odd because Ubuntu 17.10 does it, but that's another story). Unless you need more info/logs from me?
[1] https://wiki.archlinux.org/index.php/intel_graphics [2] https://gist.github.com/Brainiarc7/aa43570f512906e882ad6cdd835efe57
Well, I would call this a workaround, not a fix. Seems option "i915.enable_guc=1" is enough to fix the issue for you, right?
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c10
--- Comment #10 from Vladimir FROMENT
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c11
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c12
Max Staudt
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c13
--- Comment #13 from Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c14
--- Comment #14 from Stefan Dirsch
Graphics: Card-1: Intel Device 591b
Takashi, sure this is CFL (Coffee Lake)?
#define INTEL_KBL_GT2_IDS(info) \
[...]
INTEL_VGA_DEVICE(0x591B, info), /* Halo GT2 */ \
Coffee Lake has different IDs (0x3E??) according to the current linux/drm/i915_pciids.h.
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c15
--- Comment #15 from Stefan Dirsch
What shows /sys/module/i915/parameters/enable_guc if you don't pass the value -1? After loading the driver, it'll be set to either 0, 1 or 2.
According to https://wiki.archlinux.org/index.php/intel_graphics#Enable_GuC_.2F_HuC_firmw... this came with Kernel 4.16.
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c16
--- Comment #16 from Stefan Dirsch
(In reply to Takashi Iwai from comment #13)
What shows /sys/module/i915/parameters/enable_guc if you don't pass the value -1? After loading the driver, it'll be set to either 0, 1 or 2.
According to
https://wiki.archlinux.org/index.php/intel_graphics#Enable_GuC_.2F_HuC_firmware_loading
this came with Kernel 4.16.
But maybe it's already in sle15/Leap 15 with our backports.
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c17
--- Comment #17 from Takashi Iwai
Graphics: Card-1: Intel Device 591b
Takashi, sure this is CFL (Coffee Lake)?
Sorry, I was confused. The chip in question is Kaby Lake (KBL). But the question still stands. Both KBL and CFL use the same firmware, and GuC loading should have been enabled without the extra option.
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c18
--- Comment #18 from Takashi Iwai
(In reply to Stefan Dirsch from comment #15)
(In reply to Takashi Iwai from comment #13)
What shows /sys/module/i915/parameters/enable_guc if you don't pass the value -1? After loading the driver, it'll be set to either 0, 1 or 2.
According to
https://wiki.archlinux.org/index.php/intel_graphics#Enable_GuC_.2F_HuC_firmware_loading
this came with Kernel 4.16.
But maybe it's already in sle15/Leap 15 with our backports.
Yes. The SLE15 / openSUSE Leap 15.0 kernel already got tons of backports, and its i915 driver is almost equivalent to the one in 4.16.
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c20
--- Comment #20 from Vladimir FROMENT
(In reply to Takashi Iwai from comment #13)
What shows /sys/module/i915/parameters/enable_guc if you don't pass the value -1? After loading the driver, it'll be set to either 0, 1 or 2.
For this please test without option
"i915.enable_guc=1"
(and all the other options). If needed, i.e. you're using an /etc/modprobe.d file snippet, recreate initrd afterwards via
mkinitrd
So after disabling all the above-mentioned options in YaST > Bootloader and rebooting, the value of /sys/module/i915/parameters/enable_guc is 0.
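For reference, reading the effective value after the driver has loaded can be sketched as below. The function name i915_param and its optional second argument are hypothetical; only the sysfs path /sys/module/i915/parameters/enable_guc is taken from the discussion.

```shell
#!/bin/sh
# Print the current value of an i915 module parameter.
# $1: parameter name, e.g. enable_guc
# $2: parameters directory; defaults to the sysfs location
#     (the override exists only to make the helper testable)
i915_param() {
    dir=${2:-/sys/module/i915/parameters}
    if [ -r "$dir/$1" ]; then
        cat "$dir/$1"
    else
        echo "parameter $1 not readable in $dir" >&2
        return 1
    fi
}
```

Running i915_param enable_guc once with and once without the boot option makes it easy to compare the driver's default against the overridden value.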
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c21
--- Comment #21 from Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c22
Stefan Dirsch
Thanks. I checked the recent code, and indeed the default value is zero. It was changed from -1 to 0 some time ago due to the latency issues and S4 resume problem, according to the git log.
Ok. Interesting.
If the enable_guc=1 option alone really helps, I believe it's worth reporting to the upstream devs.
Which is supposed to be done by us or the reporter?
It'd be great if you can double-check it.
That's what the reporter did before (comment #10). So should he really *double* check literally?
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c23
--- Comment #23 from Takashi Iwai
(In reply to Takashi Iwai from comment #21)
Thanks. I checked the recent code, and indeed the default value is zero. It was changed from -1 to 0 some time ago due to the latency issues and S4 resume problem, according to the git log.
Ok. Interesting.
If the enable_guc=1 option alone really helps, I believe it's worth reporting to the upstream devs.
Which is supposed to be done by us or the reporter?
Ideally someone who owns the hardware and can test, so the reporter would be the best option. Most likely the upstream devs will ask for testing of the latest development version or some patch, so we should be in Cc, of course.
It'd be great if you can double-check it.
That's what the reporter did before (comment #10). So should he really *double* check literally?
Yes, we need to test with the latest upstream version before reporting to upstream, at least. The 4.17-rc kernel is found in the OBS Kernel:HEAD repo, and 4.16.x is in the OBS Kernel:stable repo. I *guess* the problem remains, but if these versions work, there is another hope for a quicker fix.
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122
Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c24
--- Comment #24 from Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c25
--- Comment #25 from Vladimir FROMENT
Thanks. Vladimir, could you please test our KOTD? (currently 4.17-rc)
Installed KOTD with this command: rpm -i --force http://download.opensuse.org/repositories/Kernel:/HEAD/standard/x86_64/kerne... But the system fails to boot correctly. I get an error message at boot time saying "[FAILED] Failed to start Load Kernel Modules". The system doesn't get as far as loading gdm and ends up in maintenance mode. The journalctl logs will be attached right away. It seems related to the encrypted /home. I can reinstall the latest Leap Beta with an unencrypted /home, but that would not represent my normal setup. On the other hand, prior to installing KOTD, I upgraded my Leap Beta via "zypper dup" and the workaround doesn't work anymore, even with i915.enable_guc=1. Wayland loads, but the fallback to Xorg doesn't work (same symptoms as before). It was beta 206.1 before the upgrade. Let me know what you need from me to move forward. I should have some time this weekend to test multiple setups if needed.
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c26
--- Comment #26 from Vladimir FROMENT
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c27
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c28
--- Comment #28 from Max Staudt
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c30
--- Comment #30 from Vladimir FROMENT
Looks like a kernel or base system bug to me. Totally unrelated to your graphics problems.
Agreed. (In reply to Stefan Dirsch from comment #29)
Hope you installed the KOTD in addition to the existing one and can still boot the old one?
Yes, I had no problem booting the old kernel. Everything is working fine. But that kind of messed up the troubleshooting path, hehe.
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c31
--- Comment #31 from Vladimir FROMENT
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c32
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c33
--- Comment #33 from Vladimir FROMENT
Hmm. So I guess this is again *without* "WaylandEnable=false" in /etc/gdm/custom.conf, right? Can you confirm this?
I confirm.
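A quick way to verify which state /etc/gdm/custom.conf is in might look like this sketch. The function name wayland_disabled is hypothetical, and as a simplification it only looks for an uncommented WaylandEnable=false line without checking which section of the file the line belongs to.

```shell
#!/bin/sh
# Succeed if the file contains an uncommented "WaylandEnable=false" line.
# $1: configuration file; defaults to /etc/gdm/custom.conf
# Simplification: the enclosing section of the file is not checked.
wayland_disabled() {
    grep -Eq '^[[:space:]]*WaylandEnable[[:space:]]*=[[:space:]]*false[[:space:]]*$' \
        "${1:-/etc/gdm/custom.conf}"
}
```

With this, wayland_disabled && echo "GDM is forced to Xorg" reports whether the fallback configuration from the original report is currently active.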
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c34
Stefan Dirsch
I still have some instabilities in one use case on Xorg, which I will report in detail when I have more time in a few days (external screen detected on Xorg, but after reboot, GDM does not show up anymore). But regarding the initial report, I would say the issue is solved or worked around ;)
Please use a separate bug report for this then, but you could refer to this bug report there. Thanks!
http://bugzilla.opensuse.org/show_bug.cgi?id=1090122#c35
Stefan Dirsch