Latest TW 20220212 fails to boot after dup - maybe an nvidia issue
Just a heads up to anyone using nvidia considering dup'ing to 20220212 with kernel 5.16.8-1-default.
After upgrading I cannot boot, not even to runlevel 3. It may be because I'm using the nvidia proprietary driver. I've not seen it fail to boot to runlevel 3 before, so that's a surprise. The host is not reachable via ssh and its NFS exports are not available.
I'm seeing the following errors in the journal (after booting back to 5.16.5-1-default).
  BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 233s!
  ..
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  ..
  PGD 100001067 P4D 100001067 PUD 1001b5067 PMD 111932067 PTE 0
  ..
  CPU: 1 PID: 459 Comm: kworker/1:2 Tainted: P OE 5.16.8-1-default #1 openSUSE Tumbleweed 257f8f36371552cd38032922fd021edb6811ecdc
  Journal Entry 2022-02-14 17:37:17.393907
  ..
  Workqueue: events drm_fb_helper_damage_work
  RIP: 0010:memcpy_toio+0x23/0x50
Hardware concerned:
  CPU:
    Info: 6-Core model: AMD FX-6300 bits: 64 type: MCP cache: L2: 2 MiB
    Speed: 1404 MHz min/max: 1400/3500 MHz Core speeds (MHz): 1: 1404 2: 1399 3: 3450 4: 1869 5: 1404 6: 1404
  Graphics:
    Device-1: NVIDIA TU116 [GeForce GTX 1650 SUPER] driver: nvidia v: 470.103.01
    Display: x11 server: X.Org 1.21.1.3 driver: loaded: nvidia resolution: 1: 3360x2100~60Hz 2: 3840x2160~60Hz
    OpenGL: renderer: NVIDIA GeForce GTX 1650 SUPER/PCIe/SSE2 v: 4.6.0 NVIDIA 470.103.01
There's no time for me to look further at this today.
Cheers, Michael
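For anyone comparing their own journal against the trace above, here is a minimal Python sketch that pulls the kernel version, the workqueue function and the faulting symbol out of pasted kernel log text. The function name `summarize_oops` and the "goes through the DRM fbdev helper" check are illustrative assumptions, not a diagnosis; the input is assumed to be text saved from something like `journalctl -k -b -1` (kernel messages from the previous boot).

```python
import re


def summarize_oops(journal_text: str) -> dict:
    """Extract a few identifying fields from a kernel oops excerpt.

    Any field that cannot be found is returned as None.
    """
    kernel = re.search(r"Tainted: .*? (\d+\.\d+\.\d+-\S+)", journal_text)
    workqueue = re.search(r"Workqueue: \S+ (\S+)", journal_text)
    rip = re.search(r"RIP: \d+:(\w+)\+0x", journal_text)

    wq_func = workqueue.group(1) if workqueue else None
    return {
        "kernel": kernel.group(1) if kernel else None,
        "workqueue_func": wq_func,
        "rip_symbol": rip.group(1) if rip else None,
        # Heuristic only: the work item belongs to the DRM fbdev helper.
        "via_drm_fb_helper": bool(wq_func and wq_func.startswith("drm_fb_helper")),
    }


if __name__ == "__main__":
    excerpt = (
        "CPU: 1 PID: 459 Comm: kworker/1:2 Tainted: P OE 5.16.8-1-default #1\n"
        "Workqueue: events drm_fb_helper_damage_work\n"
        "RIP: 0010:memcpy_toio+0x23/0x50\n"
    )
    print(summarize_oops(excerpt))
```

If the workqueue function and RIP symbol match the ones quoted above, the machine is probably hitting the same failure, though only the bug report can confirm that.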
On Monday, 14 February 2022, 06:14:01 CET, Michael Hamilton wrote:
Just a heads up to anyone using nvidia considering dup'ing to 20220212 with kernel 5.16.8-1-default.
After upgrading I cannot boot, not even to runlevel 3. It may be because I'm using the nvidia proprietary driver. I've not seen it fail to boot to runlevel 3 before, so that's a surprise. The host is not reachable via ssh and its NFS exports are not available.
Same here; I had to roll back to 20220210 because with kernel 5.16.8 the graphical login keeps restarting and eventually the machine freezes up.
Please file a bug report on Bugzilla, since you have more logs than I do.
Cheers MH
--
Mathias Homann
Mathias.Homann@openSUSE.org
Jabber (XMPP): lemmy@tuxonline.tech
Matrix: @mathias:eregion.de
IRC: [Lemmy] on freenode and ircnet (bouncer active)
keybase: https://keybase.io/lemmy
gpg key fingerprint: 8029 2240 F4DD 7776 E7D2 C042 6B8E 029E 13F2 C102
On Monday 14 February 2022, Mathias Homann wrote:
please file a bugreport on bugzilla since you have more logs than me.
Cheers MH
I don't normally file bugs on new kernel releases when the nvidia driver may be to blame. Nvidia-related issues generally just go away when the driver is updated. If it's happening on non-nvidia systems, I can certainly raise a bug.
I don't think the openSUSE folk have any say over when the official Nvidia driver will make it to the repo. As I understand it, that's in Nvidia's court. I generally just back off and wait until I see a driver released.
I just thought I'd post a warning because not being able to boot into the new kernel is a rare occurrence.
Michael
Michael Hamilton composed on 2022-02-14 21:46 (UTC+1300):
I don't normally file bugs on new kernel releases when the nvidia driver may be to blame. Nvidia related issues generally just go away when the driver is updated. If it's happening with non-nvidia based systems, I can certainly raise a bug.
This time might be different. I think 5.16.8 may be the first TW release with simpledrm[1] enabled by default. The driver installer might not be able to handle that kind of a change without major maintenance.
[1] https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/thread/7...
--
Evolution as taught in public schools is, like religion, based on faith, not based on science.
Team OS/2 ** Reg. Linux User #211409 ** a11y rocks!
Felix Miata
On 2022-02-14 09:46, Michael Hamilton wrote:
I don't normally file bugs on new kernel releases when the nvidia driver may be to blame. Nvidia related issues generally just go away when the driver is updated. If it's happening with non-nvidia based systems, I can certainly raise a bug.
I don't think the openSUSE folk have any say over when the official Nvidia driver will make it to the repo. As I understand it, that's in Nvidia's ballpark. I generally just back off and wait until I see a driver released.
Actually, the packages are built and provided by openSUSE folks; it's just that nvidia publishes them.
I just thought I'd post a warning because not being able to boot into the new kernel is a rare occurrence.
I can actually boot into the new kernel just fine, but without graphical login; the system is reachable via ssh just fine. But since it's my desktop computer, that ain't fun.
https://bugzilla.opensuse.org/show_bug.cgi?id=1195890
--
Mathias Homann
Mathias.Homann@openSUSE.org
xmpp: lemmy@tuxonline.tech
matrix: @mathias:eregion.de
irc: [Lemmy] on liberachat and ircnet
obs/pmbs: lemmy04
gpg key fingerprint: 8029 2240 F4DD 7776 E7D2 C042 6B8E 029E 13F2 C102
Thanks for raising that. After reading Felix's reply I added comments to the bugs with the details from my email.
Michael
I awoke this morning to find that the cause of my issue has been determined. A workaround has also been noted:
https://bugzilla.opensuse.org/show_bug.cgi?id=1195885#c30
The workaround is to edit the boot line options and add
  initcall_blacklist=simpledrm_platform_driver_init
Disabling simpledrm allows me to boot with nvidia 470 (installed the-easy-way) with 5.16.8-1-default. So far the desktop is working fine. The only difference I noticed was the brief flash of a green line in the middle of the screen just prior to the boot splash.
Michael
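To make the workaround survive reboots rather than retyping it at the boot menu, the parameter has to end up in the bootloader configuration. The following is a minimal Python sketch of that edit, assuming a GRUB2 setup where the options live in a double-quoted `GRUB_CMDLINE_LINUX_DEFAULT` line of `/etc/default/grub` (the usual Tumbleweed layout, but not something stated in this thread). The helper name `add_kernel_param` is hypothetical, and it only transforms text; it does not write the file.

```python
import re

PARAM = "initcall_blacklist=simpledrm_platform_driver_init"
KEY = "GRUB_CMDLINE_LINUX_DEFAULT"


def add_kernel_param(grub_text: str, param: str = PARAM) -> str:
    """Return grub_text with param appended to GRUB_CMDLINE_LINUX_DEFAULT.

    The line is left unchanged if the parameter is already present.
    Only double-quoted values are handled in this sketch.
    """
    pattern = re.compile(rf'^({KEY}=")([^"]*)(")\s*$')
    out = []
    for line in grub_text.splitlines():
        m = pattern.match(line)
        if m and param not in m.group(2).split():
            options = (m.group(2) + " " + param).strip()
            line = f"{m.group(1)}{options}{m.group(3)}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = 'GRUB_TIMEOUT=8\nGRUB_CMDLINE_LINUX_DEFAULT="splash=silent quiet"\n'
    print(add_kernel_param(sample), end="")
```

After changing the real file (as root, and with a backup), the GRUB configuration normally has to be regenerated, on openSUSE typically with `grub2-mkconfig -o /boot/grub2/grub.cfg`. Running `cat /proc/cmdline` after the next boot shows whether the parameter took effect.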
Sorry to reply to myself again... the initcall_blacklist fixes sddm/X11, but now Ctrl-Alt-F1 cannot access a text console. I just see the green (actually multi-coloured) line I mentioned earlier (others see a black screen, see the recent bug comments). So progress, but the workaround is not a full solution.
Hello,
In the Message;
Subject : Re: Latest TW 20220212 fails to boot after dup - maybe an nvidia issue
Message-ID : <202202142146.22856.michael@actrix.gen.nz>
Date & Time: Mon, 14 Feb 2022 21:46:22 +1300
[MH] == Michael Hamilton
On Monday, 14 February 2022 3:44:01 PM ACDT Michael Hamilton wrote:
Just a heads up to anyone using nvidia considering dup'ing to 20220212 with kernel 5.16.8-1-default.
No issue with 5.16.8-1-default and NVIDIA driver version 510.47.03 here (installed the "hard" way via the NVIDIA installer and dkms).
Cheers, Rodney.
--
Rodney Baker
rodney.baker@iinet.net.au
On Mon, 2022-02-14 at 22:09 +1030, Rodney Baker wrote:
No issue with 5.16.8-1-default and NVidia driver version 510.47.03 here (installed the "hard" way via the NVidia installer and dkms).
Cheers, Rodney.
Same here. Using a .run for 510.47.03 I'd downloaded prior, I was able to use my 'curated over time' method to get it installed for 5.16.8-1-default.
I generally need to verify/create /usr/src/<new kernel>/scripts/sign-file before I execute the .run file. From there I accept _most_ defaults, except that I sign the module w/ an existing private key/public cert, then decline the 32-bit compat libraries.
This update went better than most. Last time I had an issue w/ bbswitch deciding to turn off the nvidia card and prime-select refusing to do anything useful aside from reporting that bbswitch was loaded, though it didn't seem to matter what state the NVIDIA modules were in. I seem to have been able to disable bbswitch and get nvidia_drm to load at boot via systemd-modules-load.service ~ though I'm sure next time I boot I'll have to do it manually 😋️
--
~ Scott Bradnick
|- Windows Subsystem for Linux (WSL) Developer
|-- Tumbleweed:
|--- Dell Precision 5540 [NVIDIA Quadro T1000] (x86_64)
|--- O-DROID H2+ [UHD Graphics 600] (x86_64)
|--- 2x Raspberry Pi 4 Model B Rev 1.2 (aarch64)
|--- WinBook TW100 (x86_64)
https://keys.openpgp.org/ :: DBC5AA9A2D2BAEBC
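The "verify sign-file first" step mentioned above could be sketched roughly as follows. This is only an illustration in Python: the function names are hypothetical, and the kernel source directory (the `<new kernel>` part of the path) is deliberately left for the caller to supply, since the message does not spell it out.

```python
import os
from pathlib import Path


def sign_file_path(kernel_src: Path) -> Path:
    """Location of the sign-file helper inside a kernel source tree."""
    return kernel_src / "scripts" / "sign-file"


def has_sign_file(kernel_src: Path) -> bool:
    """True if scripts/sign-file exists in kernel_src and is executable."""
    candidate = sign_file_path(kernel_src)
    return candidate.is_file() and os.access(candidate, os.X_OK)


if __name__ == "__main__":
    import sys

    # Usage: python check_sign_file.py /usr/src/<new kernel>
    src = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    state = "found" if has_sign_file(src) else "missing"
    print(f"{sign_file_path(src)}: {state}")
```

A "missing" result would be the cue to create the helper before starting the .run installer, as described in the message.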
participants (7): Felix Miata, Masaru Nomiya, Mathias Homann, Mathias Homann, Michael Hamilton, Rodney Baker, Scott Bradnick