NVIDIA driver status
I have been wanting to update my Tumbleweed system. However, when I last tried to do so a week or so ago the NVIDIA kernel driver was out of whack.
I see that the current Tumbleweed kernel is 5.13.4, while the current NVIDIA kernel module is 5.13.2. So I guess things are not quite working again yet. Right?
I have been trying to stay with the RPM NVIDIA stuff instead of installing the .run file. Generally that makes life easier...
The main reason I want to update now is that I went to install VirtualBox. And it is available for 5.13.4, which makes sense.
-- Roger Oberholtzer
On Mon, Aug 9, 2021 at 12:26 PM Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
I have been wanting to update my Tumbleweed system. However, when I last tried to do so a week or so ago the NVIDIA kernel driver was out of whack.
I see that the current Tumbleweed kernel is 5.13.4, while the current NVIDIA kernel module is 5.13.2. So I guess things are not quite working again yet. Right?
No. Unless kABI was broken between 5.13.2 and 5.13.4 (that would rather defeat the very idea of a stable update), the module for 5.13.2 should work with 5.13.4.
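To see which kernel version a KMP package was built against and compare it with the running kernel, something like the following might help. It is only a sketch: the helper names `kmp_kernel_version` and `kernel_base_version` are made up for illustration, and the `_k<version>_` naming pattern is inferred from the package name quoted later in this thread, not from packaging documentation. As noted above, a differing version is not by itself a sign of a problem.

```shell
#!/bin/sh
# Hypothetical helper (not part of any SUSE tool): extract the kernel version
# embedded in a KMP package name after the "_k" marker.
kmp_kernel_version() {
    # e.g. nvidia-gfxG05-kmp-default-470.57.02_k5.13.2_1-43.1.x86_64 -> 5.13.2
    printf '%s\n' "$1" | sed -n 's/.*_k\([0-9][0-9.]*\)_.*/\1/p'
}

# Hypothetical helper: drop the release/flavour suffix from a `uname -r` string.
kernel_base_version() {
    # e.g. 5.13.6-1-default -> 5.13.6
    printf '%s\n' "${1%%-*}"
}

# Usage on the affected machine (assumes the gfxG05 KMP package is installed):
#   pkg=$(rpm -q nvidia-gfxG05-kmp-default)
#   echo "module built for $(kmp_kernel_version "$pkg"), running $(kernel_base_version "$(uname -r)")"
```

The functions only do string handling, so they can be tried on package names copied from another machine.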
On Monday 09 August 2021, Roger Oberholtzer wrote:
I have been wanting to update my Tumbleweed system. However, when I last tried to do so a week or so ago the NVIDIA kernel driver was out of whack.
I see that the current Tumbleweed kernel is 5.13.4, while the current NVIDIA kernel module is 5.13.2. So I guess things are not quite working again yet. Right?
I have been trying to stay with the RPM NVIDIA stuff instead of installing the .run file. Generally that makes life easier...
The main reason I want to update now is that I went to install VirtualBox. And it is available for 5.13.4, which makes sense.
I've been using the nvidia "easy way" driver. I normally have no problems with minor kernel updates, and everything is working at the moment:
nvidia-gfxG05-kmp-default-470.57.02_k5.13.2_1-43.1.x86_64
kernel-default-devel-5.13.6-1.2.x86_64
(I do see recent infrequent X11 hangs with chrome-beta with GPU acceleration enabled, but that also happened with previous kernels. I've emailed nvidia; I've disabled that chrome-beta option.)
Major kernel version bumps can sometimes cause an issue, but not always. I often rely on the multiversion.kernels setting in /etc/zypp/zypp.conf to dig me out of any surprises there (even going so far as keeping specific extra kernels around as a fallback).
Michael
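For anyone wanting to check what their own multiversion.kernels setting is, here is a small sketch. The function name `zypp_multiversion_kernels` is hypothetical, and the sample value mentioned afterwards is what I believe a stock install ships, so treat it as an assumption and check your own file.

```shell
#!/bin/sh
# Hypothetical helper: print the value of multiversion.kernels from a
# zypp.conf-style file (defaults to /etc/zypp/zypp.conf).
zypp_multiversion_kernels() {
    conf=${1:-/etc/zypp/zypp.conf}
    # Commented-out lines do not match; if the key is set more than once,
    # only the last assignment is printed.
    sed -n 's/^[[:space:]]*multiversion\.kernels[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1
}

# Usage:
#   zypp_multiversion_kernels
#   zypp_multiversion_kernels /path/to/some/other/zypp.conf
```

A value such as `latest,latest-1,running` would keep the two newest kernels plus the running one; a specific version can be appended to the list to pin a known-good kernel as a fallback.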
How odd. With this combination I cannot get X to start, but with 5.12.3 for kernel and NVIDIA all is okay. This is the last working combo:
modinfo nvidia
filename: /lib/modules/5.12.3-1-default/updates/nvidia.ko
uname -a
Linux acme 5.12.3-1-default #1 SMP Wed May 12 09:01:49 UTC 2021 (25d4ec7) x86_64 x86_64 x86_64 GNU/Linux
I'm using the gfxG05 driver for an NVIDIA GeForce GTX 1660 SUPER.
-- Roger Oberholtzer
On Monday, 9 August 2021, 14:31:06 CEST, Roger Oberholtzer wrote:
How odd. With this combination I cannot get X to start, but with 5.12.3 for kernel and NVIDIA all is okay.
This is the last working combo.
modinfo nvidia
filename: /lib/modules/5.12.3-1-default/updates/nvidia.ko
uname -a
Linux acme 5.12.3-1-default #1 SMP Wed May 12 09:01:49 UTC 2021 (25d4ec7) x86_64 x86_64 x86_64 GNU/Linux
How odd. This is TW 20210805:
mathias@mio:~> sudo uname -a
Linux mio 5.13.6-1-default #1 SMP Thu Jul 29 04:18:38 UTC 2021 (2d7b44d) x86_64 x86_64 x86_64 GNU/Linux
mathias@mio:~> sudo modinfo nvidia|grep file
filename: /lib/modules/5.13.6-1-default/updates/nvidia.ko
mathias@mio:~> inxi -G
Graphics:
  Device-1: Intel TigerLake-LP GT2 [Iris Xe Graphics] driver: i915 v: kernel
  Device-2: NVIDIA TU117M [GeForce GTX 1650 Ti Mobile] driver: nvidia v: 470.57.02
  Device-3: Acer HD Camera type: USB driver: uvcvideo
  Display: x11 server: X.Org 1.20.13 driver: loaded: modesetting,nvidia resolution: 1920x1080
  OpenGL: renderer: Mesa Intel Xe Graphics (TGL GT2) v: 4.6 Mesa 21.1.6
mathias@mio:~> gamemoderun inxi -G
gamemodeauto:
gamemodeauto:
gamemodeauto:
Graphics:
  Device-1: Intel TigerLake-LP GT2 [Iris Xe Graphics] driver: i915 v: kernel
  Device-2: NVIDIA TU117M [GeForce GTX 1650 Ti Mobile] driver: nvidia v: 470.57.02
  Device-3: Acer HD Camera type: USB driver: uvcvideo
  Display: x11 server: X.Org 1.20.13 driver: loaded: modesetting,nvidia resolution: 1920x1080
  OpenGL: renderer: NVIDIA GeForce GTX 1650 Ti with Max-Q Design/ PCIe/SSE2 v: 4.6.0 NVIDIA 470.57.02
--
Mathias Homann
Mathias.Homann@openSUSE.org
OBS: lemmy04
Jabber (XMPP): lemmy@tuxonline.tech
IRC: [Lemmy] on freenode and ircnet (bouncer active)
telegram: https://telegram.me/lemmy98
keybase: https://keybase.io/lemmy
gpg key fingerprint: 8029 2240 F4DD 7776 E7D2 C042 6B8E 029E 13F2 C102
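The comparison being made by eye in these outputs (does the nvidia module that modinfo finds live under the directory of the running kernel?) could be scripted roughly as follows. This is a sketch only; `module_kernel_release` is a made-up helper name, and the optional `/usr` prefix is there on the assumption that some systems report the usr-merged path.

```shell
#!/bin/sh
# Hypothetical helper: given a module path as printed by `modinfo -n <module>`,
# print the kernel release directory that the module lives under.
module_kernel_release() {
    # Accepts /lib/modules/<release>/... and /usr/lib/modules/<release>/...
    printf '%s\n' "$1" | sed -n 's|^\(/usr\)\{0,1\}/lib/modules/\([^/]*\)/.*|\2|p'
}

# Usage on the affected machine:
#   if [ "$(module_kernel_release "$(modinfo -n nvidia)")" = "$(uname -r)" ]; then
#       echo "nvidia module matches the running kernel"
#   else
#       echo "nvidia module belongs to a different kernel (or was not found)"
#   fi
```

With the paths shown in this thread, the helper would report 5.12.3-1-default for the first system and 5.13.6-1-default for the second, each matching its own `uname` output.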
On 8/9/21 7:31 AM, Roger Oberholtzer wrote:
How odd. With this combination I cannot get X to start, but with 5.12.3 for kernel and NVIDIA all is okay.
This is the last working combo.
I have one system with an nVidia card. I just updated it from kernel 5.13.4 to 5.13.6. The new VirtualBox module is open source, thus it is part of the regular build. I do not usually watch the updates on that machine, but I did so today. When zypper installed the new kernel, a build of the nVidia module followed. That was essentially the same build that would have been used with the .run method, but without the clunky, non-X screens. The system rebooted with X running normally.
Larry
I'm using GeForce GTX 1650 SUPER (on AMD FX-6300 ASUS M5A97 EVO).
When I've had problems in the past, sometimes a forced reinstall of the module or the entire set of nvidia installables sorted things out. From my admin notes:
- Nvidia - in the event the kernel module needs a manual rebuild:
  - run
      zypper in -f nvidia-gfxG05-kmp-default
  - if that's not enough, force a reinstall of all nvidia repo packages:
      zypper --no-refresh se --installed-only -r NVIDIA | awk '$1 == "i+" { print $3 }' | xargs zypper in --force
    where NVIDIA is the name of the nvidia repo. This should result in something like:
      zypper in --force nvidia-computeG05 nvidia-gfxG05-kmp-default nvidia-glG05 x11-video-nvidiaG05
(There's probably a simpler way to do this.)
The last time I had any issues it was due to the usr merge. See
https://en.opensuse.org/openSUSE:Usr_merge
I still have the workaround link noted at the above page in place:
cd /usr; ln -s . /usr/usr
But I don't think the workaround should be necessary anymore.
Michael
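The awk step in the admin notes above relies on whitespace splitting and on the status being exactly "i+". A slightly more forgiving variant, splitting on the "|" column separator and accepting both "i" and "i+", might look like this. It is a sketch: `installed_names` is a hypothetical helper, and the table layout it expects is based on how I understand `zypper se` output to look, so verify against your own output before piping it into a reinstall.

```shell
#!/bin/sh
# Hypothetical helper: read a `zypper se` style table on stdin and print the
# names of rows whose status column is "i" or "i+".
installed_names() {
    awk -F'|' '
        {
            status = $1; name = $2
            # Trim the padding around both columns.
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (status == "i" || status == "i+") print name
        }
    '
}

# Usage (NVIDIA is the repo alias, as in the notes above):
#   zypper --no-refresh se --installed-only -r NVIDIA | installed_names
# Review the list first, then feed it to the reinstall:
#   zypper --no-refresh se --installed-only -r NVIDIA | installed_names | xargs zypper in --force
```

Header and separator rows are skipped because their first column is neither "i" nor "i+".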
As I wrote elsewhere in this thread, after further investigation, I think my problem is actually this:
https://forums.opensuse.org/showthread.php/543847-nvidia-gpu-i2c-timeout-err...
So now I need to see when the patch that is mentioned gets in the Tumbleweed kernel.
-- Roger Oberholtzer
Perhaps no longer interesting, but I updated my Tumbleweed system and now the latest kernel and nvidia drivers are working. No need to boot with an older kernel. I know it makes no sense, but I'm convinced that the current Tumbleweed is running faster on my system. Or at least the KDE desktop seems snappier. Apps seem to be starting faster. No idea if this is actually the case...
-- Roger Oberholtzer
On Monday, 9 August 2021, 11:26:21 CEST, Roger Oberholtzer wrote:
I have been wanting to update my Tumbleweed system. However, when I last tried to do so a week or so ago the NVIDIA kernel driver was out of whack.
I see that the current Tumbleweed kernel is 5.13.4, while the current NVIDIA kernel module is 5.13.2. So I guess things are not quite working again yet. Right?
I've been running TW with extremely frequent updates on two systems with nvidia graphics since spring, and my main (leisure) use case is games... and so far I haven't encountered any problems.
Cheers
MH
-- Mathias Homann
I think my problem is actually this:
https://forums.opensuse.org/showthread.php/543847-nvidia-gpu-i2c-timeout-err...
It references a kernel patch. The question is whether it is in the Tumbleweed kernel 5.13.4 or 5.13.6.
In the setup that works, I see this in the boot log:
Aug 09 14:21:35 acme kernel: nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
Aug 09 14:21:35 acme kernel: nvidia: loading out-of-tree module taints kernel.
Aug 09 14:21:35 acme kernel: nvidia: module license 'NVIDIA' taints kernel.
Aug 09 14:21:35 acme kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
Aug 09 14:21:35 acme kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 240
Aug 09 14:21:35 acme kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
Aug 09 14:21:35 acme kernel: nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
Aug 09 14:21:35 acme kernel: nvidia-uvm: Loaded the UVM driver, major device number 238.
Aug 09 14:21:35 acme kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 470.57.02 Tue Jul 13 16:06:24 UTC 2021
Aug 09 14:21:35 acme kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
Aug 09 14:21:36 acme kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
Aug 09 14:21:37 acme kernel: nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
Aug 09 14:21:39 acme kernel: audit: type=1400 audit(1628511698.780:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=1222 comm="apparmor_parser"
Aug 09 14:21:39 acme kernel: audit: type=1400 audit(1628511698.780:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=1222 comm="apparmor_parser"
Aug 09 14:24:15 acme dbus-daemon[2059]: [session uid=1000 pid=2059] Activating via systemd: service name='org.gtk.vfs.Daemon' unit='gvfs-daemon.service' requested by ':1.2' (uid=1000 pid=2076 comm="nvidia-settings --load-config-only ")
Aug 09 14:24:15 acme dbus-daemon[2059]: [session uid=1000 pid=2059] Activating via systemd: service name='org.a11y.Bus' unit='at-spi-dbus-bus.service' requested by ':1.9' (uid=1000 pid=2076 comm="nvidia-settings --load-config-only ")
In the one that does not, I see this:
Aug 09 14:14:59 acme kernel: nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
Aug 09 14:15:01 acme kernel: nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
Aug 09 14:15:03 acme kernel: audit: type=1400 audit(1628511303.468:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=1209 comm="apparmor_parser"
Aug 09 14:15:03 acme kernel: audit: type=1400 audit(1628511303.468:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=1209 comm="apparmor_parser"
So perhaps the problem is before the nvidia driver is even loaded.
-- Roger Oberholtzer
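The difference between the two boot logs above (the i2c timeout shows up in both, but the nvidia-modeset load message only in the working one) can be checked mechanically. A minimal sketch, assuming the kernel messages are piped in on stdin; `nvidia_boot_summary` is a hypothetical name, and the two patterns are simply taken from the log lines quoted in this thread:

```shell
#!/bin/sh
# Hypothetical helper: summarise NVIDIA-related kernel messages read from stdin.
nvidia_boot_summary() {
    awk '
        /nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver/ { loaded = 1 }
        /nvidia-gpu .*i2c timeout error/                            { i2c = 1 }
        END {
            print "nvidia driver loaded: " (loaded ? "yes" : "no")
            print "i2c timeout seen: " (i2c ? "yes" : "no")
        }
    '
}

# Usage:
#   journalctl -k -b | nvidia_boot_summary      # current boot
#   journalctl -k -b -1 | nvidia_boot_summary   # previous boot
```

Running it against both a working and a failing boot gives a quick side-by-side of whether the proprietary module was loaded at all.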
participants (5)
- Andrei Borzenkov
- Larry Finger
- Mathias Homann
- Michael Hamilton
- Roger Oberholtzer