[Bug 802624] New: Switching from nvidia-gfxG02 to nvidia-gfxG03 leads to the nouveau driver being used
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c0

Summary: Switching from nvidia-gfxG02 to nvidia-gfxG03 leads to the nouveau driver being used
Classification: openSUSE
Product: openSUSE 12.2
Version: Final
Platform: x86-64
OS/Version: openSUSE 12.2
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Other
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: robert.munteanu@gmail.com
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0

I noticed that the nvidia-gfxG03 driver is available, so I fired up YaST, removed all the nvidia-gfxG02 references and added the nvidia-gfxG03 replacements. After a reboot GNOME started in fallback mode, and I noticed that nouveau was in use.

The /lib/modules/3.4.6-2.10-desktop/updates/nvidia.ko file was in place; manually insmod-ing it loaded the driver, and after a logout the new module was in use.

What I did notice, though (I'm running kernel 3.4.11-2.16-desktop), is that there is no symlink for the driver in /lib/modules/3.4.11-2.16-desktop/weak-updates/updates.

Reproducible: Didn't try

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c1
Carl Fletcher
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c2
--- Comment #2 from Robert Munteanu
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c3
--- Comment #3 from Robert Munteanu
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c4
Carl Fletcher
> Some more information
> * it's not the nouveau driver that's being loaded,

Well, that's what I see as loaded, regardless of whether I use nvidia 02 or 03.
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c5
Wolfgang Bauer
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c6
--- Comment #6 from Carl Fletcher
> I had the same problem today.
> For some reason, Yast first installed G03 and afterwards removed G02 including the nouveau blacklist. Therefore on reboot the nouveau kernel module was loaded which prevented the loading of the nvidia driver.
> Workaround:
> - Remove the driver completely with Yast
> - Start Yast a second time and install your driver of choice
> Now everything should work again...

This does work. It seemed to drag in some unnecessary crud in the process. Something is pretty wrong IMO. This is just a workaround.
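The workaround quoted above is performed in YaST. As a rough command-line equivalent, the two separate steps might be sketched as below. This is a dry run that only prints commands; the helper name `print_switch_commands` is hypothetical, the package names follow the pattern seen elsewhere in this thread, and the `-desktop` flavor is an assumption that has to match the installed kernel.

```shell
# Dry-run sketch: prints the two-step workaround as zypper commands instead
# of executing them. Run the printed commands one after the other, as root.
# Assumption: the kernel flavor is "desktop"; adjust the KMP name otherwise.
print_switch_commands() {
    old=$1   # generation to remove first, e.g. G02
    new=$2   # generation to install afterwards, e.g. G03
    echo "zypper remove nvidia-gfx${old}-kmp-desktop x11-video-nvidia${old} nvidia-compute${old}"
    echo "zypper install nvidia-gfx${new}-kmp-desktop x11-video-nvidia${new} nvidia-compute${new}"
}

# Usage: print_switch_commands G02 G03
```

The point of keeping removal and installation in two transactions is that the old packages' uninstall scripts then run before the new packages are installed, not after.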
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c7
--- Comment #7 from Wolfgang Bauer
> It seemed to drag in some unnecessary crud in the process.

That didn't happen here; only nvidia-computeG03, x11-video-nvidiaG03 and nvidia-gfxG03-kmp-desktop got installed. What unnecessary crud do you mean, btw.? If it's gcc and its dependencies, then that's needed because the kernel module is now compiled on installation.

> Something is pretty wrong IMO This is just a workaround

Well, AFAICT the only thing that does not work is switching directly from G02 to G03 or vice versa. If you uninstall first and then install the other one, everything works as expected. It did for me at least.
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c8
--- Comment #8 from Robert Munteanu
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c
Du Weihua
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c9
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c10
Robert Munteanu
> Seems you now have nvidia-gfxG03-kmp and nvidia-gfxG02-kmp installed, which results in that issue.
> Could you confirm? Please add the output of 'rpm -qa | grep nvidia'.

$ rpm -qa | grep nvidia
nvidia-gfxG03-kmp-desktop-310.32_k3.4.6_2.10-5.1.x86_64
x11-video-nvidiaG03-310.32-5.1.x86_64
nvidia-computeG03-310.32-5.1.x86_64

I cleaned up my installation quite some time ago, but I don't recall seeing multiple driver versions installed at the same time.
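The question here is whether packages from two driver generations are installed side by side. A minimal sketch of that check, assuming the generation always appears in the package name as "G" followed by two digits (as it does for every package named in this thread); the function names are made up for illustration:

```shell
# Reads package names on stdin (for example from `rpm -qa | grep nvidia`)
# and prints each distinct driver generation (G02, G03, ...) once.
nvidia_generations() {
    sed -n 's/.*\(G[0-9][0-9]\).*/\1/p' | sort -u
}

# Succeeds (exit status 0) when stdin mentions more than one generation.
has_mixed_generations() {
    count=$(nvidia_generations | wc -l | tr -d ' ')
    [ "$count" -gt 1 ]
}

# Usage:
#   rpm -qa | grep nvidia | has_mixed_generations && echo "mixed G02/G03 install"
```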
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c11
Stefan Dirsch
> $ rpm -qa | grep nvidia
> nvidia-gfxG03-kmp-desktop-310.32_k3.4.6_2.10-5.1.x86_64
> x11-video-nvidiaG03-310.32-5.1.x86_64
> nvidia-computeG03-310.32-5.1.x86_64

And with that you still see the issue? Please also check the output of

rpm -V $(rpm -qa | grep nvidia)
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c12
Robert Munteanu
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c13
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c14
Wolfgang Bauer
> Ok. Although the workaround works, I believe the explanation is wrong. Most likely you've ended up in having the wrong kernel module installed and loaded (G03 one) and still running the G02 X driver. The file for blacklisting the nouveau module is in both packages, so this would not have been deleted by RPM.

I reproduced the issue just now. Yes, you are right, the blacklist file does not get removed (that was a guess, sorry). But I now know what is going wrong: as I wrote in comment #5, YaST _first_ installs the new packages and _then_ removes the old ones. When the old packages are removed, the link from /lib/modules/3.4.28-2.20-desktop/weak-updates/updates/nvidia.ko to /lib/modules/3.4.6-2.10-desktop/updates/nvidia.ko (where the kernel module is installed) is removed as well. Therefore the system can no longer find the nvidia kernel module, and the nvidia driver cannot be loaded.

On a side note: when I select the X11 driver for installation, the kernel module for kernel-default gets selected automatically although I am running kernel-desktop. I have to select the right kernel module manually.
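The ordering problem described in this comment can be replayed in a scratch directory. The sketch below is a simulation only: the function names and the scratch layout are hypothetical, no real package scripts are involved, and the kernel versions are simply the ones quoted in the comment.

```shell
# Simulation of "install new KMP first, remove old KMP afterwards".
# Pass an absolute scratch directory as ROOT; nothing outside it is touched.

BUILD_KVER=3.4.6-2.10-desktop    # kernel the KMP was built against
RUN_KVER=3.4.28-2.20-desktop     # kernel that is actually running

# Stand-in for the new KMP's install step: place the module and link it
# into the running kernel's weak-updates directory.
new_kmp_post() {
    root=$1
    mkdir -p "$root/lib/modules/$BUILD_KVER/updates" \
             "$root/lib/modules/$RUN_KVER/weak-updates/updates"
    : > "$root/lib/modules/$BUILD_KVER/updates/nvidia.ko"
    ln -sf "$root/lib/modules/$BUILD_KVER/updates/nvidia.ko" \
           "$root/lib/modules/$RUN_KVER/weak-updates/updates/nvidia.ko"
}

# Stand-in for the old KMP's uninstall step: removes the weak-updates link
# unconditionally, although it now points at the new package's module.
old_kmp_postun() {
    root=$1
    rm -f "$root/lib/modules/$RUN_KVER/weak-updates/updates/nvidia.ko"
}

# Prints "found" if the running kernel can still reach nvidia.ko,
# otherwise "missing".
module_state() {
    root=$1
    if [ -e "$root/lib/modules/$RUN_KVER/weak-updates/updates/nvidia.ko" ]; then
        echo found
    else
        echo missing
    fi
}
```

Calling `new_kmp_post`, then `old_kmp_postun`, then `module_state` on the same scratch directory reproduces the reported end state: the module file still exists under the build kernel's directory, but the running kernel has no link to it.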
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c15
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c16
--- Comment #16 from Wolfgang Bauer
> Yeah. Seems in %postun of the old KMP the weak-update symlink to the installed kernel module gets removed. The issue is, that the %postun of the replaced KMP runs after %post of the new KMP. How to address that? No idea right now.

Since %postun is run _after_ the packaged files are removed, you could maybe check whether nvidia.ko exists and only remove the link if it doesn't? (Just an idea though, not sure if that would work...)

> But then we would have run into this issue each time one just updated the KMP. But we didn't, did we?

Well, on package updates %postun is not run, so this issue cannot happen.
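The idea floated in this comment (only remove the link if nvidia.ko no longer exists) could look roughly like the following. This is a hypothetical guard for illustration, not code from the KMP framework, and the function name is invented.

```shell
# Removes a weak-updates symlink only when it is dangling, i.e. when the
# module it points to has already been deleted.
# Prints "removed" or "untouched" so callers can see what happened.
remove_link_if_dangling() {
    link=$1
    # -L: the path is a symlink; ! -e: following it leads to nothing.
    if [ -L "$link" ] && [ ! -e "$link" ]; then
        rm -f "$link"
        echo removed
    else
        echo untouched
    fi
}
```

With such a guard, a link that already points at the module of a newly installed package would survive the removal of the old package.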
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c17
--- Comment #17 from Stefan Dirsch
(In reply to comment #15)
> > Yeah. Seems in %postun of the old KMP the weak-update symlink to the installed kernel module gets removed. The issue is, that the %postun of the replaced KMP runs after %post of the new KMP. How to address that? No idea right now.
> Since %postun is run _after_ the packaged files are removed, you could maybe check if nvidia.ko exists and only remove the link if it doesn't? (Just an idea though, not sure if that would work...)

Well, the code that does this is hardcoded in the KMP framework. :-(

> > But then we would have run into this issue each time one just updated the KMP. But we didn't, did we?
> Well, on package updates %postun is not run, so this issue cannot happen.

Sure it *is* running.
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c18
--- Comment #18 from Wolfgang Bauer
> Sure it *is* running.

When "updating" to the same version, %postun is not run (that's what I tried before writing this, and I thought this was always the case). But yes, on an update to a different version it _is_ run (after %post).

Anyway, I never had an issue during an update, only when switching between G02 and G03. So it seems /usr/lib/module-init-tools/weak-modules2 handles the update case correctly. But I guess in the switching case it gets confused because the file names of the kernel modules are the same but the package names are different (nvidia-gfxG02 vs. nvidia-gfxG03)...

So maybe this should be reported as a bug against module-init-tools then?
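One way to see the difference between an update and a switch from inside a scriptlet: RPM passes %postun the number of instances of that package that remain installed, which is 0 on erase and 1 or more on upgrade. A minimal sketch, with an invented function name and labels; that the replaced G02 KMP sees 0 is an inference from the package names being different, not something stated in this thread:

```shell
# Classifies a %postun invocation from its first argument ($1 in the
# scriptlet): the count of instances of this package left on the system.
postun_kind() {
    remaining=$1
    if [ "$remaining" -eq 0 ]; then
        echo erase      # last instance removed, e.g. a KMP replaced by a differently named one
    else
        echo upgrade    # another version of the same package remains
    fi
}

# In a spec file this would be called as:  postun_kind "$1"
```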
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c19
--- Comment #19 from Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c20
--- Comment #20 from Wolfgang Bauer
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c21
--- Comment #21 from Wolfgang Bauer
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c22
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c23
--- Comment #23 from Stefan Dirsch
> Another thing I just noticed: When _no_ nvidia package is installed and I select x11-video-nvidiaG03 for installation, BOTH nvidia-gfxG02-kmp-default AND nvidia-gfxG03-default get selected automatically for installation (and kernel-default).

I don't see this issue (with only kernel-desktop installed). But I am testing on x86_64.
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c24
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c26
--- Comment #26 from Armin Herbert
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c27
--- Comment #27 from Michal Marek
> For some reason I don't see the issues when just updating nvidia-gfxG03-kmp-desktop package. I would have expected the same issue when looking at the %preun/%postun scripts.

Did you test with the kernel that the KMP was built against? In this case the module is installed below /lib/modules/.../extra or .../updates and this is handled by rpm. Admittedly, rpm is a lot smarter than the weak-modules2 script.

> Michal, could it be that wm2 removes the symlink although the kernel module still exists, but only does that if it belongs to a different package? This would explain the behaviour.

Yes. weak-modules2 --remove-kmp always removes all the symlinks and then looks for the newest among the other versions of the KMP to create new symlinks. I.e., package renames are not handled. I will change it to also consider packages that contain the same set of modules, which should handle the nvidia case. It will not handle the case where you rename a KMP _and_ add or remove modules at the same time.
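The "same set of modules" test proposed here can be illustrated with a small comparison of two package file lists. This is a hypothetical helper, not the weak-modules2 implementation; it assumes each list is a text file with one installed path per line, for instance saved from `rpm -ql <package>`.

```shell
# Prints the sorted, de-duplicated kernel module file names (*.ko) that
# appear in a file list.
module_names() {
    sed -n 's|.*/\([^/]*\.ko\)$|\1|p' "$1" | sort -u
}

# Succeeds when both file lists ship exactly the same, non-empty set of
# module names, regardless of which package the lists came from.
same_module_set() {
    a=$(module_names "$1")
    b=$(module_names "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage:
#   rpm -ql nvidia-gfxG02-kmp-desktop > g02.list
#   rpm -ql nvidia-gfxG03-kmp-desktop > g03.list
#   same_module_set g02.list g03.list && echo "treat as the same KMP"
```

As noted in the comment, a comparison like this stops matching as soon as a renamed package also adds or drops a module.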
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c
Michal Marek
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c28
--- Comment #28 from Stefan Dirsch
First of all, I'm sorry for the delay.
(In reply to comment #22)
> > For some reason I don't see the issues when just updating nvidia-gfxG03-kmp-desktop package. I would have expected the same issue when looking at the %preun/%postun scripts.
> Did you test with the kernel that the KMP was built against?

No, the kernel has been updated meanwhile. IIRC updating the kernel was necessary to reproduce the issue.

> In this case the module is installed below /lib/modules/.../extra or .../updates and this is handled by rpm. Admittedly, rpm is lot smarter than the weak-modules2 script.
> > Michal, could it be that wm2 removes the symlink although the kernel module still exists, but only does that if it belongs to a different package? This would explain the behaviour.
> Yes. weak-modules2 --remove-kmp always removes all the symlinks, and then looks for the newest among other versions of the kmp to create new symlinks. I.e. package renames are not handled. I will change it to also consider packages that contain the same set modules, which should handle the nvidia case. It will not handle the case when you rename a KMP _and_ add or remove modules at the same time.

Thanks. That would be great, and it covers this issue perfectly, since the package contains only one kernel module, "nvidia.ko".
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c29
--- Comment #29 from Wolfgang Bauer
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c30
--- Comment #30 from Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c31
--- Comment #31 from Wolfgang Bauer
(In reply to comment #21)
> > Another thing I just noticed: When _no_ nvidia package is installed and I select x11-video-nvidiaG03 for installation, BOTH nvidia-gfxG02-kmp-default AND nvidia-gfxG03-default get selected automatically for installation (and kernel-default).
> I don't see this issue (with only kernel-desktop installed). But I am testing on x86_64.

Just for your information: I found out why this happens on my system (at least why nvidia-gfxG03-default gets selected instead of *-desktop; the G02 module doesn't get selected anymore, but I guess the reason was related):

When I uninstall the nvidia packages, they get added to /var/lib/zypp/SoftLocks. If I remove that file before I try to install x11-video-nvidiaG03 again, the correct KMP gets selected (nvidia-gfxG03-kmp-desktop)!

I think that because of the soft lock on nvidia-gfxG03-kmp-desktop, the solver prefers to install nvidia-gfxG03-kmp-default, even though kernel-desktop is installed and not kernel-default. And I think that's intended behaviour of the solver, because otherwise a recommended package that you uninstalled would get installed again automatically some time later. But at least the result is unexpected in this case IMHO.

Anyway, that's no bug in the nvidia driver packages...
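To check whether a given package is affected by such a soft lock without deleting the whole file, something like the sketch below might help. It rests on an assumption that should be verified against the installed libzypp: that the soft-locks file holds one package name per line, with lines starting with '#' being comments. The function name is hypothetical.

```shell
# Succeeds when PKG is listed in the soft-locks file.
# The second argument is optional and defaults to the path named in this
# thread; it exists mainly so the function can be tried on a copy.
is_soft_locked() {
    pkg=$1
    file=${2:-/var/lib/zypp/SoftLocks}
    [ -r "$file" ] || return 1
    # Drop comment lines, then look for an exact whole-line match.
    grep -v '^#' "$file" | grep -qxF -e "$pkg"
}

# Usage:
#   is_soft_locked nvidia-gfxG03-kmp-desktop && echo "soft-locked"
```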
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c32
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c33
--- Comment #33 from Wolfgang Bauer
> When trying to install nvidia-computeG03 and x11-video-nvidiaG03 with already installed nvidia-computeG02 and x11-video-nvidiaG02 packages this fails due to file conflicts (which is good).

That's not true. YaST/zypper will happily install both side by side, and there won't be a file conflict error because libzypp uses "rpm --force" to install packages, AFAIK.

(In reply to comment #31)
> the G02 module doesn't get selected anymore but I guess the reason was related

I could reproduce the issue that BOTH G02 and G03 kernel modules get installed. This happens when I try to install the G03 driver with SoftLocks on nvidia-gfxG02-kmp-desktop and nvidia-gfxG03-kmp-desktop. Then BOTH nvidia-gfxG02-kmp-default and nvidia-gfxG03-kmp-default get selected (with kernel-desktop installed).
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c34
--- Comment #34 from Wolfgang Bauer
> This happens when I try to install the G03 driver with SoftLocks on nvidia-gfxG02-kmp-desktop and nvidia-gfxG03-kmp-desktop. Then BOTH nvidia-gfxG02-kmp-default and nvidia-gfxG03-kmp-default get selected (with kernel-desktop installed).

Sorry, that was a typo. It should read "This happens when I try to install the G02 driver".
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c35
--- Comment #35 from Stefan Dirsch
Ok. Let's keep this one open until Michal confirms that he has adjusted the weak-modules2 script.

Michal?
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c36
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=802624
https://bugzilla.novell.com/show_bug.cgi?id=802624#c37
--- Comment #37 from Michal Marek