Unable to load nvidia-gfxG05-kmp-default-470.129.06: Unknown symbol __x86_return_thunk
Hi,

I just updated the NVIDIA drivers (without checking that I have backups of the old RPMs :-( ) and now the driver won't load because of:

    nvidia: Unknown symbol __x86_return_thunk (err -2)

1. How do I fix this? This string isn't used anywhere in the sources:

    $ grep -r thunk /usr/src/kernel-modules/nvidia-470.129.06-default
    /usr/src/kernel-modules/nvidia-470.129.06-default/common/inc/nv-retpoline.h: ".weak __x86_indirect_thunk_" #REG ";" \
    /usr/src/kernel-modules/nvidia-470.129.06-default/common/inc/nv-retpoline.h: ".type __x86_indirect_thunk_" #REG ", @function;" \
    /usr/src/kernel-modules/nvidia-470.129.06-default/common/inc/nv-retpoline.h: "__x86_indirect_thunk_" #REG ":" \
    /usr/src/kernel-modules/nvidia-470.129.06-default/common/inc/nv-retpoline.h: ".size __x86_indirect_thunk_" #REG ", .-__x86_indirect_thunk_" #REG)
    /usr/src/kernel-modules/nvidia-470.129.06-default/common/inc/nv-retpoline.h: ".weak __x86_indirect_thunk_" #REG ";" \
    /usr/src/kernel-modules/nvidia-470.129.06-default/common/inc/nv-retpoline.h: ".type __x86_indirect_thunk_" #REG ", @function;" \
    /usr/src/kernel-modules/nvidia-470.129.06-default/common/inc/nv-retpoline.h: "__x86_indirect_thunk_" #REG ":" \
    /usr/src/kernel-modules/nvidia-470.129.06-default/common/inc/nv-retpoline.h: ".size __x86_indirect_thunk_" #REG ", .-__x86_indirect_thunk_" #REG)
    Binary file /usr/src/kernel-modules/nvidia-470.129.06-default/nvidia/nv-kernel.o_binary matches
    Binary file /usr/src/kernel-modules/nvidia-470.129.06-default/nvidia-modeset/nv-modeset-kernel.o_binary matches

2. Is there a place where I can find version 470.103.01 because that one still worked? Or would you recommend that I downgrade the kernel 5.3.18?

Regards,

--
Aaron "Optimizer" Digulla a.k.a. Philmann Dark
"It's not the universe that's limited, it's our imagination. Follow me and I'll show you something beyond the limits."
http://blog.pdark.de/
* Aaron Digulla <digulla@hepe.com> [07-26-22 12:41]:
> 2. Is there a place where I can find version 470.103.01 because that one still worked? Or would you recommend that I downgrade the kernel 5.3.18?
Don't "downgrade", but boot the previous working kernel and use it until there is an nvidia driver update, or install the NVIDIA-Linux-x86_64-470.103.01.run package from
https://us.download.nvidia.com/XFree86/Linux-x86_64/470.103.01/NVIDIA-Linux-...

I generally keep three previous sets of the nvidia drivers for cases such as the one you describe.

--
(paka)Patrick Shanahan Plainfield, Indiana, USA @ptilopteri
http://en.opensuse.org openSUSE Community Member facebook/ptilopteri
Photos: http://wahoo.no-ip.org/piwigo paka @ IRCnet oftc
Hi Patrick, Thanks for the answer.
> * Aaron Digulla <digulla@hepe.com> [07-26-22 12:41]:
>> 2. Is there a place where I can find version 470.103.01 because that one still worked? Or would you recommend that I downgrade the kernel 5.3.18?
> don't "downgrade", but boot the previous working kernel and use it until there is an nvidia driver update,
Okay, it took some digging, but I'm starting to understand what you mean.

According to snapper, nvidia 470.129 was first installed on 2022-06-12. In /.snapshots/345/snapshot/boot, I can see a 5.14...22 kernel (vmlinux-5.14.21-150400.22-default.gz) appear on that date. That combination worked until Friday.

    $ rpm -qa kernel-default --last
    kernel-default-5.14.21-150400.24.11.1.x86_64 Tue 26 Jul 2022 18:28:38 CEST
    kernel-default-5.14.21-150400.22.1.x86_64 Sun 12 Jun 2022 12:46:51 CEST

Looks like I got a new 5.14...24 kernel today. That one seems to break the NVIDIA driver.

The thing that I don't understand yet is this:

    $ uname -a
    Linux silent5 5.14.21-150400.22-default #1 SMP PREEMPT_DYNAMIC

    -rw-r--r-- 1 root root 16548932 May 12 02:35 /boot/vmlinux-5.14.21-150400.22-default.gz

So I'm still running the old kernel, but with the last zypper up, something broke? Maybe GCC? Why is there a "PREEMPT_DYNAMIC" in uname -a?

Hmmm... there is a symtypes-5.3.18-150300.59.68-preempt.gz that was removed in one of the latest snapshots, but no vmlinux* for it. yast told me that kernel-preempt-devel was installed for some reason and offered to delete it. Maybe that was it... reboot... *CONNECTION LOST*

Regards,

--
Aaron "Optimizer" Digulla a.k.a. Philmann Dark
On 26.07.22 at 20:33, Aaron Digulla wrote:

Update: There were two problems.

The first one was that I had kernel-preempt-devel installed. For some reason, building the nvidia driver used those headers instead of the ones of the running kernel.

I assume that booting the kernel doesn't change the link /usr/src/linux. Correct?

The next thing was that I tried to boot the old 5.14.21...22 kernel. That worked, but when I compiled the nvidia driver, it used the kernel headers for 5.14.21...24. Those contain the mysterious symbol __x86_return_thunk in

    /usr/src/linux-5.14.21-150400.24.11/arch/x86/include/asm/linkage.h
    /usr/src/linux-5.14.21-150400.24.11/arch/x86/include/asm/nospec-branch.h
    /usr/src/linux-5.14.21-150400.24.11/arch/x86/include/asm/static_call.h
    /usr/src/linux-5.14.21-150400.24.11-obj/x86_64/default/Module.symvers

Sidenote: When building the driver with the kernel headers for 5.14.21...24 and running the same kernel, the build process still reports that those symbols are missing:

    depmod: WARNING: //lib/modules/5.14.21-150400.22-default/updates/nvidia-modeset.ko needs unknown symbol __x86_return_thunk
    depmod: WARNING: //lib/modules/5.14.21-150400.22-default/updates/nvidia-peermem.ko needs unknown symbol __x86_return_thunk
    depmod: WARNING: //lib/modules/5.14.21-150400.22-default/updates/nvidia-drm.ko needs unknown symbol __x86_return_thunk
    depmod: WARNING: //lib/modules/5.14.21-150400.22-default/updates/nvidia-uvm.ko needs unknown symbol __x86_return_thunk
    depmod: WARNING: //lib/modules/5.14.21-150400.22-default/updates/nvidia.ko needs unknown symbol __x86_return_thunk
    Warning: /lib/modules/5.14.21-150400.22-default is inconsistent

The reason is that these warnings are for the .22 kernel (not the running .24 kernel). So this is a bit confusing.

Anyway, the system is running again. Thanks to Patrick Shanahan for putting me on the right track.

Regards,

--
Aaron "Optimizer" Digulla a.k.a. Philmann Dark
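The depmod warnings in the message above are easier to read once they are grouped by the kernel tree named in each path. The following is a minimal sketch, assuming the warning lines look exactly like the ones quoted in this thread; the function name `group_unknown_symbols` and the regular expression are illustrative and not part of depmod or the NVIDIA installer.

```python
import re
from collections import defaultdict

# Matches lines shaped like the ones quoted in this thread:
#   depmod: WARNING: //lib/modules/<kernel>/updates/nvidia.ko needs unknown symbol <sym>
WARNING_RE = re.compile(
    r"depmod: WARNING: /*lib/modules/(?P<kernel>[^/\s]+)/(?P<path>\S+\.ko)"
    r" needs unknown symbol (?P<symbol>\S+)"
)


def group_unknown_symbols(depmod_output):
    """Map kernel release -> symbol -> sorted list of module file names."""
    grouped = defaultdict(lambda: defaultdict(set))
    for match in WARNING_RE.finditer(depmod_output):
        module = match.group("path").rsplit("/", 1)[-1]
        grouped[match.group("kernel")][match.group("symbol")].add(module)
    return {
        kernel: {symbol: sorted(modules) for symbol, modules in symbols.items()}
        for kernel, symbols in grouped.items()
    }


# Two of the warning lines from the message above.
sample = (
    "depmod: WARNING: //lib/modules/5.14.21-150400.22-default/updates/"
    "nvidia-modeset.ko needs unknown symbol __x86_return_thunk\n"
    "depmod: WARNING: //lib/modules/5.14.21-150400.22-default/updates/"
    "nvidia.ko needs unknown symbol __x86_return_thunk\n"
)

result = group_unknown_symbols(sample)
for kernel, symbols in result.items():
    for symbol, modules in symbols.items():
        print(kernel, symbol, ", ".join(modules))
```

Grouping this way makes it visible at a glance that every warning refers to the 5.14.21-150400.22-default tree rather than to the running kernel.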
* Aaron Digulla <digulla@hepe.com> [07-26-22 15:01]:
> The reason is that this is for the .22 kernel (not the running .24 kernel).
> So this is a bit confusing.
iiuc, when building, *installed* kernels are concerned rather than *running*, and when booting an older kernel, the nvidia driver installed for that kernel is retained.

I hope that is sufficient explanation.

--
(paka)Patrick Shanahan
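Since the build looks at installed kernel trees rather than the running kernel, one way to see which tree could satisfy `__x86_return_thunk` is to look the symbol up in each tree's Module.symvers. The sketch below assumes the usual tab-separated layout with the symbol name in the second field; the helper name `exported_symbols` is made up for illustration, and the two excerpts are hypothetical with placeholder CRC values, not data from this system.

```python
def exported_symbols(symvers_text):
    """Collect symbol names from Module.symvers-style text.

    Assumption: each line is tab-separated and the symbol name is the
    second field (CRC first, then symbol, then the exporting module).
    """
    symbols = set()
    for line in symvers_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1]:
            symbols.add(fields[1])
    return symbols


# Hypothetical excerpts; the CRC values are placeholders, not real data.
newer_tree = (
    "0x00000000\t__x86_return_thunk\tvmlinux\tEXPORT_SYMBOL\t\n"
    "0x00000000\t__x86_indirect_thunk_rax\tvmlinux\tEXPORT_SYMBOL\t\n"
)
older_tree = "0x00000000\t__x86_indirect_thunk_rax\tvmlinux\tEXPORT_SYMBOL\t\n"

wanted = "__x86_return_thunk"
for name, text in (("newer tree", newer_tree), ("older tree", older_tree)):
    state = "exports" if wanted in exported_symbols(text) else "does not export"
    print(name, state, wanted)
```

On a real system the text would come from a file such as the Module.symvers path quoted earlier in the thread; a module built against a tree that exports the symbol can only load on a kernel that exports it as well.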
On 26.07.22 at 22:24, Patrick Shanahan wrote:
> iiuc, when building, *installed* kernels are concerned rather than *running*, and when booting an older kernel, the nvidia driver installed for that kernel is retained.
In my case, the old driver was gone: There was just a single nvidia.ko in /lib. I could start X with nouveau, so I could at least do some research.

I'm wondering why; I'm pretty sure that there was at least one kernel update since I upgraded LEAP from 15.3 to .4. Maybe this happened because the NVIDIA driver had no updates since 15.4 came out. But it's still strange that all old drivers were gone :-/

Maybe it's because 5.14.21-150400.24 now has an additional suffix (.11 atm) and DKMS can't handle that?

Or it's because 5.14.21-150400.22-default is left over from LEAP 15.3?

I've now tried to rebuild the driver with .22 and .24 and /lib now looks like this:

    1547614 57848 -rw-r--r-- 1 root root 59235160 Jul 26 20:48 /lib/modules/5.14.21-150400.22-default/updates/nvidia.ko
    1540700 4 lrwxrwxrwx 1 root root 56 Jul 26 18:28 /lib/modules/5.14.21-150400.24.11-default/weak-updates/updates/nvidia.ko -> /lib/modules/5.14.21-150400.22-default/updates/nvidia.ko

As you can see, the .24.11 driver is just a link to the .22 driver, but the .22 driver doesn't work with the .22 kernel because of missing symbols!

What the heck???

Regards,

--
Aaron "Optimizer" Digulla a.k.a. Philmann Dark
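The listing above can be summarised by comparing the kernel tree that holds the weak-updates link with the tree that holds the real file. This is a minimal sketch that works on the two path strings from the listing only; the helper names are illustrative. Note that the directory name says where a module is installed, not which headers it was compiled against, which is exactly the mismatch described in this thread.

```python
import re

MODULE_DIR_RE = re.compile(r"^/lib/modules/(?P<release>[^/]+)/")


def kernel_release_of(path):
    """Return the <release> part of /lib/modules/<release>/..., or None."""
    match = MODULE_DIR_RE.match(path)
    return match.group("release") if match else None


def describe_weak_update(link_path, target_path):
    """Report which kernel tree holds the link and which holds the real file."""
    link_tree = kernel_release_of(link_path)
    target_tree = kernel_release_of(target_path)
    return {
        "link_tree": link_tree,
        "target_tree": target_tree,
        "borrowed": link_tree != target_tree,
    }


# Paths copied from the listing above.
link = "/lib/modules/5.14.21-150400.24.11-default/weak-updates/updates/nvidia.ko"
target = "/lib/modules/5.14.21-150400.22-default/updates/nvidia.ko"

info = describe_weak_update(link, target)
print(info)
```

On a live system the target string could be obtained with `os.path.realpath(link)` instead of being typed in by hand.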
On 28.07.2022 01:20, Aaron Digulla wrote:
> Maybe it's because 5.14.21-150400.24 now has an additional suffix (.11 atm) and DKMS can't handle that?
> Or it's because 5.14.21-150400.22-default is left over from LEAP 15.3?
> I've now tried to rebuild the driver with .22 and .24
You tried to rebuild what exactly, and how exactly? In all this long thread there is no description that would allow anyone to troubleshoot and reproduce this. You cannot be using the SUSE RPM, because the RPM does not use DKMS. Show the exact commands, or better, the full protocol of executing these commands with complete output.
> and /lib now looks like this:
> 1547614 57848 -rw-r--r-- 1 root root 59235160 Jul 26 20:48 /lib/modules/5.14.21-150400.22-default/updates/nvidia.ko
> 1540700 4 lrwxrwxrwx 1 root root 56 Jul 26 18:28 /lib/modules/5.14.21-150400.24.11-default/weak-updates/updates/nvidia.ko -> /lib/modules/5.14.21-150400.22-default/updates/nvidia.ko
> As you can see, the .24.11 driver is just a link to the .22 driver but the .22 driver doesn't work with the .22 kernel because of missing symbols!
> What the heck???
I am not even going to start guessing with zero information.
participants (3)
- Aaron Digulla
- Andrei Borzenkov
- Patrick Shanahan