Re: nvidia-uvm and other kernel modules for the current/upcoming kernel
On Monday, 7 December 2020, 13:32:24 CET, Martin Wilck wrote:
On Mon, 2020-12-07 at 12:18 +0000, Peter Suetterlin wrote:
Martin Wilck wrote:
I'm running 455.38 for my optimus laptop. Self compiled via dkms, but the uvm module builds w/o issues.
With 5.9 or higher? Without patching the kernel?
Yes, of course. Just plain TW kernel.
Thanks. Just learned that Nvidia's 455 series seems to have a workaround for the 5.9 issue.
Yes, that seems to be the case, Martin. Stefan Dirsch expects a long-term release of that beast, which should be out in the next couple of days.
Meanwhile, we're working on a G04 fix for 5.10 (without CUDA support, of course).
If all goes well, we're in good shape for 5.10 NVIDIA-wise. The same *doesn't* hold true for drbd, VirtualBox, or any random external kernel module of your choice, unfortunately.
Cheers, Pete
On Mon, 07 Dec 2020 15:00:40 +0100, Hans-Peter Jansen wrote:
On Monday, 7 December 2020, 13:32:24 CET, Martin Wilck wrote:
On Mon, 2020-12-07 at 12:18 +0000, Peter Suetterlin wrote:
Martin Wilck wrote:
I'm running 455.38 for my optimus laptop. Self compiled via dkms, but the uvm module builds w/o issues.
With 5.9 or higher? Without patching the kernel?
Yes, of course. Just plain TW kernel.
Thanks. Just learned that Nvidia's 455 series seems to have a workaround for the 5.9 issue.
Yes, that seems to be the case, Martin. Stefan Dirsch expects a long-term release of that beast, which should be out in the next couple of days.
Meanwhile, we're working on a G04 fix for 5.10 (without CUDA support, of course).
If all goes well, we're in good shape for 5.10 NVIDIA-wise. The same *doesn't* hold true for drbd, VirtualBox, or any random external kernel module of your choice, unfortunately.
Care to open a bug report if you know of a build failure in a specific KMP for 5.10? At least the drbd package maintainer is pretty responsive, AFAIK.
I believe that we should put more packages in the OBS Kernel:HEAD:KMP project to catch such build problems before the TW update. It doesn't contain VirtualBox; or was it removed intentionally?
thanks, Takashi
On 07/12/2020 15.23, Takashi Iwai wrote:
On Mon, 07 Dec 2020 15:00:40 +0100, I believe that we should put more packages in OBS Kernel:HEAD:KMP project to catch such build problems before the TW update. It doesn't contain VirtualBox; or is this intentionally removed?
Maybe because the kmp was only split off recently into a multibuild. other kmps to link there: CoreFreq dpdk hdjmod lttng-modules mhvtl openafs oracleasm sysdig rtl8812au
On Mon, 07 Dec 2020 16:23:09 +0100, Bernhard M. Wiedemann wrote:
On 07/12/2020 15.23, Takashi Iwai wrote:
On Mon, 07 Dec 2020 15:00:40 +0100, I believe that we should put more packages in OBS Kernel:HEAD:KMP project to catch such build problems before the TW update. It doesn't contain VirtualBox; or is this intentionally removed?
Maybe because the kmp was only split off recently into a multibuild.
other kmps to link there: CoreFreq dpdk hdjmod lttng-modules mhvtl openafs oracleasm sysdig rtl8812au
Thanks. Michal, could you create those links on Kernel:HEAD:KMP (and on Kernel:stable:KMP, too)?
Takashi
Hello, On Mon, Dec 07, 2020 at 04:44:04PM +0100, Takashi Iwai wrote:
On Mon, 07 Dec 2020 16:23:09 +0100, Bernhard M. Wiedemann wrote:
On 07/12/2020 15.23, Takashi Iwai wrote:
On Mon, 07 Dec 2020 15:00:40 +0100, I believe that we should put more packages in OBS Kernel:HEAD:KMP project to catch such build problems before the TW update. It doesn't contain VirtualBox; or is this intentionally removed?
Maybe because the kmp was only split off recently into a multibuild.
other kmps to link there: CoreFreq dpdk hdjmod lttng-modules mhvtl openafs oracleasm sysdig rtl8812au
It does not make much sense for oracleasm. It is not included in openSUSE and does not build against the upstream kernel.
Thanks Michal
On Mon, Dec 07, 2020 at 04:44:04PM +0100, Takashi Iwai wrote:
On Mon, 07 Dec 2020 16:23:09 +0100, Bernhard M. Wiedemann wrote:
On 07/12/2020 15.23, Takashi Iwai wrote:
On Mon, 07 Dec 2020 15:00:40 +0100, I believe that we should put more packages in OBS Kernel:HEAD:KMP project to catch such build problems before the TW update. It doesn't contain VirtualBox; or is this intentionally removed?
Maybe because the kmp was only split off recently into a multibuild.
other kmps to link there: CoreFreq dpdk hdjmod lttng-modules mhvtl openafs oracleasm sysdig rtl8812au
v4l2loopback is also quite popular. Michal
Michal, could you create those links on Kernel:HEAD:KMP (and on Kernel:stable:KMP, too)?
On Tuesday, 8 December 2020, 11:05:30 CET, Michal Kubecek wrote:
On Mon, Dec 07, 2020 at 04:44:04PM +0100, Takashi Iwai wrote:
On Mon, 07 Dec 2020 16:23:09 +0100, Bernhard M. Wiedemann wrote:
On 07/12/2020 15.23, Takashi Iwai wrote:
On Mon, 07 Dec 2020 15:00:40 +0100, I believe that we should put more packages in OBS Kernel:HEAD:KMP project to catch such build problems before the TW update. It doesn't contain VirtualBox; or is this intentionally removed?
Maybe because the kmp was only split off recently into a multibuild.
other kmps to link there: CoreFreq dpdk hdjmod lttng-modules mhvtl openafs oracleasm sysdig rtl8812au
v4l2loopback is also quite popular.
as it is required by Jitsi for session recording, IIRC.
Pete
On 08/12/2020 11.05, Michal Kubecek wrote:
On Mon, Dec 07, 2020 at 04:44:04PM +0100, Takashi Iwai wrote:
On Mon, 07 Dec 2020 16:23:09 +0100, Bernhard M. Wiedemann wrote:
other kmps to link there: CoreFreq dpdk hdjmod lttng-modules mhvtl openafs oracleasm sysdig rtl8812au
v4l2loopback is also quite popular.
found one more: msr-safe
On Monday, 7 December 2020, 15:23:53 CET, Takashi Iwai wrote:
On Mon, 07 Dec 2020 15:00:40 +0100,
Hans-Peter Jansen wrote:
Yes, that seems to be the case, Martin. Stefan Dirsch expects a long-term release of that beast, which should be out in the next couple of days.
Meanwhile, we're working on a G04 fix for 5.10 (without CUDA support, of course).
If all goes well, we're in good shape for 5.10 NVIDIA-wise. The same *doesn't* hold true for drbd, VirtualBox, or any random external kernel module of your choice, unfortunately.
Care to open a bug report if you know the build failure of specific KMP for 5.10? At least, drbd package maintainer is pretty responsive, AFAIK.
https://bugzilla.opensuse.org/show_bug.cgi?id=1179708
If you have ever looked into this package, this is no wonder, as it massages the code with coccinelle on the fly... Technically ambitious, but a saying of Linus Torvalds comes to mind: https://lore.kernel.org/lkml/Pine.LNX.4.44.0207141708470.20233-100000@home.t...
I believe that we should put more packages in OBS Kernel:HEAD:KMP project to catch such build problems before the TW update.
Yes, that would be nice, and it is exactly what I do in my kernel projects: home:frispete:kernel{,:HEAD}
It doesn't contain VirtualBox; or is this intentionally removed?
Well, VB is special in many ways, starting with being owned by Oracle.
It suffers from: error: implicit declaration of function 'alloc_vm_area'
Just another fallout of Christoph Hellwig's crusade against out-of-tree kernel modules, for the sake of cleanups, of course.
But a fix is in the works: https://www.virtualbox.org/ticket/20055
Working on incorporating this patch into our build now.
Cheers, Pete
On Monday, 7 December 2020, 17:07:29 CET, Hans-Peter Jansen wrote:
On Monday, 7 December 2020, 15:23:53 CET, Takashi Iwai wrote:
It doesn't contain VirtualBox; or is this intentionally removed?
Well, VB is special in many ways, starting with being owned by Oracle.
It suffers from: error: implicit declaration of function 'alloc_vm_area'
Just another fallout of Christoph Hellwig's crusade against out-of-tree kernel modules, for the sake of cleanups, of course.
This was too harsh. Sorry. The issue below resulted from some serious cleanups, I'm afraid.
But a fix is in the works: https://www.virtualbox.org/ticket/20055
Working on incorporating this patch in our build now..
Stuck again. 5.10 lost address space overrides, and therefore USER_DS, through commit 47058bb54b57962b3958a936ddbc59355e4c5504 for x86. USER_DS is referenced in the shared folder code of VB:
+1398 src/VBox/Additions/linux/sharedfolders/regops.c:
    static int vbsf_lock_user_pages_failed_check_kernel(uintptr_t uPtrFrom, size_t cPages, bool fWrite, int rcFailed,
                                                        struct page **papPages, bool *pfLockPgHack)
    {
        /*
         * Check that this is valid user memory that is actually in the kernel range.
         */
    #if RTLNX_VER_MIN(5,0,0) || RTLNX_RHEL_MIN(8,1)
        if (   access_ok((void *)uPtrFrom, cPages << PAGE_SHIFT)
            && uPtrFrom >= USER_DS.seg)
    #else
        if (   access_ok(fWrite ? VERIFY_WRITE : VERIFY_READ, (void *)uPtrFrom, cPages << PAGE_SHIFT)
            && uPtrFrom >= USER_DS.seg)
    #endif
        {
            int rc = vbsf_lock_kernel_pages((uint8_t *)uPtrFrom, fWrite, cPages, papPages);
            if (rc == 0) {
                *pfLockPgHack = true;
                return 0;
            }
        }
        return rcFailed;
    }
Any idea what the equivalent of the "uPtrFrom >= USER_DS.seg" term should look like today? Any hints appreciated.
Pete
On Tue, 08 Dec 2020 10:20:42 +0100, Hans-Peter Jansen wrote:
On Monday, 7 December 2020, 17:07:29 CET, Hans-Peter Jansen wrote:
On Monday, 7 December 2020, 15:23:53 CET, Takashi Iwai wrote:
It doesn't contain VirtualBox; or is this intentionally removed?
Well, VB is special in many ways, starting with being owned by Oracle.
It suffers from: error: implicit declaration of function 'alloc_vm_area'
Just another fallout of Christoph Hellwig's crusade against out-of-tree kernel modules, for the sake of cleanups, of course.
This was too harsh. Sorry.
The issue below resulted from some serious cleanups, I'm afraid.
But a fix is in the works: https://www.virtualbox.org/ticket/20055
Working on incorporating this patch in our build now..
Stuck again. 5.10 lost address space overrides, and therefore USER_DS, through commit 47058bb54b57962b3958a936ddbc59355e4c5504 for x86. USER_DS is referenced in the shared folder code of VB:
+1398 src/VBox/Additions/linux/sharedfolders/regops.c:
    static int vbsf_lock_user_pages_failed_check_kernel(uintptr_t uPtrFrom, size_t cPages, bool fWrite, int rcFailed,
                                                        struct page **papPages, bool *pfLockPgHack)
    {
        /*
         * Check that this is valid user memory that is actually in the kernel range.
         */
    #if RTLNX_VER_MIN(5,0,0) || RTLNX_RHEL_MIN(8,1)
        if (   access_ok((void *)uPtrFrom, cPages << PAGE_SHIFT)
            && uPtrFrom >= USER_DS.seg)
    #else
        if (   access_ok(fWrite ? VERIFY_WRITE : VERIFY_READ, (void *)uPtrFrom, cPages << PAGE_SHIFT)
            && uPtrFrom >= USER_DS.seg)
    #endif
        {
            int rc = vbsf_lock_kernel_pages((uint8_t *)uPtrFrom, fWrite, cPages, papPages);
            if (rc == 0) {
                *pfLockPgHack = true;
                return 0;
            }
        }
        return rcFailed;
    }
Any idea what the equivalent of the "uPtrFrom >= USER_DS.seg" term should look like today?
I guess the simplest workaround would be to replace USER_DS.seg with TASK_SIZE_MAX. But since I haven't looked at the code closely enough, I'm not entirely sure how this function is supposed to work.
Takashi
On Tue, Dec 08, 2020 at 10:20:42AM +0100, Hans-Peter Jansen wrote:
On Monday, 7 December 2020, 17:07:29 CET, Hans-Peter Jansen wrote:
On Monday, 7 December 2020, 15:23:53 CET, Takashi Iwai wrote:
It doesn't contain VirtualBox; or is this intentionally removed?
Well, VB is special in many ways, starting with being owned by Oracle.
It suffers from: error: implicit declaration of function 'alloc_vm_area'
Just another fallout of Christoph Hellwig's crusade against out-of-tree kernel modules, for the sake of cleanups, of course.
This was too harsh. Sorry.
The issue below resulted from some serious cleanups, I'm afraid.
But a fix is in the works: https://www.virtualbox.org/ticket/20055
Working on incorporating this patch in our build now..
Stuck again. 5.10 lost address space overrides, and therefore USER_DS, through commit 47058bb54b57962b3958a936ddbc59355e4c5504 for x86. USER_DS is referenced in the shared folder code of VB:
+1398 src/VBox/Additions/linux/sharedfolders/regops.c:
    static int vbsf_lock_user_pages_failed_check_kernel(uintptr_t uPtrFrom, size_t cPages, bool fWrite, int rcFailed,
                                                        struct page **papPages, bool *pfLockPgHack)
    {
        /*
         * Check that this is valid user memory that is actually in the kernel range.
         */
    #if RTLNX_VER_MIN(5,0,0) || RTLNX_RHEL_MIN(8,1)
        if (   access_ok((void *)uPtrFrom, cPages << PAGE_SHIFT)
            && uPtrFrom >= USER_DS.seg)
    #else
        if (   access_ok(fWrite ? VERIFY_WRITE : VERIFY_READ, (void *)uPtrFrom, cPages << PAGE_SHIFT)
            && uPtrFrom >= USER_DS.seg)
    #endif
        {
            int rc = vbsf_lock_kernel_pages((uint8_t *)uPtrFrom, fWrite, cPages, papPages);
            if (rc == 0) {
                *pfLockPgHack = true;
                return 0;
            }
        }
        return rcFailed;
    }
Any idea what the equivalent of the "uPtrFrom >= USER_DS.seg" term should look like today?
Based on what USER_DS used to be defined as, TASK_SIZE_MAX should be a direct replacement. But maybe you would rather want something like copy_from_kernel_nofault_allowed(uPtrFrom); it's hard to say without knowing what the code is used for.
Michal
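For illustration, the TASK_SIZE_MAX suggestion could be wired in behind a version gate roughly like this. It is only a sketch under stated assumptions: SKETCH_USER_LIMIT and sketch_addr_above_user_range() are hypothetical names that exist neither in VirtualBox nor in the kernel, the fallback definitions of KERNEL_VERSION, LINUX_VERSION_CODE and TASK_SIZE_MAX are there only so the snippet compiles outside a kernel tree (the TASK_SIZE_MAX value is an illustrative x86_64 figure for 4-level paging), and the gate covers x86 only, as in the commit mentioned above. Whether this comparison is the right check for the shared folder code is exactly the open question in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins so the sketch builds in user space (assumptions). In a real
 * module these come from <linux/version.h> and the architecture headers. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(5, 10, 0)
#endif
#ifndef TASK_SIZE_MAX
/* Illustrative value only: x86_64, 4-level paging, 4 KiB pages. */
#define TASK_SIZE_MAX ((UINT64_C(1) << 47) - UINT64_C(4096))
#endif

/* Hypothetical macro: the lowest address that is no longer user space. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 10, 0)
# define SKETCH_USER_LIMIT ((uint64_t)TASK_SIZE_MAX)
#else
# define SKETCH_USER_LIMIT ((uint64_t)USER_DS.seg)  /* pre-5.10 x86 only */
#endif

/* Hypothetical helper replacing the "uPtrFrom >= USER_DS.seg" term. */
static bool sketch_addr_above_user_range(uint64_t addr)
{
    return addr >= SKETCH_USER_LIMIT;
}
```

With the fallback values above, a low address such as 0x1000 is reported as user space, while an address in the upper half of the 64-bit range is reported as being above the user limit.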
On Tuesday, 8 December 2020, 10:58:13 CET, Michal Kubecek wrote:
On Tue, Dec 08, 2020 at 10:20:42AM +0100, Hans-Peter Jansen wrote:
On Monday, 7 December 2020, 17:07:29 CET, Hans-Peter Jansen wrote:
Any idea what the equivalent of the "uPtrFrom >= USER_DS.seg" term should look like today?
Takashi && Michal, thanks for your help.
Based on what USER_DS used to be defined as, TASK_SIZE_MAX should be a direct replacement. But maybe you rather want something like copy_from_kernel_nofault_allowed(uPtrFrom), it's hard to say without knowledge of what the code is used for.
This function is accompanied by a comment:
    /**
     * Catches kernel_read() and kernel_write() calls and works around them.
     *
     * The file_operations::read and file_operations::write callbacks supposedly
     * hands us the user buffers to read into and write out of. To allow the kernel
     * to read and write without allocating buffers in userland, they kernel_read()
     * and kernel_write() increases the user space address limit before calling us
     * so that copyin/copyout won't reject it. Our problem is that get_user_pages()
     * works on the userspace address space structures and will not be fooled by an
     * increased addr_limit.
     *
     * This code tries to detect this situation and fake get_user_lock() for the
     * kernel buffer.
     */
Good news, the code compiles at least.
Sorry to bother you even more, but I see a related issue in lime_kmp, the Linux Memory Extractor https://github.com/504ensicsLabs/LiME, which is able to produce memory dumps in a more forensically sound way than other such tools.
Unfortunately, it is playing games with set_fs(KERNEL_DS):
https://build.opensuse.org/package/live_build_log/home:frispete:kernel:HEAD/...
Somebody provided fixes for 5.10 that don't make sense to my uneducated eyes: https://github.com/504ensicsLabs/LiME/pull/83/files
If I understand the code correctly, it tries to trick the kernel into believing that all calls stem from the kernel. This raises the question of what advantage that procedure buys us, and at what cost (risk).
Most probably, this set_fs dance can be eliminated, or replaced with something our kernel adepts agree with.
Ideas, opinions?
Pete
On Tue, Dec 08, 2020 at 12:12:44PM +0100, Hans-Peter Jansen wrote:
Sorry to bother you even more, but I see a related issue in lime_kmp, the Linux Memory Extractor https://github.com/504ensicsLabs/LiME, which is able to produce memory dumps in a more forensically sound way than other such tools.
Unfortunately, it is playing games with set_fs(KERNEL_DS):
https://build.opensuse.org/package/live_build_log/home:frispete:kernel:HEAD/...
Somebody provided fixes for 5.10 that don't make sense to my uneducated eyes: https://github.com/504ensicsLabs/LiME/pull/83/files
If I understand the code correctly, it tries to trick the kernel into believing that all calls stem from the kernel. This raises the question of what advantage that procedure buys us, and at what cost (risk).
Most probably, this set_fs dance can be eliminated, or replaced with something our kernel adepts agree with.
Ideas, opinions?
The set_fs(KERNEL_DS) trick is mostly used when kernel code wants to call a function which is intended to work on userspace buffer with a buffer in kernel memory; set_fs(KERNEL_DS) tricks the function to recognize a kernel space address as userspace one so that the checks pass. See https://lwn.net/Articles/832121/ for more details.
A typical example are ->write() and ->read() file operations which should be replaced by kernel_write() and kernel_read() helpers (these shouldn't need wrapping with set_fs(KERNEL_DS) even on older kernels). There are two catches, though: first, their argument changed in 4.14 so that you need to distinguish between <4.14 and >=4.14. Second, starting with v5.10-rc1, kernel_write() and kernel_read() can only be used for file types converted to ->write_iter() and ->read_iter() ops. Some file types do not provide these and, worse, attempts to address that may face pushback from certain kernel developer as it happened recently for eventfd.
Looking at the lime code, I suspect that most of the places did not actually need the set_fs(KERNEL_DS) trick even on kernels before 5.10 as filp_open(), filp_close() or kernel_*() should work without it. But I'm not so familiar with this code so I would need to do more research to be sure.
Michal
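To make the first catch concrete, here is a user-space model of a compatibility wrapper around the 4.14 argument change. It is a sketch, not LiME or kernel code: every name in it (sketch_file, stub_kernel_write_old, stub_kernel_write_new, compat_kernel_write, the SKETCH_* macros) is hypothetical, and the stubs only mimic the calling conventions, namely that kernel_write() took the file position by value before 4.14 and takes a pointer to it from 4.14 on. In a real module the gate would use LINUX_VERSION_CODE and KERNEL_VERSION from <linux/version.h> and call kernel_write() itself. The second catch (->write_iter() support from v5.10-rc1) is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for KERNEL_VERSION / LINUX_VERSION_CODE (assumptions). */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#ifndef SKETCH_VERSION_CODE
#define SKETCH_VERSION_CODE SKETCH_KERNEL_VERSION(5, 10, 0)
#endif

typedef long long sketch_loff_t;  /* models loff_t */
typedef long sketch_ssize_t;      /* models ssize_t */

/* Hypothetical in-memory "file" standing in for struct file. */
struct sketch_file {
    char data[64];
    size_t size;
};

/* Models the pre-4.14 convention: position by value, never updated. */
static sketch_ssize_t stub_kernel_write_old(struct sketch_file *f, const char *buf,
                                            size_t count, sketch_loff_t pos)
{
    assert(f != NULL && buf != NULL);
    if (pos < 0 || (size_t)pos > sizeof(f->data))
        return -1;
    size_t room = sizeof(f->data) - (size_t)pos;
    size_t n = count < room ? count : room;   /* short write if full */
    memcpy(f->data + pos, buf, n);
    if ((size_t)pos + n > f->size)
        f->size = (size_t)pos + n;
    return (sketch_ssize_t)n;
}

/* Models the convention since 4.14: position by pointer, advanced on
 * success. Built on the old stub purely to keep the sketch short. */
static sketch_ssize_t stub_kernel_write_new(struct sketch_file *f, const void *buf,
                                            size_t count, sketch_loff_t *pos)
{
    assert(pos != NULL);
    sketch_ssize_t n = stub_kernel_write_old(f, buf, count, *pos);
    if (n > 0)
        *pos += n;
    return n;
}

/* Hypothetical wrapper: one signature for callers on both sides of 4.14. */
static sketch_ssize_t compat_kernel_write(struct sketch_file *f, const void *buf,
                                          size_t count, sketch_loff_t *pos)
{
#if SKETCH_VERSION_CODE >= SKETCH_KERNEL_VERSION(4, 14, 0)
    return stub_kernel_write_new(f, buf, count, pos);
#else
    sketch_ssize_t n = stub_kernel_write_old(f, buf, count, *pos);
    if (n > 0)
        *pos += n;  /* the old convention leaves position tracking to us */
    return n;
#endif
}
```

The design choice is to give callers the newer pointer-based signature everywhere, so that only the wrapper needs to know which kernel it was built against.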
On Tuesday, 8 December 2020, 13:14:12 CET, Michal Kubecek wrote:
On Tue, Dec 08, 2020 at 12:12:44PM +0100, Hans-Peter Jansen wrote:
Sorry to bother you even more, but I see a related issue in lime_kmp, the Linux Memory Extractor https://github.com/504ensicsLabs/LiME, which is able to produce memory dumps in a more forensically sound way than other such tools.
Unfortunately, it is playing games with set_fs(KERNEL_DS):
https://build.opensuse.org/package/live_build_log/home:frispete:kernel:HEAD/lime-kmp/openSUSE_Tumbleweed/x86_64
Somebody provided fixes for 5.10 that don't make sense to my uneducated eyes: https://github.com/504ensicsLabs/LiME/pull/83/files
If I understand the code correctly, it tries to trick the kernel into believing that all calls stem from the kernel. This raises the question of what advantage that procedure buys us, and at what cost (risk).
Most probably, this set_fs dance can be eliminated, or replaced with something our kernel adepts agree with.
Ideas, opinions?
The set_fs(KERNEL_DS) trick is mostly used when kernel code wants to call a function which is intended to work on userspace buffer with a buffer in kernel memory; set_fs(KERNEL_DS) tricks the function to recognize a kernel space address as userspace one so that the checks pass. See https://lwn.net/Articles/832121/ for more details.
A typical example are ->write() and ->read() file operations which should be replaced by kernel_write() and kernel_read() helpers (these shouldn't need wrapping with set_fs(KERNEL_DS) even on older kernels). There are two catches, though: first, their argument changed in 4.14 so that you need to distinguish between <4.14 and >=4.14. Second, starting with v5.10-rc1, kernel_write() and kernel_read() can only be used for file types converted to ->write_iter() and ->read_iter() ops. Some file types do not provide these and, worse, attempts to address that may face pushback from certain kernel developer as it happened recently for eventfd.
Looking at the lime code, I suspect that most of the places did not actually need the set_fs(KERNEL_DS) trick even on kernels before 5.10 as filp_open(), filp_close() or kernel_*() should work without it. But I'm not so familiar with this code so I would need to do more research to be sure.
Thank you very much, Michal. It is very helpful to get an assessment from someone who has a much deeper insight into the matter. Will supply a patch that removes the set_fs dance conditionally and check the outcome. Cheers, Pete
[Lame self reply and cross post] On Monday, 7 December 2020, 15:00:40 CET, Hans-Peter Jansen wrote:
If all goes well, we're in good shape for 5.10 NVIDIA-wise. The same *doesn't* hold true for drbd, VirtualBox, or any random external kernel module of your choice, unfortunately.
Good news! Believe it or not, but we're slowly getting into a good shape for 5.10 (expected this weekend). https://build.opensuse.org/project/monitor/home:frispete:kernel:HEAD Yay, Pete
participants (5)
- Bernhard M. Wiedemann
- Hans-Peter Jansen
- Michal Kubecek
- Michal Suchánek
- Takashi Iwai