Am Dienstag, 8. Dezember 2020, 13:14:12 CET schrieb Michal Kubecek:
On Tue, Dec 08, 2020 at 12:12:44PM +0100, Hans-Peter Jansen wrote:
Sorry to bother you even more, but I see a related issue in lime_kmp, the Linux Memory Extractor https://github.com/504ensicsLabs/LiME, that is able to produce memory dumps in a more forensically sound way then other such tools.
Unfortunately, it is playing games with set_fs(KERNEL_DS):
https://build.opensuse.org/package/live_build_log/home:frispete:kernel:HEA D/lime-kmp/openSUSE_Tumbleweed/x86_64
Somebody provided fixes for 5.10, that doesn't make sense in my uneducated eyes: https://github.com/504ensicsLabs/LiME/pull/83/files
If I understand the code correctly, it tries to trick the kernel into believing, that all calls stems from the kernel. This reveals the questions, what advantage that procedure buys us, and at what costs (risks)?
Most probably, this set_fs dance can be eliminated, or replaced with something, our kernel adepts correspond to.
Ideas, opinions?
The set_fs(KERNEL_DS) trick is mostly used when kernel code wants to call a function which is intended to work on userspace buffer with a buffer in kernel memory; set_fs(KERNEL_DS) tricks the function to recognize a kernel space address as userspace one so that the checks pass. See https://lwn.net/Articles/832121/ for more details.
A typical example are ->write() and ->read() file operations which should be replaced by kernel_write() and kernel_read() helpers (these shouldn't need wrapping with set_fs(KERNEL_DS) even on older kernels). There are two catches, though: first, their argument changed in 4.14 so that you need to distinguish between <4.14 and >=4.14. Second, starting with v5.10-rc1, kernel_write() and kernel_read() can only be used for file types converted to ->write_iter() and ->read_iter() ops. Some file types do not provide these and, worse, attempts to address that may face pushback from certain kernel developer as it happened recently for eventfd.
Looking at the lime code, I suspect that most of the places did not actually need the set_fs(KERNEL_DS) trick even on kernels before 5.10 as filp_open(), filp_close() or kernel_*() should work without it. But I'm not so familiar with this code so I would need to do more research to be sure.
Thank you very much, Michal. It is very helpful to get an assessment from someone who has a much deeper insight into the matter. Will supply a patch that removes the set_fs dance conditionally and check the outcome. Cheers, Pete