[Bug 1200487] New: LTO: the effect on the kernel
https://bugzilla.suse.com/show_bug.cgi?id=1200487
Bug ID: 1200487
Summary: LTO: the effect on the kernel
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-performance-bugs@suse.de
Reporter: jslaby@suse.com
QA Contact: qa-bugs@suse.de
CC: mliska@suse.cz
Found By: ---
Blocker: ---
May I ask the performance team to run some quick runtime performance measurements of the effect of link-time optimization (LTO) on the kernel?
There is a project with LTO patches and LTO enabled on top of 5.18.3:
https://build.suse.de/project/monitor/home:jirislaby:kernel-lto
And the same source base, but with LTO disabled in the config:
https://build.suse.de/project/monitor/home:jirislaby:kernel-lto-disabled
The diff between those is minimal:
https://build.suse.de/package/rdiff/home:jirislaby:kernel-lto-disabled/kernel-source?opackage=kernel-source&oproject=home%3Ajirislaby%3Akernel-lto&rev=3
Thanks.
--
You are receiving this mail because:
You are on the CC list for the bug.
Mian Yousaf Kaukab
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c1
--- Comment #1 from Jiri Slaby
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c2
--- Comment #2 from Mel Gorman
> Just a note: it's still pretty fragile. So if it crashes, just please let me know your setup and the crash. I'll try to fix it up.
Are there any toolchain support changes or is SLE 15 SP4 userspace suitable? Are there kernel.org branches should LTO enabled vs disabled? Factory as userspace would be problematic as it keeps changing and has limited to no remote deployment support currently available.
I also need to have an idea of what patches are applied just to make LTO an option because the real comparison should not be between LTO enabled vs disabled but between vanilla mainline vs LTO-disabled vs LTO-enabled.
Given that it is expected to be fragile and likely break, it'll be only queued against one machine as a starting point.
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c3
--- Comment #3 from Jiri Slaby
> (In reply to Jiri Slaby from comment #1)
> > Just a note: it's still pretty fragile. So if it crashes, just please let me know your setup and the crash. I'll try to fix it up.
> Are there any toolchain support changes or is SLE 15 SP4 userspace suitable?
I am not sure if you mean for building or running.
Building: LTO has been supported in gcc for a long time, so gcc-7 (and the rest of the 15sp4 toolchain) should be fine. But no one has tried, so I would expect glitches.
Running: It won't run there as it is currently built (i.e. against Tumbleweed). But it's quite easy to fix the IBS projects to build against backports, so that it runs on 15sp4 just fine.
> Are there kernel.org branches should LTO enabled vs disabled?
Sorry, I don't understand what you are asking here. Do you want me to push an upstream kernel w/ and w/o LTO patches to git.kernel.org?
> Factory as userspace would be problematic as it keeps changing and has limited to no remote deployment support currently available. I also need to have an idea of what patches are applied just to make LTO an option because the real comparison should not be between LTO enabled vs disabled but between vanilla mainline vs LTO-disabled vs LTO-enabled. Given that it is expected to be fragile and likely break, it'll be only queued against one machine as a starting point.
OK, the project setup allows for an easy diff -- it's linked against Kernel:stable. So the patches (and a config diff) are these:
https://build.suse.de/package/rdiff/home:jirislaby:kernel-lto/kernel-source?opackage=kernel-source&oproject=Devel%3AKernel%3Astable
Or you can download patches.addon.tar.bz2 from home:jirislaby:kernel-lto directly:
https://build.suse.de/source/home:jirislaby:kernel-lto/kernel-source/patches...
And the config changes too:
https://build.suse.de/source/home:jirislaby:kernel-lto/kernel-source/config....
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c4
--- Comment #4 from Jiri Slaby
> Building: LTO has been supported in gcc for a long time, so gcc-7 (and the rest of the 15sp4 toolchain) should be fine. But no one has tried, so I would expect glitches.
Amending this: I am building the LTO kernels on kunlun (15sp3) just fine. The only difference is that I have gcc-12 from devel:gcc/SLE15.
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c5
--- Comment #5 from Mel Gorman
> (In reply to Mel Gorman from comment #2)
> > (In reply to Jiri Slaby from comment #1)
> > > Just a note: it's still pretty fragile. So if it crashes, just please let me know your setup and the crash. I'll try to fix it up.
> > Are there any toolchain support changes or is SLE 15 SP4 userspace suitable?
> I am not sure if you mean for building or running.
Both. Can it be built and booted from a stable userspace (e.g. SLE 15 SP4)?
> Building: LTO has been supported in gcc for a long time, so gcc-7 (and the rest of the 15sp4 toolchain) should be fine. But no one has tried, so I would expect glitches.
> Running: It won't run there as it is currently built (i.e. against Tumbleweed). But it's quite easy to fix the IBS projects to build against backports, so that it runs on 15sp4 just fine.
> > Are there kernel.org branches should LTO enabled vs disabled?
> Sorry, I don't understand what you are asking here. Do you want me to push an upstream kernel w/ and w/o LTO patches to git.kernel.org?
I don't blame you, my question made no sense. I meant -- are there kernel-source.git branches with the necessary modifications applied? That way, I could use the minimal SLE patches against mainline as the baseline, then LTO disabled, then LTO enabled.
> > Factory as userspace would be problematic as it keeps changing and has limited to no remote deployment support currently available. I also need to have an idea of what patches are applied just to make LTO an option because the real comparison should not be between LTO enabled vs disabled but between vanilla mainline vs LTO-disabled vs LTO-enabled. Given that it is expected to be fragile and likely break, it'll be only queued against one machine as a starting point.
> OK, the project setup allows for an easy diff -- it's linked against Kernel:stable. So the patches (and a config diff) are these:
> https://build.suse.de/package/rdiff/home:jirislaby:kernel-lto/kernel-source?opackage=kernel-source&oproject=Devel%3AKernel%3Astable
> Or you can download patches.addon.tar.bz2 from home:jirislaby:kernel-lto directly:
> https://build.suse.de/source/home:jirislaby:kernel-lto/kernel-source/patches.addon.tar.bz2
I can do this if a branch is not available. The only concern would be that the baseline is against a different commit and I want to reduce as many variables as possible. I can add the gcc-12 repo no problem, but I want to ensure that each of the three kernels (baseline, LTO disabled, LTO enabled) is definitely built with the same toolchain. I can use the rpms but it is not preferred.
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c6
--- Comment #6 from Jiri Slaby
> I don't blame you, my question made no sense. I meant -- are there kernel-source.git branches with the necessary modifications applied? That way, I could use the minimal SLE patches against mainline as the baseline, then LTO disabled, then LTO enabled.
OK, so I pushed:
origin/users/jslaby/stable/lto
origin/users/jslaby/stable/lto-base
origin/users/jslaby/stable/lto-disabled
lto-base is the base for both; lto-disabled is the same as lto except for the config. Then I compiled the lto one on 15sp3 using gcc-12 and booted it successfully on 15sp4.
> That way, I could use the minimal SLE patches against mainline as the baseline, then LTO disabled, then LTO enabled.
The patches are not that easy to backport, so using them on anything older (like the 15sp4 kernel) creates a lot of conflicts. I.e. all the above is based on the stable branch (5.18).
Martin Liška
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c7
--- Comment #7 from Mel Gorman
> (In reply to Mel Gorman from comment #5)
> > I don't blame you, my question made no sense. I meant -- are there kernel-source.git branches with the necessary modifications applied? That way, I could use the minimal SLE patches against mainline as the baseline, then LTO disabled, then LTO enabled.
> OK, so I pushed:
> origin/users/jslaby/stable/lto
> origin/users/jslaby/stable/lto-base
> origin/users/jslaby/stable/lto-disabled
> lto-base is the base for both; lto-disabled is the same as lto except for the config.
Thanks.
> Then I compiled the lto one on 15sp3 using gcc-12 and booted it successfully on 15sp4.
> > That way, I could use the minimal SLE patches against mainline as the baseline, then LTO disabled, then LTO enabled.
> The patches are not that easy to backport, so using them on anything older (like the 15sp4 kernel) creates a lot of conflicts. I.e. all the above is based on the stable branch (5.18).
I was not expecting a backport; the confusion may be that I wanted SLE userspace to limit the risk of change over time.
The first of the tests has started on one machine (2-socket Broadwell), and gcc-12 was used to build the kernel.
Linux version 5.18.4-lto-baseline (root@hardy3) (gcc-12 (SUSE Linux) 12.1.1 20220517 [revision 325d82b08696da17fb26bd2e1b6ba607649357fb], GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.37.20211103-150100.7.29) #1 SMP PREEMPT_DYNAMIC Fri Jun 17 17:30:35 CEST 2022
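The requirement that all three kernels be built with the same toolchain can be checked from their version banners. The following is only a minimal sketch in Python: the function names `toolchain_of` and `same_toolchain` are hypothetical, the regular expressions assume the banner layout quoted above (other distributions may format it differently), and on a running system the banner would typically be read from /proc/version.

```python
import re

# Banner copied from the comment above; on a live system read /proc/version instead.
BANNER = (
    "Linux version 5.18.4-lto-baseline (root@hardy3) "
    "(gcc-12 (SUSE Linux) 12.1.1 20220517 "
    "[revision 325d82b08696da17fb26bd2e1b6ba607649357fb], "
    "GNU ld (GNU Binutils; SUSE Linux Enterprise 15) "
    "2.37.20211103-150100.7.29) #1 SMP PREEMPT_DYNAMIC "
    "Fri Jun 17 17:30:35 CEST 2022"
)


def toolchain_of(banner):
    """Return (kernel_release, gcc_version, ld_version) parsed from a banner.

    Hypothetical helper: the patterns assume the SUSE-style layout shown above.
    """
    release = re.search(r"Linux version (\S+)", banner)
    gcc = re.search(r"gcc\S* \([^)]*\) (\d+\.\d+\.\d+)", banner)
    ld = re.search(r"GNU ld \([^)]*\) (\S+?)\)", banner)
    if not (release and gcc and ld):
        raise ValueError("unrecognised banner format")
    return release.group(1), gcc.group(1), ld.group(1)


def same_toolchain(banners):
    """True if every banner reports the same gcc and ld versions."""
    return len({toolchain_of(b)[1:] for b in banners}) == 1


if __name__ == "__main__":
    print(toolchain_of(BANNER))
```

Comparing only the gcc and ld versions is a simplification; it says nothing about compiler flags or other build tools.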
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c8
--- Comment #8 from Mel Gorman
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c9
--- Comment #9 from Martin Liška
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c10
--- Comment #10 from Mel Gorman
https://bugzilla.suse.com/show_bug.cgi?id=1200487#c11
--- Comment #11 from Jiri Slaby
> Results show some small improvements for some workloads and at least one functional failure. The mmtests configuration that failed is functional-ltp-cve-xfs for the pty03. The other tests were not complete as I write this as the machine had frozen but will be started shortly. The dmesg for the failure was
> [ 2465.240179] INFO: task pty03:10270 blocked for more than 491 seconds.
> [ 2465.240188] Tainted: G E 5.18.4-lto-enabled #1
> [ 2465.240192] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 2465.240195] task:pty03 state:D stack: 0 pid:10270 ppid: 1 flags:0x00004004
> [ 2465.240204] Call Trace:
> [ 2465.240207] <TASK>
> [ 2465.240212] __schedule+0x344/0x12e0
> [ 2465.240231] ? number+0x349/0x400
> [ 2465.240242] schedule+0x5b/0xd0
> [ 2465.240249] schedule_timeout+0x9f/0xd0
> [ 2465.240260] wait_for_completion+0x7d/0x130
> [ 2465.240267] devtmpfs_delete_node+0xc7/0x100
> [ 2465.240276] ? kernfs_put+0x107/0x1c0
> [ 2465.240290] device_del+0x408/0x4f0
> [ 2465.240304] ? class_find_device+0xde/0x1a0
> [ 2465.240313] device_unregister+0xe/0x60
> [ 2465.240320] tty_unregister_device+0x5c/0xc0
> [ 2465.240335] gsmld_close+0x41/0xc0 [n_gsm 3758e2ed62c48c2b7e0f63935531528aa58bfb68]
I'm afraid this is a consequence of the 5.18 changes in n_gsm done by D. Starke. We've exchanged a few messages with him, but he is unable to reproduce it. I'm not sure anyone cares about n_gsm except him. Maybe Siemens is trying to resurrect it... So the easiest way to work around this is simply to disable CONFIG_N_GSM -- would it be possible to restart the functional tests without n_gsm?
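Before restarting the tests, one could confirm that the option really is off in the kernel config. The following is only a minimal sketch in Python: `option_state` and the `SAMPLE` fragment are hypothetical and not taken from the actual config in this bug; the sketch relies only on the standard .config conventions of `CONFIG_FOO=value` lines and `# CONFIG_FOO is not set` comments.

```python
def option_state(config_text, option):
    """Return the value of a kernel config option.

    'n' if the option is explicitly not set, its value ('y', 'm', ...) if set,
    and None if the option does not appear at all. Hypothetical helper.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        # The trailing '=' keeps CONFIG_N_GSM from matching longer option names.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


# Illustrative fragment only, not the real config from the LTO project.
SAMPLE = """\
CONFIG_TTY=y
# CONFIG_N_GSM is not set
CONFIG_N_HDLC=m
"""

if __name__ == "__main__":
    print(option_state(SAMPLE, "CONFIG_N_GSM"))
```

In practice the text would come from the built kernel's config file rather than an inline string.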
> [ 2465.240348] tty_ldisc_hangup+0x131/0x240
> [ 2465.240358] __tty_hangup+0x1f3/0x370
> [ 2465.240366] tty_ioctl+0x74e/0x9d0
> ...
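When several test configurations are run, hung-task reports like the one quoted above can be picked out of dmesg output automatically. The following is only a minimal sketch in Python: `hung_tasks` is a hypothetical helper, and the pattern assumes the "INFO: task NAME:PID blocked for more than N seconds" wording seen in this report.

```python
import re

# Pattern modelled on the hung-task line quoted in this report.
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(dmesg_text):
    """Return a list of (task_name, pid, seconds_blocked) found in dmesg text."""
    return [
        (m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK.finditer(dmesg_text)
    ]


# Two lines copied from the trace in this report.
SAMPLE = (
    "[ 2465.240179] INFO: task pty03:10270 blocked for more than 491 seconds.\n"
    "[ 2465.240204] Call Trace:\n"
)

if __name__ == "__main__":
    print(hung_tasks(SAMPLE))
```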
> Otherwise it was marginal differences.
Thanks for the measurements. I'm a bit surprised there are any differences. I'll look into the results more deeply.