On Tue, Nov 26, 2024 at 10:55 PM Aaron Puchert <aaronpuchert@alice-dsl.net> wrote:
On 26.11.24 at 00:22, Andrii Nakryiko wrote:
Yes, stripped-out ELF symbols are pretty annoying. But that's a separate issue. It's also possible to avoid needing them by capturing the build ID and doing symbolization offline, using the build ID to look up the DWARF information for the profiled executable/library.
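To make that concrete, here is a minimal sketch of the build-ID side (my own illustration, assuming a glibc/ELF64 Linux system, not code from any particular profiler). It walks the objects loaded into the current process and prints their GNU build IDs, which are the keys an offline symbolizer would use to find the matching DWARF:

    /* Sketch: print the GNU build ID of every object loaded into this
     * process.  The build ID is what an offline symbolizer can use to
     * fetch matching debug info later, so no DWARF needs to be present
     * on the profiled machine. */
    #define _GNU_SOURCE
    #include <elf.h>
    #include <link.h>
    #include <stdio.h>
    #include <string.h>

    static int print_build_id(struct dl_phdr_info *info, size_t size, void *data)
    {
        (void)size; (void)data;
        for (int i = 0; i < info->dlpi_phnum; i++) {
            const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
            if (ph->p_type != PT_NOTE)
                continue;
            /* Walk the notes in this PT_NOTE segment. */
            const char *p = (const char *)(info->dlpi_addr + ph->p_vaddr);
            const char *end = p + ph->p_memsz;
            while (p + sizeof(ElfW(Nhdr)) <= end) {
                const ElfW(Nhdr) *nh = (const ElfW(Nhdr) *)p;
                const char *name = p + sizeof(*nh);
                const unsigned char *desc =
                    (const unsigned char *)(name + ((nh->n_namesz + 3) & ~3u));
                if (nh->n_type == NT_GNU_BUILD_ID &&
                    nh->n_namesz == 4 && memcmp(name, "GNU", 4) == 0) {
                    printf("%s: ", info->dlpi_name[0] ? info->dlpi_name : "[exe]");
                    for (unsigned j = 0; j < nh->n_descsz; j++)
                        printf("%02x", desc[j]);
                    putchar('\n');
                }
                p = (const char *)desc + ((nh->n_descsz + 3) & ~3u);
            }
        }
        return 0;
    }

    int main(void)
    {
        dl_iterate_phdr(print_build_id, NULL);
        return 0;
    }

perf already stores build IDs alongside its samples, and debuginfod can serve debug info keyed by build ID, so the heavy DWARF never has to live on the machine being profiled.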
The important part, again, is *capturing the stack trace*. You seem to be more worried about symbolization, which is an entirely different problem.
Isn't this entire proposal about the usability of profiling? I'm not just worried about symbolization, but about the bigger picture. Especially if we're profiling the entire system, we're easily talking about gigabytes of debug info to download.
With my compiler hat on, frame pointers often don't make sense. If there are no VLAs or allocas, the compiler knows how large a stack frame is and can simply add (for stacks that grow down) that fixed size back to the stack pointer in the epilogue. The frame pointer on the stack contains no information that the compiler doesn't already have.
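Roughly, for x86-64 (schematic; actual code generation of course depends on the compiler and flags):

    /* Any function with a fixed-size frame and no VLAs/allocas. */
    long work(const long *v, long n);

    /* Without a frame pointer (-fomit-frame-pointer), schematically:
     *     sub  $N, %rsp      ; reserve the fixed-size frame
     *     ...                ; locals addressed as offsets from %rsp
     *     add  $N, %rsp      ; epilogue: N is a compile-time constant
     *     ret
     *
     * With a frame pointer (-fno-omit-frame-pointer), schematically:
     *     push %rbp          ; save the caller's frame pointer
     *     mov  %rsp, %rbp    ; establish our own
     *     sub  $N, %rsp
     *     ...
     *     mov  %rbp, %rsp    ; or simply: leave
     *     pop  %rbp
     *     ret
     *
     * The saved-%rbp chain duplicates what the compiler already knows;
     * its consumers are external tools (unwinders, profilers, debuggers)
     * walking the stack. */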
Sure, from the compiler's point of view. But the concern here is profilers and other stack-trace-based tools used for observability, profiling, and debugging.
My concern is not just profilers, since we're talking about a distribution default. The point that others have made and that I've just repeated here is that we're spending a register on a relatively register-sparse architecture for something that's not relevant to the program itself, but is instrumentation for outside observers. And shipping with instrumentation by default just doesn't feel right. That is not a "real" argument, I know.
I think this is a matter of perspective. My point of view is that not having out-of-the-box instrumentation makes it hard for observability to develop further in spaces outside of the Kubernetes/Go case (where this is already the default).
Profiling the workload is much more common than you might think, and as it becomes more accessible (because frame pointers are there and tools just work out of the box), it will only be used more frequently. Even if users don't do their own performance investigation, the original application author can send them simple perf-based (or similar) commands to run and report back data for further optimization.
That's a bit hypothetical. In my experience, profiling is not even terribly common among C++ developers, which is to say that most teams have dedicated performance experts and most others rarely touch a profiler. Most developers who do profile are profiling their own applications, which they have just built themselves. That seems natural: after all, you'll want to improve performance, and for that you'll need to touch the source and recompile.
Application authors or package maintainers asking their users to profile sounds reasonable, but I haven't seen it yet. Realistically, if users complain about a performance problem, it's probably big enough that the overhead of DWARF unwinding and a reduced sample frequency are not an obstacle. And even truncated stacks could be enough for the developer to figure out the root cause.
This is something that both the GNOME and KDE communities are actively exploring. I know that the GNOME folks who maintain Sysprof have been working on making the tool better for this *exact* use case. I'm sure folks who work on KDE are working on something similar.
For what it's worth, just today I profiled an awfully slow clang-tidy job, and I could just as well have attached a debugger to see what was wrong.
There is zero doubt that whatever we (Meta) lost by enabling frame pointers has been recovered many times over through high-quality and widely available profiling data.
Meta has a large paid workforce, though, while this is at least in part a community project. If SUSE wants to add FPs because they need them to improve the distro, I don't think anybody is going to stand in their way. But at least right now I don't think they have dedicated people for performance, and I'm not aware of anyone regularly doing this kind of work in the community.
If they exist, please step up. All we have is a proposal that was copied from Fedora and no Tumbleweed user who has said "I want this and this is how I would use it."
Most of the people who want this stuff are generally unable to figure out how to advocate for it in distributions. That's why *I* did it. Also, as a Tumbleweed user, I would hope that *I* count. I would also like to see us doing this more in KDE (and KDE upstream does qualification for Linux using openSUSE), and whole-system tracing requires the whole distribution to be built for it.
Meta is also a data center operator, while SUSE, to my knowledge, mostly ships software. So they don't have a big server farm where they could or would want to do whole-system profiling. Individual maintainers might do performance work on their packages, but for that they don't need FPs as a distro default.
On the contrary, performance bottlenecks can show up in random places. Whole-system tracing is incredibly useful with modern application middleware that can go fairly deep.
SUSE customers might want this for their deployments, but that is again speculation. If it is actually the case, I think someone from SUSE should tell us. My employer (a large SUSE customer, I believe) would probably not be that interested, because we mainly run our own binaries and use just the OS base in deployment.
As long as some of the OS is linked into your application stack, it can be useful.

--
真実はいつも一つ! / Always, there's only one truth!