On 16.11.24 at 19:24, Neal Gompa wrote:
I would like to change openSUSE's default build flags in rpm and rpm-config-SUSE to incorporate frame pointers (and equivalents on other architectures) to support real-time profiling and observability on SUSE distributions.
Profiling doesn't necessarily need call stacks. Often you can learn a great deal from just seeing the hot functions. I'd say that I work with plain profiles ("perf record" without -g) most of the time and add call stacks only when I'm unsure why a function is so frequently called. Stacks can sometimes even obscure hotspots, depending on how you're reading them: if a hot function distributes its cost among many different call stacks, it will not stick out in a top-down flame graph. (You'll have to do a bottom-up.)

There is another obstacle, one that applies even to profiling without call stacks: there are no symbols. We strip .symtab along with debug info. Without that, the stacks that you get are meaningless. However, if you meant to include that: I would like symbols to be kept (basically replace --strip by --strip-debug), and I think it would be less controversial: no runtime impact and only slightly larger binaries.

With my compiler hat on, frame pointers often don't make sense. If there are no VLAs and allocas, the compiler knows how large a stack frame is and can simply add (for stacks that grow down) that fixed size in the epilogue. The frame pointer on the stack contains no information that the compiler doesn't already have. Furthermore, the frame pointer could be overwritten by stack buffer overflows. That's not a very strong argument, because an attacker would much rather overwrite the return address, and we have stack protectors, but "don't compute at runtime what you can compute at compile-time".
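
To illustrate the point above about a hot function spreading its cost across many call stacks, here is a minimal Python sketch. The sample data and function names (parse, render, net, recv, compute, memcpy) are made up, and grouping by the complete stack is a simplification of what a real top-down flame graph does (it merges common prefixes), so take it as an illustration rather than a model of any particular tool.

```python
from collections import Counter

# Hypothetical samples: each one is a call stack, outermost caller first.
# "memcpy" is the hot leaf, but it is reached through several callers.
samples = (
    [("main", "parse", "memcpy")] * 3
    + [("main", "render", "memcpy")] * 3
    + [("main", "net", "recv", "memcpy")] * 3
    + [("main", "compute")] * 4
)


def top_down(samples):
    """Count samples per complete call stack (simplified top-down view)."""
    return Counter(samples)


def bottom_up(samples):
    """Count samples per leaf function, as a plain profile without stacks does."""
    return Counter(stack[-1] for stack in samples)


print(top_down(samples).most_common(1))
print(bottom_up(samples).most_common(1))
```

In the per-stack view the biggest single entry is main -> compute with 4 samples, and memcpy never exceeds 3 in any one stack. Grouped by leaf, memcpy accounts for 9 of the 13 samples.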
* The performance hit for having it vs not is insignificant[6].
This is a tricky argument. There are lots of things one could do that individually have little impact (maybe 1–10%), but those little things add up. One guy wants to add frame pointers for profiling, another wants stack protectors, the next guy wants automatic initialization of local variables (likely C++26) or mandatory boundary checks. They all claim that it adds just a little (on average).

But what if the benefit is also small? If we take the average cost (which is typically small for most things you can add) we should also take the average benefit. That might not be much larger since lots of people, even Linux users, will never run "perf record". Even among developers, profiling might be restricted to self-built binaries. This is not my own situation: I do profiling of packaged binaries quite regularly. But we should get ourselves out of the way and think about the larger user base. The people that will profile packaged binaries are likely the packagers themselves, so in this bubble we're a bit biased.
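
A rough sketch of how such small costs add up, in Python. The per-feature numbers are invented for illustration (they are not measurements of anything), and the sketch assumes each feature multiplies runtime independently, which real workloads need not obey.

```python
from math import prod


def combined_overhead(costs):
    """Total slowdown if each feature multiplies runtime by (1 + cost)."""
    return prod(1.0 + c for c in costs) - 1.0


# Made-up per-feature costs within the "1-10%" range mentioned above.
hypothetical_costs = {
    "frame pointers": 0.02,
    "stack protector": 0.03,
    "auto-initialized locals": 0.05,
    "bounds checks": 0.01,
}

total = combined_overhead(hypothetical_costs.values())
print(f"combined overhead: {total:.1%}")
```

With these assumed inputs, four features that each cost 5% or less combine to a bit over 11%.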
I want openSUSE to be a great place for people to develop and optimize workloads on, especially desktop ones, where most of the tooling we have for tracing and profiling is broken without frame pointers (see Sysprof and Hotspot from GNOME and KDE respectively, which both rely on frame pointers to have cheap real-time tracing for performance analysis).
I don't know about any of those, but I'd assume they're just GUIs around "perf"? If you have an Intel CPU since Haswell (Zen 4 or so should also have LBR, but I haven't tried it yet), "perf record --call-graph lbr" works even without frame pointers, and in my experience pretty reliably.

Just to be clear: I don't want this to be seen as an argument against frame pointers, but I think the case isn't terribly clear. It's a feature that mostly benefits package maintainers and other people who want to tweak the distro when they're trying to investigate performance issues, while the costs, however small they may be, are paid by everybody all the time.

Aaron