On Mon, Nov 18, 2024 at 3:30 AM Richard Biener <rguenther@suse.de> wrote:
On Sat, 16 Nov 2024, Neal Gompa wrote:
Hello,
I would like to change openSUSE's default build flags in rpm and rpm-config-SUSE to incorporate frame pointers (and equivalents on other architectures) to support real-time profiling and observability on SUSE distributions.
In practice, this would mean essentially adopting the same tunables that exist in Fedora[1] for openSUSE to turn them on by default and to allow packages or OBS projects to selectively opt out as needed.
The reasons to do so are threefold:
* It is a major competitive disadvantage for us to lack the ability to do cheap real-time profiling and observability. Fedora[2], Ubuntu[3], Arch Linux[4], and AlmaLinux[5] all now do this, and thus support this capability.
* The performance hit for having it vs. not is insignificant[6].
* There are new tools in both the cloud-native and regular systems development worlds that leverage this, and openSUSE should be an enabler of those technologies.
I want openSUSE to be a great place for people to develop and optimize workloads on, especially desktop ones, where most of the tooling we have for tracing and profiling is broken without frame pointers (see Sysprof and Hotspot from GNOME and KDE respectively, which both rely on frame pointers to have cheap real-time tracing for performance analysis). And given that we advertise as the "makers' choice", I think it would definitely be on-brand for us to have the capability to better support makers and shakers.
For those interested in more detail about frame pointers, Brendan Gregg has a decent post about it[7]. I have also submitted a parallel request for openSUSE Leap 16 to have this feature enabled too[8].
I truly believe this would give us an even better footing with the broader community of developers and operators and make SUSE distributions very attractive for FOSS and proprietary software alike.
It seems this is optimizing the system for profiling it rather than using it, which is an odd thing to do. I realize not having frame pointers can make accurate profiling more difficult (but I do this every day); still, taking a 1-10% hit on cpython seems bad.
It's not just about making accurate profiling easier, it's also about making it cheap. Frame pointers make it so you can sample at any point on a running system. Users can do sampling as they observe problems and report to developers. Developers and operators can observe with real workloads without impacting the system configuration. This is why the Go compiler has had frame pointers on by default for 6 years. It's also why other operating systems have frame pointers on for non-x86_32. Linux is the outlier.
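To make the "cheap" part concrete, here is a minimal sketch of what a frame-pointer unwinder does at each sample. It is a Python simulation over a made-up stack layout, not real profiler code: the `memory` dict, the addresses, and the `unwind` helper are all hypothetical, and the layout assumes the x86-64 convention where the saved caller frame pointer sits at FP and the return address at FP + 8.

```python
# Simulated stack memory: address -> 8-byte word (hypothetical values).
# Assumed layout per frame: [FP] = caller's saved FP, [FP + 8] = return address.

WORD = 8


def unwind(memory, fp, max_depth=64):
    """Walk the frame-pointer chain and return the list of return addresses."""
    trace = []
    for _ in range(max_depth):
        # A zero or unmapped frame pointer marks the end of the chain.
        if fp == 0 or fp not in memory:
            break
        ret = memory.get(fp + WORD)
        if ret is None:
            break
        trace.append(ret)
        next_fp = memory[fp]
        # The stack grows down, so caller frames live at higher addresses;
        # a non-increasing frame pointer means the chain is corrupt.
        if next_fp != 0 and next_fp <= fp:
            break
        fp = next_fp
    return trace


memory = {
    0x1000: 0x1020, 0x1008: 0x400123,  # innermost frame
    0x1020: 0x1040, 0x1028: 0x400456,  # its caller
    0x1040: 0x0,    0x1048: 0x400789,  # outermost frame, chain ends here
}

print([hex(addr) for addr in unwind(memory, 0x1000)])
```

Each step is two memory reads and a comparison, with no debug-info lookup, which is why this kind of walk can be done at sample time on a running system.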
I also object to enforce this for x86 32bit which is a very register starved architecture (x86-64 is only slightly better in this regard).
Sure, we can leave it out by default for i586/x86_32.
I'll note -mno-omit-leaf-frame-pointer is x86 specific - is the proposal only directed to x86 and x86-64?
No. As I said, the idea is to use equivalent flags on all supported architectures.
How do you enforce frame pointers for JITed code, for code generated by compilers that are not GCC (rust, golang, etc.)? Or why do you choose to "ignore" profiling those?
Go already does this and has for most of the past decade (which is why cloud-native observability tools even exist at all). I'd like to turn it on in Rust as well. In Fedora, it *was* turned on in Rust, controlled by the same rpm macro that affected the compiler flags for relevant architectures[1].
[1]: https://pagure.io/fedora-rust/rust-packaging/blob/1402e757e3200e6f06b8a9c0db...
In any case - I propose shipping all packages with debug info included since that greatly improves the profiling experience - even more so than enabling frame pointers. Bandwidth and disk are cheap these days.
Actually, DWARF-based profiling isn't very good, which is why so few people do it. It's slow and memory intensive, so its sampling capability is much poorer. Richard W.M. Jones did a decent comparison of this and the problems with DWARF profiling versus leveraging frame pointers[2].
Also, disk space available on systems has curiously enough remained mostly the same in 20 years. There was a brief period where storage *did* go up, but the introduction of flash storage brought storage back down on most computer systems. It's also wildly expensive to have a lot of storage on portable computers, which are now what most people have.
And bandwidth costs vary based on what part of the world you live in. It's cheaper in Europe than it is in the Americas and especially in Asia and Africa.
[2]: https://rwmj.wordpress.com/2023/02/14/frame-pointers-vs-dwarf-my-verdict/
-- 
真実はいつも一つ!/ Always, there's only one truth!