On Thu, Nov 28, 2024 at 02:08:25AM +0100, Aaron Puchert wrote:
> On 27.11.24 at 04:42, Andrii Nakryiko wrote:
>> On Tue, Nov 26, 2024 at 6:43 PM Aaron Puchert <aaronpuchert@alice-dsl.net> wrote:
>>> On 25.11.24 at 23:59, Andrii Nakryiko wrote:
>>>> [...] DWARF's .eh_frame-based stack unwinding tables. The latter
>>>> actually doesn't lend itself well to system-wide profiling, and
>>>> involves compromises in memory usage, CPU usage, and just plain
>>>> reliability (no one will guarantee that you captured all of the
>>>> stack, for example).
>>>
>>> The first two arguments are problematic: remember that we're talking
>>> about frame pointers, which cost CPU for all users. On one hand you're
>>> [...]
>>
>> But how much CPU does it actually cost? You can see 1-2% in
>> *benchmarks*, [...]
> Sure, that was my understanding. But 1% across everything is not a
> small regression. I know you're arguing that lots of tools don't even
> support DWARF unwinding, but let's just focus on those that do: even
> if 2% of users do profiling, and the overhead is 20%, we get an
> average CPU usage cost of 0.4% for DWARF, which is still below 1%.
> (That's assuming those users are profiling all the time.) So total
> CPU usage across all users is going to go up if we enable FP.
That's not how it works. Most code is used rarely, and the performance
of such code is largely irrelevant, even for slowdowns much bigger than
1%. On the other hand, there are code paths which are performance
critical and in need of optimization.

Having frame pointers results in more, and easier to use, tools for
identifying the code in which performance matters. That is the first
step towards optimizing it, and towards reducing the total CPU time
needs of the system, for all users.

It does not have to be frame pointers in general. They just happen to
be a feature that helps make performance measurement more usable today.

Thanks
Michal