[opensuse-factory] GCC 4.6 for 12.1
Hi all -

I know someone else mentioned this briefly WRT Tumbleweed, but I'd
like to bring up for discussion the adoption of gcc 4.6 for Factory.

A bit of background. We currently maintain a kernel-trace flavor to
allow people to make use of the tracing facilities. Historically, there
has been overhead associated with these, which is why they were in their
own flavor. Most users don't care about them, so there's no sense in
slowing everyone down to satisfy them. Over time, the maintainers of
these facilities have worked to lower the overhead to the point where
now dynamic ftrace only adds memory overhead. It uses gcc's profiling
facility to insert a call to a profiling function at the beginning of
every function so that calls can be tracked. At boot, that profiling
function simply returns immediately, but during the boot process those
calls are hot-patched to nop operations so that they carry virtually
zero runtime overhead.
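
To make the patching idea concrete, here is a rough user-space analogy
in C. It is only a sketch: the kernel rewrites the call instruction
itself, while this models each call site as a function pointer that can
be swapped. All names (site_do_work, hook_stub, hook_record, and so on)
are made up for illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout. The real kernel patches instructions
 * in place; a swappable pointer only models the three states. */
typedef void (*trace_hook_t)(const char *func);

static unsigned long trace_hits;   /* entries seen while tracing is on */

/* Boot-time state: the hook exists but returns immediately. */
static void hook_stub(const char *func) { (void)func; }

/* Tracing state: record that the function was entered. */
static void hook_record(const char *func) { (void)func; trace_hits++; }

/* One patchable "call site" per instrumented function.
 * NULL stands in for a call that has been patched to a nop. */
static trace_hook_t site_do_work = hook_stub;

static int do_work(int x)
{
    if (site_do_work)               /* NULL models the nop: no call made */
        site_do_work(__func__);
    return x * 2;
}

static void tracing_disable(void) { site_do_work = NULL; }
static void tracing_enable(void)  { site_do_work = hook_record; }

/* Walk through the three states and return the number of traced calls. */
static unsigned long demo(void)
{
    assert(do_work(2) == 4);        /* stub is called, nothing recorded */
    tracing_disable();
    do_work(1);                     /* "nop": the hook is not called */
    tracing_enable();
    do_work(1);
    do_work(1);                     /* two recorded entries */
    return trace_hits;
}
```

The pointer check here still costs a load and a branch, which is
exactly what the real instruction patching avoids.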

This sounds great, right? There's one last bit, and that is actually the
reason why ftrace is still disabled in our regular kernel flavors. The
compiler's profiling feature calls the profiling function after the
function prologue. That means that the stack pointer has already been
advanced and since it can advance a different amount based on the needs
of each function, we don't have an easy way to get the caller's stack
frame back. That frame is needed to resolve where the caller was called
from, and the only way to get it is to dedicate a register to track the
start of the current stack frame. Dedicating a register for this increases
register contention and forces more accesses to CPU cache or main
memory, slowing everything down. So, it's not worth it when most users
will never actually take advantage of the feature that requires it.

For some time, I've been trying to come up with ways to work around this
so that the fast path stays fast and the tracing path takes any
performance hit necessary. Until recently, the performance hit would've
been too big to make it worthwhile. With gcc 4.6, we have the -mfentry
option which moves the call to the profiling function *before* the
function's prologue. This means that the caller's stack frame is still
intact and we can resolve the caller's parent without issue. I've spent
the past few days working up a proof of concept that takes advantage of
this and doesn't require a dedicated register. It turns out that Steven
Rostedt has already posted a patch set to do exactly this and he's
targeting 2.6.40 for inclusion in the mainline kernel. For now it looks
like it's x86-64 only but it should be ok for i386 as well.
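
The difference between the two hook placements can be sketched with a
simulated stack. This is a toy model in C, assuming the usual x86-64
layout where a call pushes the return address and the prologue pushes
the old frame pointer before reserving locals. The struct and function
names are hypothetical; only mcount and __fentry__ are the symbols gcc
actually calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy CPU state: indices into stack[], growing toward lower indices. */
struct sim_cpu {
    uint64_t stack[STACK_WORDS];
    size_t sp;   /* index of the top-of-stack word */
    size_t fp;   /* frame pointer; only valid if the prologue sets it */
};

static void push(struct sim_cpu *c, uint64_t v) { c->stack[--c->sp] = v; }

/* Plain -pg: the hook runs after the prologue. The stack pointer has
 * moved by a per-function amount, so the parent's return address is
 * found through the frame pointer. */
static uint64_t enter_mcount(struct sim_cpu *c, uint64_t parent_ip,
                             uint64_t self_ip, size_t locals)
{
    push(c, parent_ip);   /* caller executes: call traced_function */
    push(c, c->fp);       /* prologue: push %rbp */
    c->fp = c->sp;        /* prologue: mov %rsp,%rbp */
    c->sp -= locals;      /* prologue: sub $N,%rsp, N differs per function */
    push(c, self_ip);     /* call mcount */
    return c->stack[c->fp + 1];   /* word above the saved frame pointer */
}

/* -pg -mfentry: the hook runs before the prologue, so the parent's
 * return address sits at a fixed offset and no frame pointer is used. */
static uint64_t enter_fentry(struct sim_cpu *c, uint64_t parent_ip,
                             uint64_t self_ip)
{
    push(c, parent_ip);   /* caller executes: call traced_function */
    push(c, self_ip);     /* call __fentry__, first instruction */
    return c->stack[c->sp + 1];   /* always one word above the top */
}

/* Both paths should recover the same parent address, whatever the
 * frame size used in the mcount case. */
static int sim_check(void)
{
    struct sim_cpu a = { .sp = STACK_WORDS };
    struct sim_cpu b = { .sp = STACK_WORDS };
    struct sim_cpu c = { .sp = STACK_WORDS };

    assert(enter_mcount(&a, 0x1000, 0x2000, 5) == 0x1000);
    assert(enter_mcount(&b, 0x1000, 0x2000, 11) == 0x1000);
    assert(enter_fentry(&c, 0x1000, 0x2000) == 0x1000);
    return 1;
}
```

Note that enter_mcount() never needs to know the frame size only
because it reads through fp; take the frame pointer away and the offset
from sp to the parent address is different for every function.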

This would go a long way toward eliminating the -trace flavor from our
kernel packages. There are two benefits: it eliminates the maintenance
of another kernel flavor, and it allows everyone to use the tracing
facilities without installing a separate kernel release to do so.

One last bit is needed, though, and that is for gcc to allow using
-pg -mfentry with -fomit-frame-pointer. Once that bit is taken care of,
we can rid the kernel of the frame pointer requirement for ftrace and
eliminate the -trace flavor entirely.

-Jeff

Jeff Mahoney
