[opensuse-kernel] RFC: dwarf2 unwinder for 10.3
Hello,

Should we add the dwarf2 unwinder or not for 10.3?

Kernel oopses use approximate backtracing and also report old leftover garbage on the stack. In some cases this can make oopses very hard to read.

Jan Beulich ported the NLKD dwarf2 unwinder to the kernel to solve this problem. It was merged into mainline, but due to some early teething problems Linus pulled it back out. In the end it was relatively stable, though.

I'm wondering if we should add it to the openSUSE 10.3 kernel anyway.

Advantages:
- Better backtraces for kernel oopses. This saves developer time, which is very valuable.
- Would help test it further.

Disadvantages:
- Might still have bugs (but in that case we fall back and should not lose any information).
- Increases the in-memory kernel binary size by ~10% for the unwind information.

Opinions?

-Andi
--
To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-kernel+help@opensuse.org
On Wed, Aug 15, Andi Kleen wrote:
> Should we add the dwarf2 unwinder or not for 10.3?
Yes, I want to see it in 10.3.
> Kernel oopses use approximate backtracing and also report old leftover garbage on the stack. In some cases this can make oopses very hard to read.
> Jan Beulich ported the NLKD dwarf2 unwinder to the kernel to solve this problem.
> It was merged into mainline, but due to some early teething problems Linus pulled it back out. In the end it was relatively stable, though.
> I'm wondering if we should add it to the openSUSE 10.3 kernel anyway.
> Advantages:
> - Better backtraces for kernel oopses. This saves developer time, which is very valuable.
> - Would help test it further.
> Disadvantages:
> - Might still have bugs (but in that case we fall back and should not lose any information).
> - Increases the in-memory kernel binary size by ~10% for the unwind information.
IMHO the 10% more memory for unwind information is a good investment for better backtraces. Since our distro isn't really optimized for a small memory footprint, I don't think it hurts very much.

How long does a backtrace take compared to the old code? I see you already take care of the NMI. I guess it still takes too long to be called for oprofile's callgraph backtrace? Is there any way we can speed this up?
> How long does a backtrace take compared to the old code? I see you already take care of the NMI. I guess it still takes too long to be called for oprofile's callgraph backtrace? Is there any way we can speed this up?
After initial performance problems (IIRC lockdep added heavy use of the unwind routines), I added creation of a binary lookup table (fairly recent binutils can create it at build time), so performance shouldn't be that bad anymore.

Jan
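To illustrate why a binary lookup table helps here, the following is a minimal sketch of a binary search that maps a program counter to its unwind record. It is not the code from the actual patch: the `unwind_entry` layout, the `info` field, and the `unwind_lookup` name are all hypothetical, and the only assumption is that the table is sorted by start address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one per function, sorted by start address. */
struct unwind_entry {
    uintptr_t start;  /* first address covered by this entry */
    uint32_t  info;   /* hypothetical handle for the unwind record */
};

/*
 * Return the last entry whose start address is <= pc,
 * or NULL if pc lies below the first entry (or the table is empty).
 */
static const struct unwind_entry *
unwind_lookup(const struct unwind_entry *tbl, size_t n, uintptr_t pc)
{
    size_t lo = 0, hi = n;  /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tbl[mid].start <= pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    /* lo now equals the number of entries with start <= pc. */
    return lo ? &tbl[lo - 1] : NULL;
}
```

With a sorted table, each lookup costs O(log n) comparisons instead of a linear walk over all unwind records, which matters when a caller such as lockdep unwinds very often.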
> How long does a backtrace take compared to the old code?
It's slower; it might be worth testing whether it disturbs oprofile too much. I don't have up-to-date performance data. On the other hand, the SUSE kernel currently doesn't have your patch that stops oprofile from using its own backtracer, so oprofile wouldn't use it. Apart from oprofile, lockdep uses it a lot, so it might make the debugging kernel somewhat slower.
> I see you already take care of the NMI.
Did I?
> I guess it still takes too long to be called for oprofile's callgraph backtrace? Is there any way we can speed this up?
Jan already fixed the worst performance problem in the original implementation. But it will always be slower, because it processes much more data than a dumb unwinder.

-Andi
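For contrast, here is a rough sketch of what a "dumb" approximate backtrace amounts to: scan the stack and report every word that happens to point into kernel text. This is an illustration only, not the kernel's actual code; `text_range` and `scan_stack` are hypothetical names. It does very little work per stack word, which is why it is fast, but it also reports stale return addresses left behind by earlier calls.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds of the kernel text section. */
struct text_range {
    uintptr_t start;
    uintptr_t end;  /* exclusive */
};

/*
 * Approximate backtrace: copy every stack word that points into text
 * to out[], up to max_out entries. No frame information is consulted,
 * so leftover addresses from older calls are reported as well.
 * Returns the number of candidates stored.
 */
static size_t
scan_stack(const uintptr_t *stack, size_t words,
           const struct text_range *text,
           uintptr_t *out, size_t max_out)
{
    size_t found = 0;

    for (size_t i = 0; i < words && found < max_out; i++) {
        uintptr_t w = stack[i];

        if (w >= text->start && w < text->end)
            out[found++] = w;
    }
    return found;
}
```

A dwarf2 unwinder instead decodes the call frame information for each frame to find the real return address, so it reads and interprets far more data per frame.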
participants (3): Andi Kleen, Jan Beulich, Jan Blunck