backtraces, unwinding and changing consistency model
Hello,

if people follow the git repo of ulp they might have noticed that the
consistency model of checking library entries and exits was disabled.
This is to describe the background for that (if for nothing else than
us not forgetting the reasons :-) ).

So, the consistency model basically answers the question "is this
livepatch currently safe to activate for that thread?". There are many
methods that can be designed for this (including the trivial "yes!"
model), but fairly early on we settled on a model that tracks library
entries and exits, and only considers live-patch application safe if
the library affected by the live patch is not currently active (on the
call stack).

There are certain conditions and facts that need to be taken into
account for this to work:

* for library entry tracking you need exit tracking
* you need to do the tracking right from process start, even without
  any live patches loaded or active; otherwise you might see unbalanced
  entries/exits, and the library might be considered non-entered even
  when it is entered at patch application time
* we want the tracking to be per-thread, otherwise one thread might
  block application of the live patch for the whole process
  indefinitely

We do the entry tracking by function entry point redirection (for
exported functions). That's easy. Exit tracking is harder.
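The per-thread tracking described above can be sketched roughly like this. This is only a minimal illustration, not libpulp's code: the names ulp_depth, track_entry, track_exit and patch_safe_for_this_thread are made up for this example, and a real tracker would keep one such counter per tracked library.

```c
#include <stdbool.h>

/* Hypothetical per-thread nesting counter for one tracked library.
   It must be maintained from process start, otherwise entries and
   exits can become unbalanced. */
static _Thread_local unsigned ulp_depth;

/* Called on every entry into the library through an exported function. */
static void track_entry(void) { ulp_depth++; }

/* Called on every matching return out of the library. */
static void track_exit(void) { ulp_depth--; }

/* The consistency check: applying a patch is considered safe for this
   thread only while the library is not active on its call stack. */
static bool patch_safe_for_this_thread(void) { return ulp_depth == 0; }
```

Because the counter is thread-local, a thread that sits inside the library for a long time only delays patching for itself, not for the whole process.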
Without much toolchain help we can't redirect the return path, so we
resorted to frame stealing:

* in the entry tracker we modify the original return address (on the
  stack or in the return register) to point to the exit tracker, and
  store the original return address at $place; the exit tracker then
  restores things and ultimately returns to the original return
  address

Now, there are further considerations for $place:

* we can't modify the upward stack: it might hold function arguments
* we can't use the downward stack: it will be considered changeable by
  the called functions
* we can't allocate our own space on the stack: function arguments to
  the original callee are relative to the stack pointer (or a frame
  pointer derived from it), so changing the stack pointer would
  invalidate that. Even if we consider addresses of stack slots for
  argument passing to be undefined, we would still need to copy the
  incoming arguments to a new place, with the problem that the entry
  tracker doesn't know _how much_ to copy (in the general case, e.g.
  variadic functions, only the caller knows how much space the
  arguments needed on the stack).

So, we can't have $place be on the stack. But it needs to be
per-thread. So there's only one possibility: thread-local storage in
one way or another. That's indeed the solution taken by libpulp: it
stores the original return address in some TLS space, and all was good.

Well, except that backtraces don't work then. libpulp took the
precaution of marking the stolen frames as not backtraceable (i.e. as
stop points for backtraces), so at least you would see in gdb that the
exit tracker was the top-most frame and be reminded that something
special was going on.

But backtraces aren't purely a debug facility. They are used for frame
unwinding while throwing exceptions and for pthread cancellation, so
backtraces not working across shared library borders also means
exceptions and thread cancellation not working across shared library
borders. That's of course not acceptable.
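To make the frame-stealing bookkeeping above concrete, here is a small C model of it. It is only a simulation under loud assumptions: the return-address slot is an ordinary variable passed in by pointer, EXIT_TRACKER_ADDR is a made-up stand-in value, and steal_frame/restore_frame are hypothetical names, not libpulp's. A real implementation has to do this in assembly on the actual return slot or return register.

```c
#include <stddef.h>
#include <stdint.h>

#define STEAL_MAX_DEPTH 64
/* Made-up stand-in for the address of the exit tracker. */
#define EXIT_TRACKER_ADDR ((uintptr_t)0xE417)

/* $place: a per-thread stack of original return addresses, so that
   nested entries into tracked libraries can be handled. */
static _Thread_local uintptr_t steal_saved[STEAL_MAX_DEPTH];
static _Thread_local size_t steal_top;

/* Entry tracker: save the original return address in TLS and make the
   callee return into the exit tracker. Returns -1 if $place is full. */
static int steal_frame(uintptr_t *ret_slot)
{
  if (steal_top == STEAL_MAX_DEPTH)
    return -1;
  steal_saved[steal_top++] = *ret_slot;
  *ret_slot = EXIT_TRACKER_ADDR;
  return 0;
}

/* Exit tracker: fetch the original return address to jump back to.
   Returns 0 if nothing was saved (unbalanced entry/exit). */
static uintptr_t restore_frame(void)
{
  if (steal_top == 0)
    return 0;
  return steal_saved[--steal_top];
}
```

After steal_frame() the slot only holds the exit tracker's address, so an unwinder that reads the slot cannot find the real caller. That is exactly the backtrace problem described above.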
So, ideally we would fix backtraces to work with the entry/exit
tracking. It turns out that this isn't entirely trivial. During
unwinding there is one thing that needs to be done for each unwound
frame, amongst other things: given the current frame and register
state, get the return address of the calling frame. This requires
certain pieces of information about the frames, for which there are
multiple possibilities: windows uses a standardized code layout plus
custom info pieces per function, linux most often uses DWARF unwind
information to describe frame layout and return address (arm and
aarch64 use a more compact form).

The important thing to know is that the unwinder has some internal
state for the current frame, and when it tries to get the return
address for it, it is essentially given a symbolic expression (on
linux a DWARF expression) that it interprets to calculate the value.
(One example of such a symbolic expression would be: "add 8 to the
frame pointer; that's the place containing the current return
address".) This expression comes from the program itself (e.g. from the
.eh_frame section), so the program can specify how the unwinder needs
to calculate things according to how the program was produced.

So, for the stolen frame the return address is stored at some TLS
place, and that's what the DWARF expression for the return address
needs to say. Luckily there are DWARF operations that specify TLS
addresses: DW_OP_GNU_push_tls_address and DW_OP_form_tls_address, i.e.
a GNU extension or DWARF v3, so we can describe the situation we have
in the unwind information.

Very and extremely unfortunately our normal unwinder (in libgcc_s)
doesn't support this operation in its expression interpreter :-/
What's worse, its structure is basically this:

    switch (dwarf->opcode) {
      ... nothing with tls ...
      default: abort();
    }

i.e. whenever the current unwinder hits a frame that uses the DWARF
TLS opcodes it aborts the program.
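For illustration, here is a toy expression interpreter in the same style. It is not the libgcc_s code, just a sketch under assumptions: eval_dwarf_expr is a made-up name, only two opcodes are implemented, and the value to start from (say, the frame pointer) is assumed to be pre-pushed on the stack. The opcode numbers are the real DWARF values. The only point is the default case: an unknown opcode such as DW_OP_form_tls_address is reported to the caller instead of aborting the process.

```c
#include <stddef.h>
#include <stdint.h>

/* Real DWARF opcode values; only a tiny subset is interpreted here. */
enum {
  DW_OP_const1u          = 0x08, /* push a 1-byte unsigned constant */
  DW_OP_plus             = 0x22, /* pop two values, push their sum */
  DW_OP_form_tls_address = 0x9b  /* deliberately not implemented */
};

/* Evaluate expr[0..len) with `initial` pre-pushed on the stack.
   Returns 0 and stores the top of stack in *result on success,
   -1 on malformed input or an unsupported opcode. */
static int eval_dwarf_expr(const uint8_t *expr, size_t len,
                           uint64_t initial, uint64_t *result)
{
  uint64_t stack[16];
  size_t sp = 0;

  stack[sp++] = initial;
  for (size_t i = 0; i < len; ) {
    uint8_t op = expr[i++];
    switch (op) {
    case DW_OP_const1u:
      if (i >= len || sp == 16)
        return -1;            /* missing operand or stack overflow */
      stack[sp++] = expr[i++];
      break;
    case DW_OP_plus:
      if (sp < 2)
        return -1;            /* stack underflow */
      stack[sp - 2] += stack[sp - 1];
      sp--;
      break;
    default:
      return -1;              /* unknown opcode: fail, don't abort() */
    }
  }
  *result = stack[sp - 1];
  return 0;
}
```

With the expression { DW_OP_const1u, 8, DW_OP_plus } this computes "initial value plus 8", like the frame pointer example above. An expression containing DW_OP_form_tls_address makes it return -1, so only the one unwind fails.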
Aborting the whole program is even worse than unwinding not working
for one thread.

Now, we could fix that unwinder (which we'll do). That leaves an
unknown number of other unwinders in the world: they might be
statically linked variants of the libgcc_s unwinder, or they might be
completely different unwinders from unknown sources; but it's
reasonable to assume that they don't implement the TLS opcodes either
(no matter whether they abort on unknown opcodes or not). Either way,
simply using the DWARF TLS opcodes in our libraries exposes us to
potential and unknown instability due to non-support in the unwinders,
something that live patching is supposed to protect against ;-)

We have no real solution to the above; a few can be imagined though:

* fix all unwinders to accept TLS opcodes
* make $place not be TLS (seems impossible, but who knows?)
* don't steal frames: exit tracking then needs to be done differently,
  e.g. by toolchain improvements that also allow redirecting the
  return paths of functions
* don't do entry/exit tracking at all

For now we resort to the last of these: ditch entry/exit tracking.
This gives us a much weaker consistency model, and hence more care
needs to be applied when constructing a live patch. We deem that to be
okay for now. We want to make it so that a live patch can request a
certain consistency model, so that in the future an entry/exit tracker,
or something completely different, can be (re)implemented and enlarge
the set of possible live patches. (One obvious alternative consistency
checker would be one that looks at backtraces itself to see if
problematic functions are currently active.)

There's an advantage to not doing library entry/exit tracking:
performance. The indirection through the tracking code, in particular
the need to use TLS for the original return address and the
"we're-in-that-lib" flag, is not cheap.
Together with the fact that lib entry/exit has to be tracked right
from process start, this makes the performance impact of userspace
live patching noticeable (not terribly so, but measurable). Obviously
without such tracking we don't pay that cost at all. At least
something ;-)

Ciao,
Michael.
Michael Matz