Hello,
If you follow the git repo of ulp you might have noticed that the
consistency model based on tracking library entries and exits was
disabled. This mail describes the background for that (if for nothing else
than so that we don't forget the reasons :-) ).
So, the consistency model basically answers the question "is this
live patch currently safe to activate for that thread?". Many methods can
be designed for this (including the trivial "yes!" model), but fairly
early on we settled on a model that tracks library entries and exits and
only considers live-patch application safe if the library affected by the
live patch is not currently active (on the call stack).
There are certain conditions and facts that need to be taken into account
for this to work:
* for library entry tracking you also need exit tracking
* you need to do the tracking right from process start, even without any
  live patches loaded or active; otherwise you might see unbalanced
  entries/exits, and the library might be considered non-entered even
  when it is entered at patch application time
* we want the tracking to be per-thread, otherwise one thread might block
  application of the live patch for the whole process indefinitely
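As a rough illustration of these three conditions, here is a minimal sketch
of a per-thread nesting counter. All names are hypothetical and not taken
from libpulp; the real trackers are reached through redirected entry points
rather than explicit calls.

```c
#include <stdbool.h>

/* Hypothetical per-thread nesting depth for one tracked library.
   Thread-local, so one busy thread does not block all others. */
static _Thread_local unsigned lib_depth;

/* Would be run by the entry tracker on every call into the library,
   right from process start, whether or not a patch is loaded. */
void track_entry(void)
{
    lib_depth++;
}

/* Would be run by the exit tracker; without it the depth could never
   drop back to zero, which is why entry tracking needs exit tracking. */
void track_exit(void)
{
    lib_depth--;
}

/* The patch is considered safe for the calling thread only while the
   library is not active on that thread's call stack. */
bool patch_safe_for_this_thread(void)
{
    return lib_depth == 0;
}
```

If tracking started late, a thread already inside the library would show a
depth of zero, which is exactly the unbalanced case described above.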
We do the entry tracking by function entry point redirection (for exported
functions). That's easy. Exit tracking is harder. Without much
toolchain help we can't redirect the return path, so we resorted to frame
stealing:
* in the entry tracker we modify the original return address (on the stack
  or in the return register) to point to the exit tracker, and store the
  original return address at $place; the exit tracker then restores
  things and ultimately returns to the original return address
Now, there are further considerations for $place:
* we can't modify the upward stack: it might hold function arguments
* we can't use the downward stack: the called functions will consider it
  free to change
* we can't allocate our own space on the stack: function arguments to the
  original callee are relative to the stack pointer (or a frame pointer
  derived from it), so changing the stack pointer would invalidate that.
  Even if we consider addresses of stack slots for argument passing to be
  undefined we would still need to copy the incoming arguments to a new
  place, with the problem that the entry tracker doesn't know _how much_
  to copy (in the general case, e.g. variadic functions, only the caller
  knows how much space the arguments needed on the stack).
So, we can't have $place be on the stack. But it needs to be per-thread.
So, there's only one possibility: thread-local storage in one way or
another. That's indeed the solution taken by libpulp: it stores the
original return address in some TLS space, and all was good.
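To make the frame stealing a bit more concrete, here is a simulation in
plain C. It is only a sketch under assumptions: the real trackers are
assembly stubs working on the actual return-address slot, whereas this
models the slot as an ordinary variable. The names, the fixed-size
per-thread stack (to cope with nested calls) and its limit are all made up
for illustration.

```c
#include <stddef.h>

#define MAX_NESTING 64  /* arbitrary limit chosen for this sketch */

typedef void (*ret_addr_t)(void);

/* $place: per-thread storage for the stolen return addresses. */
static _Thread_local ret_addr_t saved_ret[MAX_NESTING];
static _Thread_local size_t saved_top;

/* Stand-in for the code address of the exit tracker. */
void exit_tracker_stub(void)
{
}

/* Entry tracker: remember the original return address in TLS and make
   the slot point at the exit tracker instead.
   Returns 0 on success, -1 if the sketch's TLS stack is full. */
int steal_frame(ret_addr_t *ret_slot)
{
    if (saved_top == MAX_NESTING)
        return -1;
    saved_ret[saved_top++] = *ret_slot;
    *ret_slot = exit_tracker_stub;
    return 0;
}

/* Exit tracker: hand back the address to really return to, or NULL if
   nothing was stolen on this thread. */
ret_addr_t restore_frame(void)
{
    if (saved_top == 0)
        return NULL;
    return saved_ret[--saved_top];
}
```

Note that neither the caller's nor the callee's stack layout is touched;
only the one return-address slot changes, and everything else lives in TLS.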
Well, except that backtraces don't work then. libpulp took the precaution
of marking the stolen frames as not backtraceable (i.e. as stop points for
backtraces), so at least you would see in gdb that the exit tracker was the
top-most frame and be reminded that something special was going on.
But backtraces aren't purely a debug facility. They are used for
frame unwinding while throwing exceptions and for pthread cancellation, so
backtraces not working across shared library borders also means exceptions
and thread cancellation not working across shared library borders. That's
of course not acceptable.
So, ideally we would fix backtraces to work with the entry/exit
tracking. It turns out that this isn't entirely trivial. During the process
of unwinding there is one thing that needs to be done for each unwound
frame, amongst other things: given the current frame and register state,
get the return address of the calling frame. This requires certain pieces
of information about the frames, for which there are multiple
possibilities: Windows uses standardized code layout plus custom info
pieces per function, Linux most often uses DWARF unwind information to
describe frame layout and return address (32-bit ARM uses a more compact
form).
The important thing to know is that the unwinder has some internal state
of the current frame and when trying to get the return address for it, it
essentially is given a symbolic expression (on Linux a DWARF expression)
that the unwinder interprets to calculate the value. (One example of such
symbolic expression would be: "add 8 to the frame pointer; that's the
place containing the current return address"). This expression comes from
the program itself (e.g. in the .eh_frame section), so the program itself
can specify how the unwinder needs to calculate things according to how
the program was produced.
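To illustrate what "interpreting a symbolic expression" means, here is a
toy evaluator for a tiny subset of DWARF expressions. It is a sketch, not
the libgcc_s code: the opcode values are the standard DWARF ones, but the
function name, the fixed stack size and the restriction to single-byte
operands are simplifications made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    DW_OP_deref = 0x06, /* replace top of stack by the word it points to */
    DW_OP_breg6 = 0x76  /* push reg 6 (frame pointer on x86-64) + offset */
};

/* Evaluate expr against the given frame pointer value.
   Returns 0 and stores the result in *result, or -1 on anything this
   sketch does not understand. */
int eval_expr(const uint8_t *expr, size_t len, uintptr_t frame_ptr,
              uintptr_t *result)
{
    uintptr_t stack[16];
    size_t sp = 0;

    for (size_t i = 0; i < len; ) {
        uint8_t op = expr[i++];
        switch (op) {
        case DW_OP_breg6: {
            if (i >= len || sp == 16)
                return -1;
            uint8_t b = expr[i++];
            if (b & 0x80)   /* multi-byte SLEB128 not handled here */
                return -1;
            intptr_t off = (b & 0x40) ? (intptr_t)b - 128 : (intptr_t)b;
            stack[sp++] = frame_ptr + (uintptr_t)off;
            break;
        }
        case DW_OP_deref:
            if (sp == 0)
                return -1;
            stack[sp - 1] = *(const uintptr_t *)stack[sp - 1];
            break;
        default:            /* unknown opcode: report failure */
            return -1;
        }
    }
    if (sp == 0)
        return -1;
    *result = stack[sp - 1];
    return 0;
}
```

With the byte sequence { DW_OP_breg6, 8, DW_OP_deref } this computes "load
the word at frame pointer + 8", i.e. the example expression given above.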
So, for the stolen frame the return address is stored at some TLS place,
so that's what the DWARF expression for the return address needs to say.
Luckily there are DWARF operations that specify TLS addresses:
DW_OP_GNU_push_tls_address and DW_OP_form_tls_address, i.e. a GNU
extension or DWARF v3, so we can describe the situation we have in the
unwind information.
Very unfortunately, our normal unwinder (in libgcc_s) doesn't
support these operations in its expression interpreter :-/ What's worse,
its structure is basically this:
  switch (dwarf->opcode) {
    ... nothing with tls ...
    default: abort();
  }
i.e. whenever the current unwinder hits a frame that uses the
DWARF TLS opcodes it aborts the program. That's even worse than
unwinding not working for one thread.
Now, we could fix that unwinder (which we'll do). That leaves an unknown
number of other unwinders in the world: they might be statically linked
variants of the libgcc_s unwinder, or they might be completely different
unwinders from unknown sources; but it's reasonable to assume that they
don't implement the TLS opcodes either (no matter if they abort on unknown
opcodes or not). Either way, simply using the DWARF TLS opcodes in our
libraries exposes us to potential and unknown instability due to
non-support in the unwinders, something that live patching is supposed to
protect against ;-)
We have no real solution to the above; a few can be imagined though:
* fix all unwinders to accept TLS opcodes
* make $place not be TLS (seems impossible, but who knows?)
* don't steal frames: exit tracking then needs to be done differently,
  e.g. by toolchain improvements to also be able to redirect the return
  paths of functions
* don't do entry/exit tracking at all
For now we resort to the last of these and ditch entry/exit tracking. This
gives us a much weaker consistency model, and hence more care needs to be
applied when constructing a live patch. We deem that to be okay for now.
We want to make it so that a live patch can request a certain consistency
model, so that in the future an entry/exit tracker, or something
completely different, can be (re)implemented and enlarge the set of
possible live patches.
(One obvious alternative consistency checker would be one that looks at
the backtraces themselves to see if problematic functions are currently
active).
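A minimal sketch of how a requested model and such a backtrace-based
check could fit together is below. Everything in it is hypothetical
(model names, struct layout, function names); nothing like this exists in
ulp as described here. The return addresses are passed in as a plain
array; how they are collected (for example with a stack walker) is left
out on purpose.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identifiers for the model a live patch could request. */
enum consistency_model {
    MODEL_NONE,      /* the trivial "yes!" model */
    MODEL_BACKTRACE  /* inspect the thread's call stack */
};

/* Address range [start, end) of the code affected by a patch. */
struct code_range {
    uintptr_t start;
    uintptr_t end;
};

/* True if any return address of the thread lies inside the range. */
static bool stack_touches(const uintptr_t *frames, size_t n,
                          struct code_range r)
{
    for (size_t i = 0; i < n; i++)
        if (frames[i] >= r.start && frames[i] < r.end)
            return true;
    return false;
}

/* Answer "is this live patch currently safe to activate for that
   thread?" according to the requested model. */
bool patch_safe(enum consistency_model model, const uintptr_t *frames,
                size_t n, struct code_range r)
{
    switch (model) {
    case MODEL_NONE:
        return true;
    case MODEL_BACKTRACE:
        return !stack_touches(frames, n, r);
    }
    return false;  /* unknown model: refuse to patch */
}
```

Refusing unknown models is a design choice of this sketch: a patch asking
for a stronger model than the runtime offers should not be applied under a
weaker one.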
There's an advantage to not doing library entry/exit tracking:
performance. The indirection through the tracking code, in particular the
need to use TLS for the original return address and the
"we're-in-that-lib" flag, is not cheap. That, together with the fact that
lib entry/exit has to be tracked right from process start, makes the
performance impact of userspace live patching noticeable (not terribly
so, but measurable). Obviously without such tracking we don't pay that
cost at all. At least something ;-)
Ciao,
Michael.