On Thursday, 6 May 2021, 16:57:17 CEST, Jimmy Berry wrote:
On Wednesday, May 5, 2021 3:00:06 PM CDT Thorsten
- Is the bug against TW, since it updated a package such that the container
would be incompatible with other hosts such as Leap, thus breaking the
fundamental feature of a TW container?
No, the change in TW was correct.
Alright, so TW containers are not for serious workloads since incredibly
common functionality can be "correctly" broken for months. Understood, thanks.
The "correctness" here is not about nitpicking interpretations of some
standard, it's literally this:
faccessat2(AT_FDCWD, "/", R_OK, 0) returns -EPERM for everything!
That applies to *all* syscalls introduced after a certain point in the past and
thus only gets worse in the future. For non-coders: It's like running on a CPU
with some pins cut off.
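
To check whether a given host or runtime shows this behaviour, the call can be probed directly. This is only a rough sketch, not something from this thread: it assumes Linux, that faccessat2 has syscall number 439 on the architecture in use, and the helper names (classify, probe) are made up for illustration. As far as I understand it, the point is that a caller can only sensibly fall back to the older faccessat when it gets ENOSYS; EPERM looks like a genuine permission error.

```python
import ctypes
import errno
import os

SYS_FACCESSAT2 = 439  # assumption: Linux syscall number on common architectures
AT_FDCWD = -100       # resolve relative paths against the current directory


def classify(rc, err):
    """Interpret the raw result of a faccessat2 probe (hypothetical helper)."""
    if rc == 0:
        return "supported"
    if err == errno.ENOSYS:
        return "not implemented, caller can fall back to faccessat"
    if err == errno.EPERM:
        return "blocked (EPERM), no fallback is triggered"
    return "failed: " + os.strerror(err)


def probe(path=b"/"):
    """Invoke faccessat2 directly, bypassing the libc wrapper."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.syscall(
        ctypes.c_long(SYS_FACCESSAT2),
        ctypes.c_long(AT_FDCWD),
        ctypes.c_char_p(path),
        ctypes.c_long(os.R_OK),
        ctypes.c_long(0),
    )
    return classify(rc, ctypes.get_errno())


if __name__ == "__main__":
    print(probe())
```

Run inside the container on the host in question. Since "/" should always be readable, an EPERM result here points at the runtime filtering the syscall rather than at file permissions.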
TW blocks for all sorts of ring failures caused by upgrades, many times a
"correct" change to one package that exposes a bug in another. The whole point of QA
and our promise to users is that openSUSE does not release a borked TW
intentionally. Apparently this does not apply to the containers and should be
It does. openQA currently fully tests Tumbleweed containers only on Tumbleweed,
and that always worked. Leap and SLE also perform some tests with the
Tumbleweed container, but those didn't actually hit this particular bug as they
don't go deep enough.
I filed a ticket about testing TW containers on other distros a while ago:
However, this was caught pretty soon after the release, so even if we had
been aware of the breakage before publishing the affected snapshot, it would
still have required a fix:
a) If caught before publishing, revert the breaking change? This was caused by
a glibc version update, which is rather painful to revert.
b) Apply a workaround to glibc? The maintainer was *vehemently* opposed to that
option, so that was the end of that. AFAICT a temporary workaround at least
for x86_64 would've been feasible, especially considering the severity and
c) Block Tumbleweed publishing until the runtime is fixed? While this is
potentially an option for Leap and SLE because we can handle it ourselves
(or at least influence it in a meaningful way), this is not really the case
for third-party platforms like GitHub.
I would also blame the container runtime and providers there, because the TW
container did everything correctly. The nature of the buggy behaviour in the
runtime also made it rather tricky to address, because it was actually a whole
set of syscalls which were broken in totally unexpected ways. Unfortunately
it took quite some time to get this fixed upstream (but before this landed in
TW) and even longer to actually have it deployed in third-party
Other distributions using glibc 2.33 versions were affected in the exact same