[Bug 1020301] New: dbus service restart in TW 20170112 update crashed systemd-logind
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301

Bug ID: 1020301
Summary: dbus service restart in TW 20170112 update crashed systemd-logind
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: x86-64
OS: SUSE Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: Upgrade Problems
Assignee: bnc-team-screening@forge.provo.novell.com
Reporter: martin.wilck@suse.com
QA Contact: jsrain@suse.com
Found By: ---
Blocker: ---

Created attachment 710346
--> http://bugzilla.opensuse.org/attachment.cgi?id=710346&action=edit
journalctl output during update

This is a follow-up to the discussion on the opensuse-factory ML, subject "CAREFUL: New Tumbleweed snapshot 20170112 released".

The problem occurred while I was running "zypper dup --no-allow-vendor-change" in a screen session in a gnome-terminal window under GNOME. Admittedly, this was risky business.

On my system, "systemctl daemon-reload" happened 9 times during the update which eventually crashed. But that alone wasn't fatal. The fatal problem was the restart of the dbus service, which caused various other services to be stopped and restarted as well, including systemd-logind. systemd and other services failed to create dbus connections. In the wake of these events, the gdm session and X crashed. systemd was again reloaded while services were restarted. systemd started to emit the error message "Looping too fast. Throttling execution a little" - probably while it was trying to restart systemd-logind. The restart of the systemd-logind service eventually failed, which explains why I wasn't able to log in on the console to see what went wrong.

The logs show that rpm continued updating packages in spite of these errors. Even the initrd seems to have been rebuilt. But no btrfs "post" transaction snapshot was created, so the zypper transaction didn't fully succeed.

AFAICS, the problem was caused by the restart of the dbus service in the %postuninstall section of the dbus-1 package. On my system, DISABLE_RESTART_ON_UPDATE in /etc/sysconfig/services has the default value "no", so in a way this behaved as I configured it. But maybe the dbus service should be an exception to this rule, or should be controlled by a separate option, e.g. "ENABLE_DANGEROUS_RESTART_ON_UPDATE='yes'".

A log excerpt is attached. At the end it shows my attempts to determine the status of the system using sysrq, as no login was possible any more.

I recovered from this problem by booting from a btrfs snapshot and doing the upgrade again in a text console.

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c1
--- Comment #1 from Martin Wilck
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c2
Dominique Leuenberger
> I said it above already but let me state it more clearly:
> I think there should be a category of "dangerous" services which aren't restarted even if DISABLE_RESTART_ON_UPDATE="no", and that dbus should be on this roster.

Not really - a service that knows of itself that it should not restart simply has to export DISABLE_RESTART_ON_UPDATE="yes" in the post scripts (as PackageKit does, for example).
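The opt-out described above can be sketched in shell. This is a minimal sketch, not the actual openSUSE macro code: the function name `restart_allowed` and its config-file argument (a stand-in for /etc/sysconfig/services) are hypothetical, and it assumes that a value already exported by the package scriptlet takes precedence over the one in the sysconfig file.

```shell
# Hypothetical helper: may a service be restarted during a package update?
# $1: path to a sysconfig-style file (stand-in for /etc/sysconfig/services).
restart_allowed() {
    (
        # Assumption: an already exported value wins; the file is only
        # consulted when the scriptlet has not set the variable itself.
        if [ -z "$DISABLE_RESTART_ON_UPDATE" ] && [ -f "$1" ]; then
            . "$1"
        fi
        [ "$DISABLE_RESTART_ON_UPDATE" != "yes" ]
    )
}
```

With the default `DISABLE_RESTART_ON_UPDATE="no"` in the file, the function returns success (restart allowed). After `export DISABLE_RESTART_ON_UPDATE=yes` in the same shell it returns failure, whatever the file says. The subshell keeps the sourced settings from leaking into the caller.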
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301
René Krell
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c3
Fabian Vogt
> (In reply to Martin Wilck from comment #1)
> > I said it above already but let me state it more clearly:
> > I think there should be a category of "dangerous" services which aren't restarted even if DISABLE_RESTART_ON_UPDATE="no", and that dbus should be on this roster.
>
> Not really - a service that knows of itself that it should not restart, has to simply export DISABLE_RESTART_ON_UPDATE="yes" in the post scripts (as does PackageKit for example)

Could that be done in a fixed dbus-1 package and submitted to :Update to prevent further issues?
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c4
--- Comment #4 from Dominique Leuenberger
> Could that be done in a fixed dbus-1 package and submitted to :Update to prevent further issues?

Whatever the solution, it will happen at least once more: the postun script of the packages installed on users' machines has the restart encoded. This script will be executed whenever dbus-1 is updated (again). Any handling via :Update only introduces more issues - not fewer.
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c5
--- Comment #5 from Fabian Vogt
> (In reply to Fabian Vogt from comment #3)
> > Could that be done in a fixed dbus-1 package and submitted to :Update to prevent further issues?
>
> Whatever solution: it will happen at least once more: the postun script of the packages installed on users machines have the restart encoded. This script will be executed whenever dbus-1 is being updated (again). Any handling with :Update only introduces more issue with it - not less.

No, as %postun and %preun of the old package get run _after_ the new package gets installed.

Best way would probably be to disable restarting dbus.service and dbus.socket completely and only allow reloading. This would also affect the %postun in turn. However, I have no idea how to do that.
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c6
--- Comment #6 from Dominique Leuenberger
> No, as %postun and %preun of the old package get run _after_ the new package gets installed.

So? It's still the old scripts that are being executed.
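Both statements are compatible with RPM's usual scriptlet ordering for an upgrade: the old package's %preun and %postun run after the new package's %post, but they are the scripts shipped with the old package. A small sketch that prints that order (the function name is hypothetical; %pretrans and triggers are omitted for brevity):

```shell
# Print the order in which RPM runs scriptlets when upgrading
# from an old package version ($1) to a new one ($2).
upgrade_scriptlet_order() {
    old=$1
    new=$2
    printf '%s\n' \
        "%pre of $new" \
        "%post of $new" \
        "%preun of $old" \
        "%postun of $old" \
        "%posttrans of $new"
}
```

Calling `upgrade_scriptlet_order dbus-1-old dbus-1-new` lists the old %postun in fourth place, which is why a restart encoded in the already installed package still runs once during the next update.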
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c7
--- Comment #7 from Martin Wilck
> Best way would probably be to disable restarting dbus.service and dbus.socket completely and only allow reloading. This would also affect the %postun in turn. However, I have no idea how to do that.
Like this, maybe?
%define _backup /etc/sysconfig/services.rpmbak.%{name}-%{version}-%{release}
%pre
if [[ "$FIRST_ARG" -gt 1 ]]; then
[...]
if [[ -f /etc/sysconfig/services ]]; then
cp -a /etc/sysconfig/services %{_backup}
else
touch %{_backup}
fi
cat >>/etc/sysconfig/services <
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c8
--- Comment #8 from Martin Wilck
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301
Simon Lees
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c9
--- Comment #9 from Dominique Leuenberger
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c10
--- Comment #10 from Simon Lees
> (In reply to Fabian Vogt from comment #5)
> > No, as %postun and %preun of the old package get run _after_ the new package gets installed.
>
> So? it's still the old scripts that are being executed.

I'll look into this during the week; I was away last week. From memory, even though the old %postun script may still be running, we might be able to export DISABLE_RESTART_ON_UPDATE="yes" beforehand, which would still work, but I need to experiment with it first.
http://bugzilla.opensuse.org/show_bug.cgi?id=1020301#c11
--- Comment #11 from Simon Lees
> (In reply to Fabian Vogt from comment #5)
> > Best way would probably be to disable restarting dbus.service and dbus.socket completely and only allow reloading. This would also affect the %postun in turn. However, I have no idea how to do that.
>
> Like this, maybe?
>
> %define _backup /etc/sysconfig/services.rpmbak.%{name}-%{version}-%{release}
> %pre
> if [[ "$FIRST_ARG" -gt 1 ]]; then
> [...]
> if [[ -f /etc/sysconfig/services ]]; then
> cp -a /etc/sysconfig/services %{_backup}
> else
> touch %{_backup}
> fi
> cat >>/etc/sysconfig/services <
>
> %posttrans
> if [[ -s %{_backup} ]]; then
> mv -f %{_backup} /etc/sysconfig/services
> elif [[ -e %{_backup} ]]; then
> rm -f /etc/sysconfig/services
> fi
This is basically what I implemented in https://build.opensuse.org/request/show/452341. You were pretty close, but you need a %global instead of a %define :-). I have also added "export DISABLE_RESTART_ON_UPDATE=yes" to the postun so we can drop the more complex code after everyone migrates to a newer version (the service files were added in the last update, which caused the issue).

(In reply to Dominique Leuenberger from comment #9)
> RefuseManualStop=true might be an interesting option for the dbus.service

It is an interesting option, so I've added it, but it doesn't affect this case: it doesn't block restarting a service, only manually stopping it.
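The backup-and-restore cycle from the quoted %pre and %posttrans scriptlets can be tried outside of RPM with plain shell. This is a rough sketch under stated assumptions: the function names and the path arguments (stand-ins for /etc/sysconfig/services and the %{_backup} file) are hypothetical, and because the heredoc in the quoted %pre is truncated, the appended override line is only a guess at what it contained.

```shell
# Mirrors the quoted %pre step: save the current file, then append an override.
# $1: config file (stand-in for /etc/sysconfig/services), $2: backup path.
override_services() {
    cfg=$1
    backup=$2
    if [ -f "$cfg" ]; then
        cp -a "$cfg" "$backup"
    else
        touch "$backup"        # empty backup marks "file did not exist before"
    fi
    # Assumption: the truncated heredoc appended a setting like this one.
    echo 'DISABLE_RESTART_ON_UPDATE="yes"' >>"$cfg"
}

# Mirrors the quoted %posttrans step: put the original state back.
restore_services() {
    cfg=$1
    backup=$2
    if [ -s "$backup" ]; then
        mv -f "$backup" "$cfg"
    elif [ -e "$backup" ]; then
        # Unlike the quoted scriptlet, this sketch also removes the empty marker.
        rm -f "$cfg" "$backup"
    fi
}
```

One property of this logic worth noting: an empty backup is treated as "no file existed", so a services file that existed but was empty before the update would be deleted rather than restored.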