[Bug 1018399] New: SUSE-RU-2017:0013-1 causes assert in PID1
http://bugzilla.opensuse.org/show_bug.cgi?id=1018399

            Bug ID: 1018399
           Summary: SUSE-RU-2017:0013-1 causes assert in PID1
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: 13.2
          Hardware: x86-64
                OS: SLES 12
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Basesystem
          Assignee: bnc-team-screening@forge.provo.novell.com
          Reporter: wullinger@rz.uni-kiel.de
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:50.0) Gecko/20100101 Firefox/50.0
Build Identifier:

As requested, this is a new bug report regarding the update that was supposed to fix bug 909418 [https://bugzilla.suse.com/show_bug.cgi?id=909418].

The "bound device" update does not seem to handle the case where the device is already gone by the time the mount unit is updated.

In particular, Dell (and other) servers come with a virtual USB device in their remote access controllers. The official utilities frequently mount these under /tmp/SECUPD, usually from /dev/sdb?. However, the device sometimes seems to disappear while it is still mounted.

As of suse-ru-20170013-1, which imports commit ebc8968bc0b6fc460099041f5ae1262ca17eeb6e, this results in an unmanageable machine, because systemd ABRTs in PID 1.

Relevant journal entries:

    systemd[11505]: Assertion 'dev' failed at src/core/device.c:301, function device_is_bound_by_mounts(). Aborting.
    FAT-fs (sdc): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
    systemd[1]: Assertion 'dev' failed at src/core/device.c:301, function device_is_bound_by_mounts(). Aborting.
    systemd[1]: Caught <ABRT>, dumped core as pid 30128.
    systemd[1]: Freezing execution.

At this point /tmp/SECUPD still appears mounted and can be unmounted.

Problematic code path:

It looks like it is valid for dev to be NULL in device_setup_unit(Manager *m, struct udev_device *dev, const char *path, bool main). At least there are various guards with if (dev) in the source. This is not checked before the (new) call to device_is_bound_by_mounts(), which itself does assert(dev). As a result, PID1 aborts and the system can no longer be managed.

I'm not familiar with the new logic, but the fix may be a simple guard condition inside device_is_bound_by_mounts().

Reproducible: Always

Steps to Reproduce:
1. Have a Dell Server
2. Install SLES 12
3. Have Dell tools installed and started at boot
4. Install suse-ru-20170013-1

Actual Results:
Systemd asserts in PID1 with

    systemd[11505]: Assertion 'dev' failed at src/core/device.c:301, function device_is_bound_by_mounts(). Aborting.
    systemd[1]: Assertion 'dev' failed at src/core/device.c:301, function device_is_bound_by_mounts(). Aborting.
    systemd[1]: Caught <ABRT>, dumped core as pid 30128.
    systemd[1]: Freezing execution.

The system is unmanageable and must be hard rebooted (systemctl -ff reboot).

Expected Results:
System remains in good health. Possibly: Filesystem is unmounted.

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1018399#c1

--- Comment #1 from Peter Wullinger <wullinger@rz.uni-kiel.de> ---
Created attachment 708792
  --> http://bugzilla.opensuse.org/attachment.cgi?id=708792&action=edit
Add (dev) guard condition to device_is_bound_by_mounts()

This is a bona fide patch to handle the case when device_is_bound_by_mounts(Unit *d, struct udev_device *dev) is called with (dev == NULL), which seems to be a valid parameter combination in the calling function.

It removes the problematic assertion and instead guards the udev property query by a non-NULL check.

I'd really like this to be reviewed by someone more familiar with the new bind_mounts functionality.
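For readers without the attachment at hand, the guard described in the comment above might look roughly like the following. This is only an illustrative sketch, not the attached patch and not systemd's actual code: struct mock_udev_device, struct mock_unit, the single property slot and the "1" value convention are all hypothetical stand-ins, chosen so the example compiles without libudev.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct udev_device: one property slot. */
struct mock_udev_device {
        const char *bound_property; /* property value, or NULL if unset */
};

/* Hypothetical stand-in for the device unit. */
struct mock_unit {
        const char *id;
};

/* Returns true only if a device is present and its property is "1". */
static bool mock_device_is_bound_by_mounts(const struct mock_unit *u,
                                           const struct mock_udev_device *dev) {
        assert(u); /* the unit itself is still required */

        /* Guard instead of assert(dev): the device may already be gone. */
        if (!dev)
                return false;

        /* The property query only happens once dev is known to be non-NULL. */
        if (!dev->bound_property)
                return false;

        return strcmp(dev->bound_property, "1") == 0;
}
```

With this shape, a call with dev == NULL simply reports "not bound" instead of aborting the process. Whether "not bound" is the right answer for a vanished device is exactly the question that needs review by someone who knows the new logic.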
http://bugzilla.opensuse.org/show_bug.cgi?id=1018399

Peter Wullinger <wullinger@rz.uni-kiel.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugzilla.suse.com/show_bug.cgi?id=909418