[Bug 883565] New: Race condition in systemd startup/shutdown affecting CIFS remote mounts
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c0

Summary: Race condition in systemd startup/shutdown affecting CIFS remote mounts
Classification: openSUSE
Product: openSUSE 13.1
Version: Final
Platform: x86-64
OS/Version: openSUSE 13.1
Status: NEW
Severity: Major
Priority: P5 - None
Component: Network
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: rodney.baker@iinet.net.au
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0

With the default service/unit-file dependencies as shipped with openSUSE 13.1, the remote-fs service is started before if-up controlled network interfaces are available, and is shut down after the network interfaces. As a result, remote mounts that should be mounted by remote-fs.service (for example, cifs mounts defined in /etc/samba/cifstab) fail to mount at startup and, if mounted manually, are not unmounted cleanly at shutdown. The shutdown process also hangs until the unmount job times out, because the network has been taken down before the remote file systems are unmounted.

To fix this locally, I had to add the line: Requires=network@eth0.service to /usr/lib/systemd/system/remote-fs.service. This ensures that remote-fs.service is not started before the network interface is up and that the network interface is not stopped until remote-fs.service has exited during shutdown. I tried using "Requires=network.service" and "Requires=network.target" - these fixed mounting at startup but did not fix the delayed shutdown.

Unfortunately my hack is too system-specific, because it requires that the correct interface name is known when remote-fs.service is created. This will not be the same on every system and, in the case of multiple network interfaces, there may be multiple dependencies. Although it works as a short-term workaround, a more permanent and more generic solution is needed.
Reproducible: Always

Steps to Reproduce:
1. Define remote SMB mounts in /etc/samba/cifstab.
2. Have the LAN interface controlled by if-up.
3. Start up the system - the remote cifs mounts will not be mounted.
4. Manually mount the cifs shares (systemctl restart remote-fs will do it).
5. Shut down the system - the system will hang during shutdown because the network interface is stopped before the cifs shares are unmounted.

Actual Results: As described above.

Expected Results: The remote-fs service should only be started after the relevant network interface(s) are started and online. The remote shares should be unmounted before the relevant network interface(s) is/are shut down.

-- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c1
Andrey Borzenkov
to /usr/lib/systemd/system/remote-fs.service.
Please show output of "rpm -qif /usr/lib/systemd/system/remote-fs.service"
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c2
--- Comment #2 from Rodney Baker
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c3
--- Comment #3 from Andrey Borzenkov
Sorry - correction - the file should be remote-fs.target
So what service mounts filesystems from /etc/samba/cifstab? remote-fs.target is a pure synchronization point and does not do anything by itself.
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c4
--- Comment #4 from Andrey Borzenkov
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c5
--- Comment #5 from Rodney Baker
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c6
--- Comment #6 from Rodney Baker
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c7
Andrey Borzenkov
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c8
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c9
--- Comment #9 from Andrey Borzenkov
(In reply to comment #7)
This requires that in wicked.service there is a line
Wants=network-online.target
No, it does not. Systemd automatically adds Wants=network-online.target to legacy sysvinit services. That's the main point of the upstream patch. This requires that network-online.target comes after the wicked services; I assumed this is already the case?
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c10
--- Comment #10 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c11
--- Comment #11 from Rodney Baker
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c12
--- Comment #12 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c13
--- Comment #13 from Rodney Baker
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c14
--- Comment #14 from Andrey Borzenkov
This will interfere with 0018-Make-LSB-Skripts-know-about-Required-and-Should.patch
Could you explain? My patch is on top of 0018-Make-LSB-Skripts-know-about-Required-and-Should.patch and complements it by adding compatible upstream behavior. It does not change what 0018-Make-LSB-Skripts-know-about-Required-and-Should.patch does.
and insserv-generator.patch
You are right, I missed it. I submitted SR#238415, which updates insserv-generator.patch to use network-online.target for $network.
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c15
--- Comment #15 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c16
--- Comment #16 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c17
Marius Tomaschewski
(In reply to comment #7)
This requires that in wicked.service there is a line
Wants=network-online.target
as otherwise we get into trouble with OS 13.2 and SLES12
Marius?
Yes. Fix is in git master and will be submitted today. [https://github.com/openSUSE/wicked/commit/e6d1f904ba]
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c18
--- Comment #18 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c20
--- Comment #20 from Andrey Borzenkov
For openSUSE 13.1 I have to use
%if 0%{?suse_version} <= 1310
#
# Older versions like oS 13.1 do not distinguish between
# network.target and network-online.target
#
for f in src/core/service.c src/insserv-generator/insserv-generator.c
do
    sed -ri '/"network",.*SPECIAL_NETWORK_ONLINE_TARGET,/{ s/SPECIAL_NETWORK_ONLINE_TARGET/SPECIAL_NETWORK_TARGET/}' $f
done
%endif
to make it work
?? That is what we have right now. And it is broken.

I think I understand the problem. In 13.1, network@<if>.service is not part of the transaction. It is started explicitly by /etc/init.d/network via "systemctl start network@<if>". This means we just replaced one race condition with another one. If the job for network@<if> is queued *before* network-online.target, it works. But if network-online.target happens to be finished first, we lose.

Note that upstream commit 58e027023b47b32e42cf93dd4a629b869ee1ef25 adds an explicit After=network.target to network-online.target. This should fix it: it ensures that network-online.target is started only after /etc/init.d/network has finished, which means all jobs for network@<if> have been submitted.

I added a backport of this patch to my repo, which is rebuilding right now. Could you test whether it works for you?
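For illustration, the ordering that this comment attributes to the upstream commit comes down to a single line on network-online.target. The fragment below is only a sketch of that effect: the drop-in path is a hypothetical choice made for this example, whereas the upstream commit changes the shipped unit file itself.

```ini
# /etc/systemd/system/network-online.target.d/after-network.conf
# (hypothetical drop-in path, used here only for illustration)
[Unit]
# Reach network-online.target only after network.target, i.e. (per the
# comment above) after /etc/init.d/network has finished and has queued
# its network@<if> jobs.
After=network.target
```

After= only orders the two targets; it does not by itself pull network.target into the transaction.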
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c21
--- Comment #21 from Andrey Borzenkov
(In reply to comment #8)
(In reply to comment #7)
This requires that in wicked.service there is a line
Wants=network-online.target
as otherwise we get into trouble with OS 13.2 and SLES12
Marius?
Yes. Fix is in git master and will be submitted today. [https://github.com/openSUSE/wicked/commit/e6d1f904ba]
Folks, that is totally wrong, please do not do it. network-online.target should be pulled in by *consumers* of network, not by *providers*. What you must do, is to order wicked.service before network-online.target, but this is automatic with commit 58e027023b47b32e42cf93dd4a629b869ee1ef25 from upstream.
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c22
--- Comment #22 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c23
--- Comment #23 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c24
--- Comment #24 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c25
Marius Tomaschewski
(In reply to comment #17)
(In reply to comment #8)
(In reply to comment #7)
This requires that in wicked.service there is a line
Wants=network-online.target
as otherwise we get into trouble with OS 13.2 and SLES12
Marius?
Yes. Fix is in git master and will be submitted today. [https://github.com/openSUSE/wicked/commit/e6d1f904ba]
Folks, that is totally wrong, please do not do it. network-online.target should be pulled in by *consumers* of network, not by *providers*. What you must do, is to order wicked.service before network-online.target, but this is automatic with commit 58e027023b47b32e42cf93dd4a629b869ee1ef25 from upstream.
So we have to remove this again. :-( Done in https://github.com/openSUSE/wicked/pull/291

wicked.service contains a Before=...network-online.target...

Andrey, could you review whether it is OK again now? https://github.com/openSUSE/wicked/blob/6a5cdf4889b29b3a1c498bf8804d6f7fd1ef...
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c26
--- Comment #26 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c27
--- Comment #27 from Marius Tomaschewski
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c28
--- Comment #28 from Marius Tomaschewski
I'd like to suggest to use WantedBy=network-online.target
I'll add it when it is correct to add it.
From manual page man:systemd.unit(5):

WantedBy=, RequiredBy=
This option may be used more than once, or a space-separated list of unit names may be given. A symbolic link is created in the .wants/ or .requires/ directory of each of the listed units when this unit is installed by systemctl enable. This has the effect that a dependency of type Wants= or Requires= is added from the listed unit to the current unit. The primary result is that the current unit will be started when the listed unit is started. See
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c29
--- Comment #29 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c30
Andrey Borzenkov
systemd-networkd-wait-online.service which has [Install] WantedBy=network-online.target
... I wonder if this should go into wicked.service as this one is what the upstream service does.
But this is backward! WantedBy != Wants, it is exactly the reverse dependency.
(In reply to comment #27)
Andrey,
do we need a WantedBy instead?:
[Install] WantedBy=network-online.target
Well ... upstream has two different units: a service that actively configures networking and a service that passively checks (or waits) for the network to be configured. The service that waits for the network is not started automatically, to avoid delaying startup for everyone. That's the reason for WantedBy: to pull in the waiting service only if someone needs to wait.

As I understand it, there is no such distinction in wicked. wicked.service is synchronous, right? When wicked.service has finished startup, networking is configured. So it is enough to have it ordered before network-online.target, as it already is. So no, I think it is not required.
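The two-unit split described above (one unit configures the network, a second one only waits for it) might look roughly like the following. This is a sketch under stated assumptions, not the unit shipped by systemd or wicked: the unit name example-wait-online.service, the provider example-network.service, and the helper /usr/local/bin/example-wait-online are all hypothetical.

```ini
# example-wait-online.service (hypothetical unit name)
[Unit]
Description=Wait for the network to be configured (illustrative sketch)
# Hypothetical provider unit that triggers network configuration
# without waiting for it to complete.
Requires=example-network.service
After=example-network.service
# Must have finished before the synchronization point is reached.
Before=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
# Hypothetical helper that blocks until the network is usable.
ExecStart=/usr/local/bin/example-wait-online

[Install]
# "systemctl enable" links this unit into network-online.target.wants/,
# so it runs only when some consumer pulls in network-online.target.
WantedBy=network-online.target
```

The WantedBy= line sits on the waiting unit, not on the unit that configures the network, which is the distinction drawn in this comment.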
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c31
--- Comment #31 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c32
--- Comment #32 from Andrey Borzenkov
(In reply to comment #30)
... yes and no ... as the future of wicked will become asynchronous
Then it should provide second service wicked-online or similar.
(AFAIK) and therefore I'd like to see this WantedBy=network-online.target to avoid trouble in future ;)
How can WantedBy on *wicked* service help in this case?!? Then you will need WantedBy on service that will wait until network is online (wicked-online.service or whatever). Having WantedBy on a service that just triggers network configuration without waiting for it is pointless. That is exactly the problem we try to fix here, in this bug report.
Don't know how dial-in on demand will fit into this scheme but I guess if the device is up the network is somehow online.
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c33
--- Comment #33 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c34
--- Comment #34 from Andrey Borzenkov
This works only because network-online.target will be wanted by service_load_sysv_path() for the LSB services. Without LSB scripts any other unit depending on network-online.target would also require a Wants=network-online.target
I lost you here. network-online.target is by design intended to be *EXPLICITLY* pulled in by units that want to be delayed on startup until the network is configured. For those cases where units are autogenerated by systemd (like network mounts), the dependency is added automatically. For units that you ship manually, you need to add it manually to the unit file if the unit really needs it. So what exactly are you arguing here? Sorry, I do not get it.
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c35
--- Comment #35 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c36
--- Comment #36 from Andrey Borzenkov
If there is a unit written by a user/customer which has network-online.target in its After= line and no LSB script is enabled then this unit may fail as it could be up before network
I'm afraid we are going in circles. As my last comment:

- short version -
This is a bug in the respective unit, and the user/customer needs to fix it.

- long version -
network-online.target should be explicitly pulled in by units that need it. Quoting http://www.freedesktop.org/software/systemd/man/systemd.special.html:

network-online.target
Units that strictly require a configured network connection should pull in network-online.target (via a Wants= type dependency) and order themselves after it.

There is a more detailed explanation at http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
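As a minimal sketch of what the quoted documentation asks of a consumer unit, the fragment below both pulls in network-online.target and orders itself after it. The unit name example-consumer.service and the command path are hypothetical, chosen only for illustration.

```ini
# example-consumer.service (hypothetical unit name)
[Unit]
Description=Service that needs a configured network (illustrative sketch)
# Pull the target into the transaction ...
Wants=network-online.target
# ... and wait for it. After= alone only orders the units; it does not
# cause network-online.target to be started.
After=network-online.target

[Service]
# Hypothetical command.
ExecStart=/usr/local/bin/example-consumer

[Install]
WantedBy=multi-user.target
```

A unit that lists network-online.target only in After=, as in the scenario quoted above, gets the ordering but relies on some other unit to pull the target in.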
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c37
--- Comment #37 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=883565
https://bugzilla.novell.com/show_bug.cgi?id=883565#c39
--- Comment #39 from Dr. Werner Fink
http://bugzilla.novell.com/show_bug.cgi?id=883565
Swamp Workflow Management
http://bugzilla.novell.com/show_bug.cgi?id=883565
http://bugzilla.novell.com/show_bug.cgi?id=883565#c40
--- Comment #40 from Swamp Workflow Management
http://bugzilla.novell.com/show_bug.cgi?id=883565
http://bugzilla.novell.com/show_bug.cgi?id=883565#c41
Franck Bui