[Bug 861489] New: system shutdown hangs when using NFS filesystemd via automount
https://bugzilla.novell.com/show_bug.cgi?id=861489
https://bugzilla.novell.com/show_bug.cgi?id=861489#c0

Summary: system shutdown hangs when using NFS filesystemd via automount
Classification: openSUSE
Product: openSUSE 13.1
Version: Final
Platform: x86-64
OS/Version: openSUSE 13.1
Status: NEW
Severity: Major
Priority: P5 - None
Component: Basesystem
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: krienke@uni-koblenz.de
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Created an attachment (id=576661) --> (http://bugzilla.novell.com/attachment.cgi?id=576661)
system debug log

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.102 Safari/537.36

I run several 13.1 installations. All systems that use automount to mount the users' home directories, as well as other NFS directories from an NFS server, work fine in normal operation, but they always hang on shutdown. More precisely, the shutdown takes a very long time to complete (> 10 min). The very same setup (also using systemd) works just fine with 12.3. It also worked without any problem on all older SUSE versions using SysV init.

I created a systemd debug file, but I am unable to find anything useful in it that shows which process takes so long to shut down, and why. Since this problem only occurs on systemd systems using NFS and automount, it is probably related to these services.

Thanks
Rainer

Reproducible: Always

Steps to Reproduce:
1. Shut down or reboot the system (it does not matter whether you type reboot or click the corresponding menu entry in KDE).

Actual Results:
System shutdown starts and the KDE desktop is terminated; then, after a few seconds, nothing seems to happen any more.

Expected Results:
A system shutdown in <10min :-)

This problem is very annoying, since we use a multiboot (Windows/Linux) installation on our desktops at the university. Since a Linux shutdown no longer works, people tend to turn the system power off, which sometimes leads to broken systems.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=861489#c1
--- Comment #1 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
zhang jiajun
https://bugzilla.novell.com/show_bug.cgi?id=861489#c2
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c3
Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c4
--- Comment #4 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c5
--- Comment #5 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c6
--- Comment #6 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c7
--- Comment #7 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c8
--- Comment #8 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c9
--- Comment #9 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c10
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c11
--- Comment #11 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c12
--- Comment #12 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c13
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c14
--- Comment #14 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c15
--- Comment #15 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c16
--- Comment #16 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c17
--- Comment #17 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c18
Dr. Werner Fink
### BEGIN INIT INFO
# Provides: nfs
# Required-Start: $network $portmap
# Required-Stop: $network $portmap
# X-Start-Before: +autofs
# X-Start-Before: autofs

Please do not double the lines, use only *one* line

# Default-Start: 3 5
# Default-Stop: 0 1 2 6
# Short-Description: NFS client services
# Description: All necessary services for NFS clients
# X-Systemd-RemainAfterExit: true
### END INIT INFO

The `X-Start-Before' tag means: start the script which includes this tag *before* the services mentioned after the tag. IMHO the manual page of the Perl script insserv(1) should explain this in more detail. The lines

DESCRIPTION
This version of insserv is just a stub for compatibility.

do not explain anything.
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c19
--- Comment #19 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c20
--- Comment #20 from Andrey Borzenkov
The difference between the successful test you performed using an NFS entry in fstab and our setup is perhaps the higher complexity of our solution, with more services involved. Our NFS mounts are done only via the automounter; there are no NFS entries in fstab.

Yes. The problem is that if no filesystem that requires the network is present in /etc/fstab, network-online.target IS NOT STARTED at all, so no synchronization point exists during shutdown. That is the weakest link here. The reasoning is: if nothing needs the network, do not delay startup needlessly. But this contradicts the "event based" systemd paradigm - we never know when the network comes and goes.

Currently, if we unconditionally enable network-online.target, this will pull in NetworkManager-wait-online.service for those who are using NM. By default it does not wait for anything, so it could be acceptable.

Please try adding Wants=network-online.target to network@.service. This will ensure that when ifup is started, it also starts network-online.target. NFS mounts should get a dependency on it automatically. For ifup it has zero cost. You will need to restart the network, of course.

But please consider that this is not, strictly speaking, a systemd issue. Even assuming that all needed units are started - what happens when the network *does* fail? There should be a generic solution that covers the situation of the remote server being unavailable. But of course we need to fix the case of orderly shutdown.
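
For readers who want to try the suggested dependency without editing the packaged unit file, one possible way is a drop-in file. The following is only a sketch under assumptions: the staging directory, the drop-in file name and the use of a drop-in directory for a template unit are not taken from this report, and the installed systemd must support drop-ins for this to work. The script deliberately writes into a scratch directory, so nothing on the running system is changed.

```shell
#!/bin/sh
# Sketch only: stage a drop-in that adds the Wants= dependency suggested above.
set -e

# Hypothetical staging directory; nothing under /etc is touched.
STAGE_DIR="${STAGE_DIR:-$(mktemp -d)}"
DROPIN_DIR="$STAGE_DIR/etc/systemd/system/network@.service.d"
DROPIN_FILE="$DROPIN_DIR/50-network-online.conf"   # file name is an assumption

mkdir -p "$DROPIN_DIR"
cat > "$DROPIN_FILE" <<'EOF'
[Unit]
# Pull in network-online.target whenever an interface is brought up,
# so that a synchronization point also exists at shutdown.
Wants=network-online.target
EOF

echo "staged: $DROPIN_FILE"
```

After reviewing the staged file, it would have to be copied to the same path below /etc/systemd/system/, followed by `systemctl daemon-reload` and a restart of the network, as noted above.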
https://bugzilla.novell.com/show_bug.cgi?id=861489#c21
--- Comment #21 from Andrey Borzenkov
$ systemctl list-dependencies --after autofs
To my knowledge, the output shows which services are started *after* autofs has been started.

No, that is an error in the man page. It actually shows the units after which autofs is started.
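
To see both ordering directions side by side, the two queries can be run one after the other. This is a small sketch, assuming a host with systemd where the unit is named autofs.service; the `--before` and `--no-pager` options are not mentioned in this report but are regular systemctl options.

```shell
#!/bin/sh
# Sketch: compare both ordering directions for one unit.
UNIT="autofs.service"

if command -v systemctl >/dev/null 2>&1; then
    echo "== units that $UNIT is ordered after (they start before it) =="
    systemctl --no-pager list-dependencies --after "$UNIT" || echo "query failed" >&2
    echo "== units that $UNIT is ordered before (they start after it) =="
    systemctl --no-pager list-dependencies --before "$UNIT" || echo "query failed" >&2
else
    echo "systemctl not found; run this on a systemd host" >&2
fi
```

At shutdown the order is reversed, so the units listed by `--after` are the ones that are stopped only after autofs has been stopped.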
https://bugzilla.novell.com/show_bug.cgi?id=861489#c22
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c23
--- Comment #23 from Andrey Borzenkov
there is a warning, but then nothing happens where I would expect an emergency shell.

Root cause and suggested workaround: https://bugzilla.novell.com/show_bug.cgi?id=852021
Please ping the respective maintainer(s); you are closer to them :)
https://bugzilla.novell.com/show_bug.cgi?id=861489#c24
--- Comment #24 from Andrey Borzenkov
Besides this ... ``not strictly speaking systemd issue'' ... I know that I wrote a lot of shell code for SysVinit to avoid such NFS hangs at shutdown, even if the network is down. It was also possible to close any remote sessions before the network went down. Both worked. The question arises: does systemd have a way to avoid such hangs?

At least in 13.1 we still have the same script behind nfs.service. So there are probably conditions under which it does not work?
https://bugzilla.novell.com/show_bug.cgi?id=861489#c25
--- Comment #25 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c26
--- Comment #26 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c27
--- Comment #27 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=861489#c28
--- Comment #28 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c29
--- Comment #29 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c30
--- Comment #30 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c31
--- Comment #31 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c32
--- Comment #32 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c33
--- Comment #33 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c34
--- Comment #34 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c35
--- Comment #35 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c36
--- Comment #36 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c37
--- Comment #37 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c38
--- Comment #38 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c39
--- Comment #39 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c40
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Marius Tomaschewski
https://bugzilla.novell.com/show_bug.cgi?id=861489#c41
--- Comment #41 from Marius Tomaschewski
https://bugzilla.novell.com/show_bug.cgi?id=861489#c42
--- Comment #42 from Marius Tomaschewski
https://bugzilla.novell.com/show_bug.cgi?id=861489#c43
--- Comment #43 from Andrey Borzenkov
OK now I've added a patch
0001-add-network-device-after-NFS-mount-units.patch
which uses getifaddrs()/freeifaddrs() to parse the routing table for both IPv4 and IPv6 addresses found in the option string for NFS shares in /proc/mounts.
With this I add an "After=" dependency on
sys-subsystem-net-devices-<iface>.device
for
<nfs-share-mount-point>.mount
and indeed I've seen a wait status message at shutdown after a few reboots.
I'm afraid I do not see how it can have any effect at all. After/Before is relevant only between units that are part of the same start/stop transaction. But *.device units are never started (or stopped); they are passively waited for. So this patch is a pure no-op, and of course I get a reproducibly hanging system with it when the other dependencies discussed before are absent. At best this patch slightly changes the relative timing; as we are facing a race condition, that could give the impression that it changed something.
https://bugzilla.novell.com/show_bug.cgi?id=861489#c44
--- Comment #44 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c45
--- Comment #45 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c46
--- Comment #46 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c47
--- Comment #47 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c48
--- Comment #48 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c49
--- Comment #49 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c50
--- Comment #50 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c51
--- Comment #51 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c52
Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c53
--- Comment #53 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c54
--- Comment #54 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c55
--- Comment #55 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c56
--- Comment #56 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c57
--- Comment #57 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c57
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c58
--- Comment #58 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c59
--- Comment #59 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c61
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c62
--- Comment #62 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c63
Frederic Crozat
IMHO the insserv-generator may use After= and not Before=

I disagree. remote-fs-pre.target is the sync point which should be reached before network mount points are mounted. Extract from systemd.special:

"This target unit is automatically ordered before all remote mount point units (see above). It can be used to run certain units before the remote mounts are established. Note that this unit is generally not part of the initial transaction, unless the unit that wants to be ordered before all remote mounts pulls it in via a Wants= type dependency. If the unit wants to be pulled in by the first remote mount showing up, it should use network-online.target (see above)."

Therefore, if NFS or CIFS mount points are declared, the NFS/CIFS "infrastructure" (i.e. services) should be up and running before those mount points are mounted by systemd, between "remote-fs-pre.target" and "remote-fs.target".

(https://bugzilla.linux-nfs.org/show_bug.cgi?id=237 is kind of related)
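
The pattern described in the systemd.special extract can be sketched as a unit that both pulls in remote-fs-pre.target and orders itself before it. The unit name, description and command below are hypothetical placeholders, not the actual nfs.service from the distribution; the script only stages the file in a scratch directory.

```shell
#!/bin/sh
# Sketch only: stage a hypothetical unit that must run before remote mounts.
set -e

STAGE_DIR="${STAGE_DIR:-$(mktemp -d)}"            # scratch directory, not /etc
UNIT_DIR="$STAGE_DIR/etc/systemd/system"
UNIT_FILE="$UNIT_DIR/remote-helper.service"       # hypothetical unit name

mkdir -p "$UNIT_DIR"
cat > "$UNIT_FILE" <<'EOF'
[Unit]
Description=Hypothetical helper needed by remote mounts
# Pull the sync point into the transaction and order this unit before it;
# remote mount units are ordered after remote-fs-pre.target automatically.
Wants=remote-fs-pre.target
Before=remote-fs-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
# Placeholder command; a real unit would start the NFS/CIFS helper here.
ExecStart=/bin/true
EOF

echo "staged: $UNIT_FILE"
```

Because systemd stops units in the reverse of their start order, a unit ordered this way is stopped only after the remote mounts have been unmounted, which is the property that matters for the shutdown hang discussed here.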
https://bugzilla.novell.com/show_bug.cgi?id=861489#c64
--- Comment #64 from Andrey Borzenkov
https://bugzilla.novell.com/show_bug.cgi?id=861489#c65
--- Comment #65 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c66
--- Comment #66 from Frederic Crozat
https://bugzilla.novell.com/show_bug.cgi?id=861489#c67
--- Comment #67 from Andrey Borzenkov
https://bugzilla.novell.com/show_bug.cgi?id=861489#c68
--- Comment #68 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c69
--- Comment #69 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=861489#c70
--- Comment #70 from Rainer Krienke
https://bugzilla.novell.com/show_bug.cgi?id=861489#c72
--- Comment #72 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=861489#c
Ludwig Nussel
Swamp Workflow Management
http://bugzilla.novell.com/show_bug.cgi?id=861489#c74
--- Comment #74 from Rainer Krienke
you may give the latest systemd from
http://download.opensuse.org/repositories/Base:/System/openSUSE_13.1/
a try, as I've found and removed a nasty race between automounted home directories and the user service systemd-exit.service below
/usr/lib/systemd/user/systemd-exit.service
which tries to stop the systemd user manager process (started at login for session handling) by doing a chroot into the nonexistent user home before sending the signal to stop the manager process

Thanks a lot, Werner. In the meantime we are only using openSUSE 13.2 installations with NFS and automounter. In 13.2 the shutdown problem sometimes still shows up, but most of the time it simply works. Compared to 13.1, 13.2 is much more reliable in this respect and in general.
Rainer
Swamp Workflow Management
Swamp Workflow Management
Swamp Workflow Management
http://bugzilla.novell.com/show_bug.cgi?id=861489#c75
--- Comment #75 from Swamp Workflow Management