[opensuse] systemd automount
Is there any way of finding out what prevents a (supposedly idle) automount from expiring? I have these 2-3 automounts:

2018-04-29T11:33:31+02:00 newton systemd[1]: srv-mysql.automount: Got automount request for /srv/mysql, triggered by 2347 (cmahostd)
2018-04-29T11:33:31+02:00 newton systemd[1]: srv-www.automount: Got automount request for /srv/www, triggered by 2347 (cmahostd)
2018-04-29T11:33:31+02:00 newton systemd[1]: proc-sys-fs-binfmt_misc.automount: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 2347 (cmahostd)

I was expecting /srv/mysql and /srv/www to expire at some point, and I don't see either in lsof output. I don't really mind, but because those mount points are not yet in use, it made me curious.

-- 
Per Jessen, Zürich (20.6°C)
http://www.hostsuisse.com/ - virtual servers, made in Switzerland.

-- 
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
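The log lines above already name the process behind each request (here PID 2347, cmahostd). As a minimal sketch, assuming the messages are available as plain text with one entry per line (for example in /var/log/messages), the following shell function tallies automount triggers per mount point and process. The function name automount_triggers is made up for this example.

```shell
#!/bin/sh
# Count automount triggers per mount point and process.
# Input: systemd log lines (file arguments or stdin), one entry per line.
# Output: "<count> <path> <pid> <process>".
automount_triggers() {
    sed -n 's/.*Got automount request for \(.*\), triggered by \([0-9][0-9]*\) (\(.*\)).*/\1 \2 \3/p' "$@" |
        LC_ALL=C sort | uniq -c | awk '{ $1 = $1; print }'
}

# Usage (hypothetical log location):
#   automount_triggers /var/log/messages
```

A process that shows up again shortly after every manual unmount is a likely reason the idle timeout never fires.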
On Sunday, 29 April 2018, 19:51:20 CEST, Per Jessen wrote:
Is there any way of finding out what prevents a (supposedly idle) automount from expiring? [...]
What timeout did you specify? And where and how? Because it looks like "The timeout is disabled by default."

Regards,
Jan
-- 
Negative expectations yield negative results, Positive expectations yield negative results.
Jan Ritzerfeld wrote:
What timeout did you specify? And where and how? Because it looks like "The timeout is disabled by default."
Hi Jan

this is my fstab:

# cat /etc/fstab
10.42.8.254:/srv/nfs/newton/root / nfs hard,intr 0 0
UUID=1cc82c3d-1b32-4632-bdd4-8b56d30fff1e /srv/www jfs defaults,x-systemd.automount,x-systemd.idle-timeout=300,_netdev 0 2
UUID=cd525bc6-82b2-4f8e-af76-d321d1757e6c /srv/mysql jfs defaults,x-systemd.automount,x-systemd.idle-timeout=300,_netdev 0 2

After adding these two auto-mounts, the system also did not reboot properly. It got into a hang somewhere.

-- 
Per Jessen, Zürich (19.4°C)
http://www.dns24.ch/ - your free DNS host, made in Switzerland.
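To double-check which automounts in fstab actually carry an idle timeout, a small awk filter is enough. This is only a sketch: automount_timeouts is a hypothetical helper, it reads /etc/fstab unless another file is given, and it reports "disabled" when x-systemd.idle-timeout is absent, in line with the documented default quoted above.

```shell
#!/bin/sh
# Print "<mount point> <idle timeout>" for every fstab entry that uses
# x-systemd.automount; print "disabled" when no idle timeout is set.
# $1 = fstab file to read (default: /etc/fstab)
automount_timeouts() {
    awk '
        /^[ \t]*#/ || NF < 4 { next }            # skip comments and short lines
        $4 ~ /(^|,)x-systemd\.automount(,|$)/ {
            timeout = "disabled"
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^x-systemd\.idle-timeout=/) {
                    timeout = opts[i]
                    sub(/^[^=]*=/, "", timeout)  # keep only the value
                }
            print $2, timeout
        }
    ' "${1:-/etc/fstab}"
}
```

For the fstab shown above this would list /srv/www and /srv/mysql with a timeout of 300, so the option itself is set as intended.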
On 2018-04-29 20:55, Per Jessen wrote:
Hi Jan
this is my fstab:
# cat /etc/fstab
10.42.8.254:/srv/nfs/newton/root / nfs hard,intr 0 0
This is a fixed nfs mount. But "_netdev" is missing.
UUID=1cc82c3d-1b32-4632-bdd4-8b56d30fff1e /srv/www jfs defaults,x-systemd.automount,x-systemd.idle-timeout=300,_netdev 0 2
UUID=cd525bc6-82b2-4f8e-af76-d321d1757e6c /srv/mysql jfs defaults,x-systemd.automount,x-systemd.idle-timeout=300,_netdev 0 2
After adding these two auto-mounts, the system also did not reboot properly. It got into a hang somewhere.
Maybe network got killed before umounting. It happens with systemd.

I don't understand those two lines. Are they NFS mounts, using UUID? I have never seen that. Seems a local automount, but then there is "_netdev".

I use:

telcontar.valinor:/data/storage_c/repositorios_zypp/ /var/cache/zypp/nfs_packages nfs defaults,noauto,nofail,x-systemd.automount,x-systemd.idle-timeout=300,_netdev,nfsvers=4 0 0

Why I have "noauto,nofail" on an automount I don't remember, but it works. It means: don't mount on boot, and don't panic if it does not mount on boot. What happens if it doesn't mount when needed? I'm not sure.

-- 
Cheers / Saludos,
Carlos E. R.
(from 42.3 x86_64 "Malachite" at Telcontar)
30.04.2018 00:47, Carlos E. R. wrote:
On 2018-04-29 20:55, Per Jessen wrote:
# cat /etc/fstab
10.42.8.254:/srv/nfs/newton/root / nfs hard,intr 0 0
This is a fixed nfs mount. But "_netdev" is missing.
_netdev is implicit for NFS mounts
On 2018-04-30 05:55, Andrei Borzenkov wrote:
30.04.2018 00:47, Carlos E. R. wrote:
# cat /etc/fstab
10.42.8.254:/srv/nfs/newton/root / nfs hard,intr 0 0
This is a fixed nfs mount. But "_netdev" is missing.
_netdev is implicit for NFS mounts
That's good to know, thanks :-) Was it always so? Because I have seen _netdev recommended more than once for ntfs and samba shares. -- Cheers / Saludos, Carlos E. R. (from 42.3 x86_64 "Malachite" at Telcontar)
Carlos E. R. wrote:
On 2018-04-29 20:55, Per Jessen wrote:
Hi Jan
this is my fstab:
# cat /etc/fstab
10.42.8.254:/srv/nfs/newton/root / nfs hard,intr 0 0
This is a fixed nfs mount. But "_netdev" is missing.
I don't think it's needed, the file system is root, mounted by the initrd on start-up.
UUID=1cc82c3d-1b32-4632-bdd4-8b56d30fff1e /srv/www jfs defaults,x-systemd.automount,x-systemd.idle-timeout=300,_netdev 0 2
UUID=cd525bc6-82b2-4f8e-af76-d321d1757e6c /srv/mysql jfs defaults,x-systemd.automount,x-systemd.idle-timeout=300,_netdev 0 2
After adding these two auto-mounts, the system also did not reboot properly. It got into a hang somewhere.
Maybe network got killed before umounting. It happens with systemd.
As long as the interface is defined as "nfsroot", it works very well. I think I may have written a bugreport on something about wicked and nfsroot, but a long time ago.
I don't understand those two lines. Are they NFS mounts, using UUID? I have never seen that. Seems a local automount, but then there is "_netdev".
They are not nfs, they are multipathed iSCSI mounts. They do essentially appear as local scsi drives:

# lsscsi
[0:3:0:0] storage HP P400 7.22 -
[1:0:0:0] cd/dvd HL-DT-ST RW/DVD GCC-C10N 2.00 /dev/sr0
[3:0:0:0] disk IET VIRTUAL-DISK 0 /dev/sda
[3:0:0:1] disk IET VIRTUAL-DISK 0 /dev/sdb

-- 
Per Jessen, Zürich (11.4°C)
http://www.dns24.ch/ - your free DNS host, made in Switzerland.
Carlos E. R. wrote:
I use:
telcontar.valinor:/data/storage_c/repositorios_zypp/ /var/cache/zypp/nfs_packages nfs defaults,noauto,nofail,x-systemd.automount,x-systemd.idle-timeout=300,_netdev,nfsvers=4 0 0
Why I have "noauto,nofail" on an automount I don't remember, but it works. Means don't mount on boot, don't panic if it does not mount on boot.
IIRC, "noauto" is for excluding the mount from "mount -a" - which I don't think systemd does on boot anyway.

-- 
Per Jessen, Zürich (11.7°C)
http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
Per Jessen wrote:
IIRC, "noauto" is for excluding the mount from "mount -a" - which I don't think systemd does on boot anyway.
I looked it up. systemd will nonetheless consider the noauto flag and omit those mounts from the start-up process.

-- 
Per Jessen, Zürich (13.7°C)
http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On Sunday, 29 April 2018, 20:55:07 CEST, Per Jessen wrote:
[...] Hi Jan
Hi Per, too bad, I'm really not familiar with systemd "diskless" systems. And you obviously specified a timeout for /srv/mysql and /srv/www. However, if the cmahostd Agent periodically accesses those directories, they will never get unmounted. And according to https://support.hpe.com/hpsc/doc/public/display?docId=mmr_kc-0131418 it actually does.
[...] After adding these two auto-mounts, the system also did not reboot properly. It got into a hang somewhere.
In case fixing the timeout problem won't help: If you don't use automount for them, does the system still hang? Is there a mysqld running on /srv/mysql? There also might be a problem with iSCSI multipath. Or a race condition when shutting down the network iscsid uses...

Regards,
Jan
-- 
The future isn't what it used to be.
Jan Ritzerfeld wrote:
Hi Per,
too bad, I'm really not familiar with systemd "diskless" systems. And you obviously specified a timeout for /srv/mysql and /srv/www. However, if the cmahostd Agent periodically accesses those directories, they will never get unmounted. And according to https://support.hpe.com/hpsc/doc/public/display?docId=mmr_kc-0131418 it actually does.
Yeah, I just found out too. I unmounted both, only to see them being remounted by cmahostd a little while later :-) So, that's one question answered. They are actually meant to be mounted permanently, I just thought systemd automount would be a neat thing to do.
[...] After adding these two auto-mounts, the system also did not reboot properly. It got into a hang somewhere.
In case fixing the timeout problem won't help: If you don't use automount for them, does the system still hang? Is there a mysqld running on /srv/mysql? There also might be a problem with iSCSI multipath. Or a race condition when shutting down the network iscsid uses...
Yes, this issue is more worrying. The hang is clearly related to those drives being mounted - I have just now rebooted that system, making sure they were unmounted first. No problems. There is a mysqld running on /srv/mysql and there will be an apache running on /srv/www soon. I think I'll try without automount.

-- 
Per Jessen, Zürich (13.0°C)
http://www.dns24.ch/ - your free DNS host, made in Switzerland.
Jan Ritzerfeld wrote:
On Sunday, 29 April 2018, 20:55:07 CEST, Per Jessen wrote:
[...] After adding these two auto-mounts, the system also did not reboot properly. It got into a hang somewhere.
In case fixing the timeout problem won't help: If you don't use automount for them, does the system still hang? Is there a mysqld running on /srv/mysql? There also might be a problem with iSCSI multipath. Or a race condition when shutting down the network iscsid uses...
It looks like there's a problem with multipath - the network interface is defined as 'nfsroot', that works fine. Before I go and hook up a serial console, I was hoping to find the shutdown process in the log (/var/log/messages). All I get is "syslog-ng[1083]: syslog-ng shutting down; version='3.14.1'" - any chance syslog-ng is being shut down too soon?

-- 
Per Jessen, Zürich (13.5°C)
http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
On Monday, 30 April 2018, 11:53:06 CEST, Per Jessen wrote:
[...] It looks like there's a problem with multipath - the network interface is defined as 'nfsroot', that works fine. Before I go and hook up a serial console, I was hoping to find the shutdown process in the log (/var/log/messages). All I get is "syslog-ng[1083]: syslog-ng shutting down; version='3.14.1'" - any chance syslog-ng is being shut down too soon?
Hmm, I don't use syslog. However, to debug shutdown problems and get the shutdown log:
https://freedesktop.org/wiki/Software/systemd/Debugging/#index2h1

Since the shutdown never finishes, use the debug shell that stays active until late shutdown. See "Early Debug Shell" beneath
https://freedesktop.org/wiki/Software/systemd/Debugging/#index1h1

Regards,
Jan
-- 
Apathy is the worlds fastest growing disease. But who cares?
Jan Ritzerfeld wrote:
Hmm, I don't use syslog. However, to debug shutdown problems and get the shutdown log: https://freedesktop.org/wiki/Software/systemd/Debugging/#index2h1 Since the shutdown never finishes, use the debug shell that stays active until late shutdown. See "Early Debug Shell" beneath https://freedesktop.org/wiki/Software/systemd/Debugging/#index1h1
It looks like iscsid is being shut down before the mounts have completed unmount.

srv-www.mount: About to execute: /usr/bin/umount /srv/www -c
srv-www.mount: Forked /usr/bin/umount as 4620
srv-www.mount: Changed mounted -> unmounting
iscsi.service: About to execute: /sbin/iscsiadm -m node --logoutall=automatic
iscsi.service: Forked /sbin/iscsiadm as 4623
iscsi.service: Changed exited -> stop
srv-www.mount: Executing: /usr/bin/umount /srv/www -c
iscsi.service: Executing: /sbin/iscsiadm -m node --logoutall=automatic
Received SIGCHLD from PID 4623 (iscsiadm).
Child 4623 (iscsiadm) died (code=exited, status=0/SUCCESS)
iscsi.service: Child 4623 belongs to iscsi.service
iscsi.service: Control process exited, code=exited status=0
iscsi.service: Running next control command for state stop.
iscsi.service: About to execute: /sbin/iscsiadm -m node --logoutall=manual
iscsi.service: Forked /sbin/iscsiadm as 4672
iscsi.service: Executing: /sbin/iscsiadm -m node --logoutall=manual

At this point, srv-www has not yet completed unmount.

iscsi.service: Job iscsi.service/stop finished, result=done
srv-www.mount: Unmounting timed out. Stopping.
srv-www.mount: Changed unmounting -> unmounting-sigterm
srv-www.mount: Unmounting timed out. Killing.
srv-www.mount: Killing process 4620 (umount) with signal SIGKILL.
srv-www.mount: Changed unmounting-sigterm -> unmounting-sigkill

-- 
Per Jessen, Zürich (8.9°C)
http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
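In a longer debug log the same ordering can be hard to spot by eye. The sketch below lists every mount unit that had entered "unmounting" but had not finished when the iscsi.service stop job completed. It assumes one log entry per line, message texts like the ones in the excerpt above, and that a completed unmount is logged as a state change ending in "-> dead"; that last string is an assumption, as it does not appear in the excerpt. The function name is hypothetical.

```shell
#!/bin/sh
# List mount units that were still unmounting when the stop job of
# iscsi.service finished. Reads a systemd debug log (files or stdin).
mounts_pending_at_iscsi_stop() {
    awk '
        {
            # Find the "<name>.mount:" token, with or without a syslog prefix.
            unit = ""
            for (i = 1; i <= NF; i++)
                if ($i ~ /\.mount:$/) { unit = $i; sub(/:$/, "", unit); break }
        }
        unit != "" && /Changed mounted -> unmounting/      { pending[unit] = 1 }
        unit != "" && /Changed unmounting[a-z-]* -> dead/  { delete pending[unit] }
        /iscsi\.service: Job iscsi\.service\/stop finished/ {
            for (u in pending) print u
        }
    ' "$@" | LC_ALL=C sort
}
```

Fed with the log above it would print srv-www.mount, which is the race described in this mail.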
Per Jessen wrote:
It looks like iscsid is being shut down before the mounts have completed unmount.
I've now added the following to both iscsi mounts -

[Unit]
After=iscsi.service iscsid.service

(as a drop-in override). With that, the shutdown works fine. It seems to me this ought to work automagically, but I'm not sure of the mechanism involved.

-- 
Per Jessen, Zürich (9.0°C)
http://www.dns24.ch/ - your free DNS host, made in Switzerland.
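For reference, a drop-in like this could be created as sketched below. The mail does not say where the file lives or what it is called, so the file name iscsi-order.conf and the helper write_iscsi_dropin are assumptions; /etc/systemd/system/<unit>.d/ is the usual place for local drop-ins. The second argument allows writing into a scratch directory first to inspect the result.

```shell
#!/bin/sh
# Write an ordering drop-in for one mount unit.
# $1 = unit name, e.g. srv-www.mount
# $2 = unit directory (default: /etc/systemd/system)
write_iscsi_dropin() {
    unit=$1
    root=${2:-/etc/systemd/system}
    mkdir -p "$root/$unit.d" || return 1
    # File name is an assumption; any name ending in .conf is read.
    cat > "$root/$unit.d/iscsi-order.conf" <<'EOF'
[Unit]
After=iscsi.service iscsid.service
EOF
}

# Usage (as root), followed by a reload so systemd picks up the change:
#   write_iscsi_dropin srv-www.mount && write_iscsi_dropin srv-mysql.mount
#   systemctl daemon-reload
```

Because systemd stops units in the reverse of their start order, After=iscsi.service on the mount means the mount is unmounted before iscsi.service is stopped.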
On Tuesday, 1 May 2018, 11:04:22 CEST, Per Jessen wrote:
[...] It looks like iscsid is being shut down before the mounts have completed unmount.
But mysqld is already stopped?
[...] At this point, srv-www has not yet completed unmount.
That's interesting. You've already added the _netdev mount option. Does "systemctl show -p Before -p After" on your systemd-fstab-generated mount unit show Before=remote-fs.target? And does "systemctl status remote-fs.target remote-fs-pre.target" show those targets as active?

The iscsi.service has Before=remote-fs.target, too. I don't understand these dependencies because how could this work? Since your iSCSI mounts are automount, booting will be fine. But your iscsi.service could be stopped and your iSCSI mounts unmounted in parallel because they both are Before=remote-fs.target.

This somehow reminds me of https://lists.opensuse.org/opensuse/2014-03/msg00447.html
Maybe Andrei can help again? :)

Regards,
Jan
-- 
It's fascinating how memory diffuses fact.
Jan Ritzerfeld wrote:
On Tuesday, 1 May 2018, 11:04:22 CEST, Per Jessen wrote:
[...] It looks like iscsid is being shut down before the mounts have completed unmount.
But mysqld is already stopped?
Most of the time, but not always.
[...] At this point, srv-www has not yet completed unmount.
That's interesting. You've already added the _netdev mount option. Does "systemctl show -p Before -p After" on your systemd-fstab-generated mount unit show Before=remote-fs.target?
(this is with my extra drop-ins)

# systemctl show -p Before -p After srv-www.mount
Before=remote-fs.target apache2.service umount.target
After=dev-mapper-14945540000000000776562686f7374322e30000000000000.device systemd-fsck@dev-disk-by\x2duuid-1cc82c3d\x2d1b32\x2d4632\x2dbdd4\x2d8b56d30fff1e.service iscsi.service iscsid.service -.mount dev-disk-by\x2duuid-1cc82c3d\x2d1b32\x2d4632\x2dbdd4\x2d8b56d30fff1e.device network-online.target remote-fs-pre.target system.slice network.target

# systemctl show -p Before -p After srv-mysql.mount
Before=remote-fs.target mariadb.service umount.target
After=-.mount iscsid.service system.slice iscsi.service dev-mapper-1494554000000000064617461686f7374322e300000000000.device dev-disk-by\x2duuid-cd525bc6\x2d82b2\x2d4f8e\x2daf76\x2dd321d1757e6c.device network.target remote-fs-pre.target systemd-fsck@dev-disk-by\x2duuid-cd525bc6\x2d82b2\x2d4f8e\x2daf76\x2dd321d1757e6c.service network-online.target
And does "systemctl status remote-fs.target remote-fs-pre.target" show those targets as active?
Yep, both are good.
The iscsi.service has Before=remote-fs.target, too. I don't understand these dependencies because how could this work? Since your iSCSI mounts are automount, booting will be fine.
I have reverted to regular fixed mounts, to remove that variable.
But your iscsi.service could be stopped and your iSCSI mounts unmounted in parallel because they both are Before=remote-fs.target.
That would be a problem.
This somehow reminds me of https://lists.opensuse.org/opensuse/2014-03/msg00447.html Maybe Andrei can help again? :)
What a memory :-) Certainly sounds the same. I think I know which system it must have been - most likely it is openSUSE 12.3. It's still the same, and I don't know of any shutdown problems. That one does not have multipath yet, and the iscsi file systems are mounted by-id, not by UUID.

-- 
Per Jessen, Zürich (12.6°C)
http://www.dns24.ch/ - your free DNS host, made in Switzerland.
On Tuesday, 1 May 2018, 17:45:11 CEST, Per Jessen wrote:
Jan Ritzerfeld wrote:
On Tuesday, 1 May 2018, 11:04:22 CEST, Per Jessen wrote:
[...] It looks like iscsid is being shut down before the mounts have completed unmount.
But mysqld is already stopped?
Most of the time, but not always.
And stopping mysqld doesn't help shutting down the system?
[...] At this point, srv-www has not yet completed unmount.
That's interesting. You've already added the _netdev mount option. Does "systemctl show -p Before -p After" on your systemd-fstab-generated mount unit show Before=remote-fs.target?
(this is with my extra drop-ins) [...]
Well, your drop-ins are effective! Now srv-www.mount and srv-mysql.mount have After=...iscsi.service iscsid.service... and thus iscsi.service is stopped after unmounting your iSCSI mounts. It would be interesting to see if the output without your drop-ins will be only missing After=...iscsi.service iscsid.service....
And does "systemctl status remote-fs.target remote-fs-pre.target" show those targets as active?
Yep, both are good.
Okay.
The iscsi.service has Before=remote-fs.target, too. I don't understand these dependencies because how could this work? Since your iSCSI mounts are automount, booting will be fine.
I have reverted to regular fixed mounts, to remove that variable.
Perfect.
But your iscsi.service could be stopped and your iSCSI mounts unmounted in parallel because they both are Before=remote-fs.target.
That would be a problem.
If we don't find an answer why iscsi.service's Before is as it is, you should file an opensuse bug report. AFAICS the iscsi.service file is not upstream.
This somehow reminds me of https://lists.opensuse.org/opensuse/2014-03/msg00447.html Maybe Andrei can help again? :)
What a memory :-) Certainly sounds the same. I think I know which system it must have been - most likely it is openSUSE 12.3. It's still the same, and I don't know of any shutdown problems.
Since both mounts have After=...remote-fs-pre.target..., overriding iscsi.service to Before=remote-fs-pre.target instead of remote-fs.target could help with this issue. iscsi.service should be still stopped before iscsid.service because of its After=...iscsi.service... However, this may break other dependencies I didn't think about yet.
That one does not have multipath yet, and the iscsi file systems are mounted by-id, not by UUID.
I hope the multipath problems I found using Google were already solved.

Regards,
Jan
-- 
I can understand that the Universe is unfair, but why isn't ever unfair in my favor?
Jan Ritzerfeld wrote:
On Tuesday, 1 May 2018, 17:45:11 CEST, Per Jessen wrote:
Jan Ritzerfeld wrote:
On Tuesday, 1 May 2018, 11:04:22 CEST, Per Jessen wrote:
[...] It looks like iscsid is being shut down before the mounts have completed unmount.
But mysqld is already stopped?
Most of the time, but not always.
And stopping mysqld doesn't help shutting down the system?
No, only if I also unmounted /srv/mysql.
[...] At this point, srv-www has not yet completed unmount.
That's interesting. You've already added the _netdev mount option. Does "systemctl show -p Before -p After" on your systemd-fstab-generated mount unit show Before=remote-fs.target?
(this is with my extra drop-ins) [...]
Well, your drop-ins are effective! Now srv-www.mount and srv-mysql.mount have After=...iscsi.service iscsid.service... and thus iscsi.service is stopped after unmounting your iSCSI mounts. It would be interesting to see if the output without your drop-ins will be only missing After=...iscsi.service iscsid.service....
I was really close to posting them earlier :-) Having edited my drop-ins, commented out After=, and systemctl daemon-reload:

# systemctl show -p Before -p After srv-mysql.mount
Before=mariadb.service remote-fs.target umount.target
After=system.slice network-online.target dev-disk-by\x2duuid-cd525bc6\x2d82b2\x2d4f8e\x2daf76\x2dd321d1757e6c.device -.mount systemd-fsck@dev-disk-by\x2duuid-cd525bc6\x2d82b2\x2d4f8e\x2daf76\x2dd321d1757e6c.service dev-mapper-1494554000000000064617461686f7374322e300000000000.device network.target remote-fs-pre.target

# systemctl show -p Before -p After srv-www.mount
Before=apache2.service remote-fs.target umount.target
After=-.mount dev-mapper-14945540000000000776562686f7374322e30000000000000.device systemd-fsck@dev-disk-by\x2duuid-1cc82c3d\x2d1b32\x2d4632\x2dbdd4\x2d8b56d30fff1e.service remote-fs-pre.target dev-disk-by\x2duuid-1cc82c3d\x2d1b32\x2d4632\x2dbdd4\x2d8b56d30fff1e.device network.target system.slice network-online.target
But your iscsi.service could be stopped and your iSCSI mounts unmounted in parallel because they both are Before=remote-fs.target.
That would be a problem.
If we don't find an answer why iscsi.service's Before is as it is, you should file an opensuse bug report. AFAICS the iscsi.service file is not upstream.
Yeah, I already have :-) http://bugzilla.opensuse.org/show_bug.cgi?id=1091517
This somehow reminds me of https://lists.opensuse.org/opensuse/2014-03/msg00447.html Maybe Andrei can help again? :)
What a memory :-) Certainly sounds the same. I think I know which system it must have been - most likely it is openSUSE 12.3. It's still the same, and I don't know of any shutdown problems.
Since both mounts have After=...remote-fs-pre.target..., overriding iscsi.service to Before=remote-fs-pre.target instead of remote-fs.target could help with this issue.
I'll try that.
iscsi.service should be still stopped before iscsid.service because of its After=...iscsi.service... However, this may break other dependencies I didn't think about yet.
That one does not have multipath yet, and the iscsi file systems are mounted by-id, not by UUID.
I hope the multipath problems I found using Google were already solved.
AFAICT, multipath isn't causing any problems here. My gut feeling says it's a race condition - this machine is not the latest and greatest, but has more cores and is faster than the other webhosts (with similar iscsi configs).

Thanks for your help Jan, an extra pair of eyes is always useful!

-- 
Per Jessen, Zürich (11.5°C)
http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On Tuesday, 1 May 2018, 20:28:43 CEST, Per Jessen wrote:
Jan Ritzerfeld wrote:
On Tuesday, 1 May 2018, 17:45:11 CEST, Per Jessen wrote:
Jan Ritzerfeld wrote:
On Tuesday, 1 May 2018, 11:04:22 CEST, Per Jessen wrote:
[...] It looks like iscsid is being shut down before the mounts have completed unmount.
But mysqld is already stopped?
Most of the time, but not always.
And stopping mysqld doesn't help shutting down the system?
No, only if I also unmounted /srv/mysql.
Good. Just to make sure it's not caused by the mysqld preventing earlier unmounting.
[...] At this point, srv-www has not yet completed unmount.
That's interesting. You've already added the _netdev mount option. Does "systemctl show -p Before -p After" on your systemd-fstab-generated mount unit show Before=remote-fs.target?
(this is with my extra drop-ins) [...]
Well, your drop-ins are effective! Now srv-www.mount and srv-mysql.mount have After=...iscsi.service iscsid.service... and thus iscsi.service is stopped after unmounting your iSCSI mounts. It would be interesting to see if the output without your drop-ins will be only missing After=...iscsi.service iscsid.service....
I was really close to posting them earlier :-)
Documenting working examples by posting them here is always a good idea, too!
Having edited my drop-ins, commented out After=, and systemctl daemon-reload: [...]
Okay. Besides some ordering within Before and After, they are only missing After=...iscsi.service iscsid.service....
But your iscsi.service could be stopped and your iSCSI mounts unmounted in parallel because they both are Before=remote-fs.target.
That would be a problem.
If we don't find an answer why iscsi.service's Before is as it is, you should file an opensuse bug report. AFAICS the iscsi.service file is not upstream.
Yeah, I already have :-)
Perfect.
http://bugzilla.opensuse.org/show_bug.cgi?id=1091517 [...]
Since both mounts have After=...remote-fs-pre.target..., overriding iscsi.service to Before=remote-fs-pre.target instead of remote-fs.target could help with this issue.
I'll try that.
Regardless of whether Andrei's Before=remote-fs-pre.target works or not, adding this outcome to your bug report would be helpful.
iscsi.service should be still stopped before iscsid.service because of its After=...iscsi.service... However, this may break other dependencies I didn't think about yet.
That one does not have multipath yet, and the iscsi file systems are mounted by-id, not by UUID.
I hope the multipath problems I found using Google were already solved.
AFAICT, multipath isn't causing any problems here.
Yes, otherwise your After=iscsi.service iscsid.service wouldn't help much.
My gut feeling says it's a race condition - this machine is not the latest and greatest, but has more cores and is faster than the other webhosts. (with similar iscsi configs).
Ah, okay. Maybe this is the reason no one else reported this problem before.
Thanks for your help Jan, an extra pair of eyes is always useful!
You're welcome! I'm happy to have learned more about systemd dependencies.

Regards,
Jan
-- 
If it can't be expressed in figures, it is not science; it's opinion.
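The comparison made in this exchange (same dependencies apart from ordering, with only iscsi.service and iscsid.service missing) can be checked mechanically. A minimal sketch, assuming each argument is one After=... or Before=... line as printed by "systemctl show"; the function name deps_only_in_first is hypothetical.

```shell
#!/bin/sh
# Print the units that appear in the first After=/Before= line but not in
# the second, ignoring the order in which systemctl lists them.
deps_only_in_first() {
    first=$(mktemp) && second=$(mktemp) || return 1
    # Strip the "After=" or "Before=" prefix, then put one unit per line.
    printf '%s\n' "${1#*=}" | tr ' ' '\n' | LC_ALL=C sort -u > "$first"
    printf '%s\n' "${2#*=}" | tr ' ' '\n' | LC_ALL=C sort -u > "$second"
    LC_ALL=C comm -23 "$first" "$second"
    rm -f "$first" "$second"
}

# Usage: capture the property once with and once without the drop-in, e.g.
#   with=$(systemctl show -p After srv-www.mount)
# then, after disabling the drop-in and running systemctl daemon-reload:
#   without=$(systemctl show -p After srv-www.mount)
#   deps_only_in_first "$with" "$without"
```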
01.05.2018 12:04, Per Jessen wrote:
It looks like iscsid is being shut down before the mounts have completed unmount.
Is it with your drop-ins? Can you generate full log from boot to shutdown using systemd.log_level=debug on kernel command line? You can additionally try something like

log_buf_len=64M printk.devkmsg=on systemd.log_target=kmsg

The first sets kernel log buffer size, the second disables rate limiting for user space and last one makes systemd put everything in kmsg. Then on shutdown you should be able to simply save output of dmesg to get complete log. You may need to play with log_buf_len to make sure it is large enough.
srv-www.mount: About to execute: /usr/bin/umount /srv/www -c
srv-www.mount: Forked /usr/bin/umount as 4620
srv-www.mount: Changed mounted -> unmounting
iscsi.service: About to execute: /sbin/iscsiadm -m node --logoutall=automatic
iscsi.service: Forked /sbin/iscsiadm as 4623
iscsi.service: Changed exited -> stop
srv-www.mount: Executing: /usr/bin/umount /srv/www -c
iscsi.service: Executing: /sbin/iscsiadm -m node --logoutall=automatic
Received SIGCHLD from PID 4623 (iscsiadm).
Child 4623 (iscsiadm) died (code=exited, status=0/SUCCESS)
iscsi.service: Child 4623 belongs to iscsi.service
iscsi.service: Control process exited, code=exited status=0
iscsi.service: Running next control command for state stop.
iscsi.service: About to execute: /sbin/iscsiadm -m node --logoutall=manual
iscsi.service: Forked /sbin/iscsiadm as 4672
iscsi.service: Executing: /sbin/iscsiadm -m node --logoutall=manual
At this point, srv-www has not yet completed unmount.
iscsi.service: Job iscsi.service/stop finished, result=done
srv-www.mount: Unmounting timed out. Stopping.
srv-www.mount: Changed unmounting -> unmounting-sigterm
srv-www.mount: Unmounting timed out. Killing.
srv-www.mount: Killing process 4620 (umount) with signal SIGKILL.
srv-www.mount: Changed unmounting-sigterm -> unmounting-sigkill
Well, your other mail says you have an explicit After=iscsi.service, so the only reason for this I can think of is a dependency loop.
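Before reproducing the shutdown problem with the extra logging suggested above, it may be worth confirming that the parameters actually reached the kernel. The following is only a minimal sketch: the function name check_debug_params and its optional file argument are hypothetical helpers introduced here, not something from the thread.

```shell
#!/bin/sh
# Report which of the debug parameters suggested in the thread are present
# on the kernel command line. The file defaults to /proc/cmdline; the
# optional argument (an assumption for illustration) allows trying the
# check against a sample file.
check_debug_params() {
    cmdline_file="${1:-/proc/cmdline}"
    missing=0
    for param in systemd.log_level=debug log_buf_len= printk.devkmsg=on systemd.log_target=kmsg; do
        # Split the command line into one word per line and look for a word
        # that starts with the parameter (log_buf_len= matches any size).
        found=$(tr -s '[:space:]' '\n' < "$cmdline_file" | while read -r word; do
            case "$word" in
                ("$param"*) echo yes; break ;;
            esac
        done)
        if [ -n "$found" ]; then
            echo "present: ${param}"
        else
            echo "missing: ${param}"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_debug_params without arguments after boot prints one line per parameter and returns non-zero if any is missing. If all are present, saving the output of dmesg to a file late in the shutdown should then contain the systemd debug messages, as described above.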
Andrei Borzenkov wrote:
01.05.2018 12:04, Per Jessen wrote:
Jan Ritzerfeld wrote:
On Monday, 30 April 2018, 11:53:06 CEST, Per Jessen wrote:
[...] It looks like there's a problem with multipath - the network interface is defined as 'nfsroot', that works fine. Before I go and hook up a serial console, I was hoping to find the shutdown process in the log (/var/log/messages). All I get is "syslog-ng[1083]: syslog-ng shutting down; version='3.14.1'" - any chance syslog-ng is being shut down too soon?
Hmm, I don't use syslog. However, to debug shutdown problems and get the shutdown log: https://freedesktop.org/wiki/Software/systemd/Debugging/#index2h1 Since the shutdown never finishes, use the debug shell that stays active until late shutdown. See "Early Debug Shell" beneath https://freedesktop.org/wiki/Software/systemd/Debugging/#index1h1
It looks like iscsid is being shut down before the mounts have completed unmount.
Is it with your drop-ins?
Hi Andrei, No, this log was without the drop-ins.
Can you generate a full log from boot to shutdown using systemd.log_level=debug on the kernel command line? You can additionally try something like
log_buf_len=64M printk.devkmsg=on systemd.log_target=kmsg
Okay, I'll try that.
The first sets the kernel log buffer size, the second disables rate limiting for user space, and the last one makes systemd put everything in kmsg. Then on shutdown you should be able to simply save the output of dmesg to get the complete log.
Will a serial console capture be good enough? [snip]
At this point, srv-www has not yet completed unmount.
iscsi.service: Job iscsi.service/stop finished, result=done
srv-www.mount: Unmounting timed out. Stopping.
srv-www.mount: Changed unmounting -> unmounting-sigterm
srv-www.mount: Unmounting timed out. Killing.
srv-www.mount: Killing process 4620 (umount) with signal SIGKILL.
srv-www.mount: Changed unmounting-sigterm -> unmounting-sigkill
Well, your other mail says you have an explicit After=iscsi.service, so the only reason for this I can think of is a dependency loop.
The above was without the drop-in (with explicit After=iscsi.service) - once I added the drop-ins, things are working fine. -- Per Jessen, Zürich (10.8°C)
01.05.2018 21:55, Per Jessen wrote:
Well, your other mail says you have an explicit After=iscsi.service, so the only reason for this I can think of is a dependency loop.
The above was without the drop-in (with explicit After=iscsi.service) - once I added the drop-ins, things are working fine.
Then there is no point in a log. Without explicit ordering between two units they are stopped concurrently; there is nothing new here.
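Because units without an ordering dependency are stopped concurrently, an After=iscsi.service on the mount units makes systemd stop them in the reverse of start-up order, that is, unmount before the iSCSI logout. The thread does not show the actual drop-in files, so the following is only a sketch under assumptions: the helper name write_mount_dropins, the file name 10-after-iscsi.conf, and the staging-directory argument are all hypothetical.

```shell
#!/bin/sh
# Write a drop-in ordering each mount unit after iscsi.service.
# Assumptions: the file name and the staging directory argument are
# illustrative only; the unit names are derived from the mount points
# /srv/www and /srv/mysql mentioned in the thread.
write_mount_dropins() {
    destdir="${1:?usage: write_mount_dropins DESTDIR}"
    for unit in srv-www.mount srv-mysql.mount; do
        dir="${destdir}/etc/systemd/system/${unit}.d"
        mkdir -p "$dir"
        cat > "${dir}/10-after-iscsi.conf" <<'EOF'
[Unit]
After=iscsi.service
EOF
    done
}
```

Writing into a staging directory first (for example, write_mount_dropins /tmp/stage) allows the files to be reviewed before they are copied under /etc/systemd/system and picked up with systemctl daemon-reload.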
Andrei Borzenkov wrote:
01.05.2018 21:55, Per Jessen wrote:
Well, your other mail says you have an explicit After=iscsi.service, so the only reason for this I can think of is a dependency loop.
The above was without the drop-in (with explicit After=iscsi.service) - once I added the drop-ins, things are working fine.
Then there is no point in a log. Without explicit ordering between two units they are stopped concurrently; there is nothing new here.
Okay - is there something missing, or is it expected that iSCSI mounts will need those drop-ins? It just seems like it ought to be automagic? -- Per Jessen, Zürich (9.8°C)
02.05.2018 08:40, Per Jessen wrote:
Andrei Borzenkov wrote:
01.05.2018 21:55, Per Jessen wrote:
Well, your other mail says you have an explicit After=iscsi.service, so the only reason for this I can think of is a dependency loop.
The above was without the drop-in (with explicit After=iscsi.service) - once I added the drop-ins, things are working fine.
Then there is no point in a log. Without explicit ordering between two units they are stopped concurrently; there is nothing new here.
Okay - is there something missing, or is it expected that iSCSI mounts will need those drop-ins? It just seems like it ought to be automagic?
There is currently no way to auto-discover all those dependencies. This comes up every now and then on the systemd list and I believe there is also an issue filed for it, but nothing so far. The big hammer is to order iSCSI before remote-fs-pre.target and ensure the filesystems have _netdev.
Andrei Borzenkov wrote:
02.05.2018 08:40, Per Jessen wrote:
Andrei Borzenkov wrote:
01.05.2018 21:55, Per Jessen wrote:
Well, your other mail says you have an explicit After=iscsi.service, so the only reason for this I can think of is a dependency loop.
The above was without the drop-in (with explicit After=iscsi.service) - once I added the drop-ins, things are working fine.
Then there is no point in a log. Without explicit ordering between two units they are stopped concurrently; there is nothing new here.
Okay - is there something missing, or is it expected that iSCSI mounts will need those drop-ins? It just seems like it ought to be automagic?
There is currently no way to auto-discover all those dependencies. This comes up every now and then on the systemd list and I believe there is also an issue filed for it, but nothing so far.
The big hammer is to order iSCSI before remote-fs-pre.target and ensure the filesystems have _netdev.
Thanks Andrei! I _was_ wondering about the feasibility, but doing it myself is obviously straightforward. -- Per Jessen, Zürich (10.1°C)
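The "big hammer" described above could be sketched roughly as follows. This is not taken from the thread: the helper names write_iscsi_dropin and missing_netdev, the drop-in file name, and the staging directory are hypothetical, and the Wants= line is an added assumption based on remote-fs-pre.target being a passive unit that normally has to be pulled in by the service ordered before it.

```shell
#!/bin/sh
# Order iscsi.service before remote-fs-pre.target via a drop-in.
# Assumptions: file name and staging directory are illustrative; Wants= is
# added here so the passive target becomes part of the transaction.
write_iscsi_dropin() {
    destdir="${1:?usage: write_iscsi_dropin DESTDIR}"
    dir="${destdir}/etc/systemd/system/iscsi.service.d"
    mkdir -p "$dir"
    cat > "${dir}/10-before-remote-fs.conf" <<'EOF'
[Unit]
Wants=remote-fs-pre.target
Before=remote-fs-pre.target
EOF
}

# Print every listed mount point whose fstab entry lacks the _netdev option.
missing_netdev() {
    fstab="$1"; shift
    for mp in "$@"; do
        awk -v mp="$mp" '
            $1 !~ /^#/ && $2 == mp {
                n = split($4, opts, ",")
                found = 0
                for (i = 1; i <= n; i++) if (opts[i] == "_netdev") found = 1
                if (!found) print mp
            }' "$fstab"
    done
}
```

For example, missing_netdev /etc/fstab /srv/www /srv/mysql prints nothing when both entries carry _netdev, as they do in the fstab posted earlier in the thread.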
Participants (4): Andrei Borzenkov, Carlos E. R., Jan Ritzerfeld, Per Jessen