Continuing this thread from a while back. I was not able to mess with the system that exhibited the odd umounting behavior.

I have come to the conclusion that systemd gets confused about the relationship between /dev/disk/by-path and the /dev/sdX names. I can verify that the kernel is not the source of the confusion: the links between these devices are always correct. However, systemd seems to sometimes remember the old links after a disk has been removed, so mounting a disk again usually does the wrong thing. The only way to get systemd sorted is to reboot.

I cannot find a way to get systemd to forget or update its information for a disk. Even though the disk has been unmounted, and the current link between /dev/disk/by-path and the /dev/sdX name is correct, systemd uses the old /dev/sdX when doing the mount.

To be clear, I do not use the /dev/sdX names, as those are variable. All communication with systemd uses the /dev/disk/by-path names. It is systemd that converts these to the /dev/sdX name and somehow maintains that information somewhere.

I have tried, after umounting some disks, running systemctl daemon-reload and systemctl daemon-reexec. This makes no difference.

Is there a way to get systemd to rethink this internal information? Having to reboot whenever systemd gets this confused is not feasible. I'm desperate!

On Fri, Sep 4, 2020 at 12:56 PM Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
On Fri, Sep 4, 2020 at 10:31 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
On Fri, Sep 4, 2020 at 11:10 AM Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
On Fri, Sep 4, 2020 at 7:41 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
I see in /tmp/damount.log that the by-path is pointing to a new sdx1 value. But it is some other sdx1 that is mounted.
Using systemd-mount from within a udev RUN rule creates a race condition: systemd-mount tells systemd to mount this device, but the device becomes known to systemd only after the RUN commands have been executed.
I guess. That is why I listed where the by-path was pointing when the RUN command was executing. I suspected that it was not correct yet when the RUN command was executing. But I saw that it was in fact correct.
So if the by-path entry points to the correct sdX1 when the RUN command runs, what else could systemd-mount or its minions be looking at? Some information other than the current definition of the device as it is in /dev?
Actually I think I was wrong. systemd-mount passes all parameters directly to systemd, which creates a mount unit and queues a start job for it. By default a mount unit has a dependency on its device, so the start job will wait until the device becomes known to systemd.
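The device dependency mentioned here is expressed through unit names derived from paths, so a device node such as /dev/sdi1 corresponds to a unit named dev-sdi1.device. The following is a rough Python sketch of that path escaping. It approximates what `systemd-escape --path` does and is not systemd's own code; the function names are made up for illustration, and it assumes an already normalized absolute path (no "." or ".." components).

```python
import string

# Characters that are kept as-is in an escaped unit name.
_ALLOWED = set(string.ascii_letters + string.digits + ":_.")


def systemd_escape_path(path):
    """Approximate `systemd-escape --path` for a normalized absolute path."""
    # Dropping empty components removes leading, trailing and duplicate slashes.
    parts = [p for p in path.split("/") if p]
    if not parts:
        return "-"  # the root directory is escaped as a single dash
    trimmed = "/".join(parts)
    out = []
    for i, byte in enumerate(trimmed.encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")  # path separators become dashes
        elif ch in _ALLOWED and not (i == 0 and ch == "."):
            out.append(ch)
        else:
            # Everything else (including a literal "-") becomes \xNN.
            out.append("\\x%02x" % byte)
    return "".join(out)


def device_unit(dev_path):
    """Hypothetical helper: unit name of the .device unit for a device node."""
    return systemd_escape_path(dev_path) + ".device"


def mount_unit(mount_point):
    """Hypothetical helper: unit name of the .mount unit for a mount point."""
    return systemd_escape_path(mount_point) + ".mount"
```

With this, `device_unit("/dev/sdi1")` gives `dev-sdi1.device` and `mount_unit("/array/a1")` gives `array-a1.mount`. A by-path name contains literal dashes, which are escaped as `\x2d`, so the unit name for a by-path link looks quite different from the one for the sdX node it points to.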
But it also means the actual mount happens later. Are you absolutely sure these path names are unique and persistent at every point in time?
Yes. That was my original thought. It is a SuperMicro computer with a 24 disk array.
We have used udevadm to see what happens when a disk is inserted. We recorded the by-path value for activity on each drive bay. Those were used to make the udev rules like the one in the original message. There are 24 rules, one for each drive bay. In our RUN script, we print the by-path value when we insert a disk. It is always the expected one for a specific physical bay.
Of course, the sdX that this points to changes each time, depending on which names are available.
I should add that we explicitly umount a disk before it is to be removed. One cannot have that in a udev rule... So there is nothing in our udev rules related to a remove action. I'm not sure what we could put there.
You may consider using SYSTEMD_WANTS in your rule instead, pointing to a service template; pass the device name as the parameter.
I do not know anything about this. I will explore. Thanks for the pointer.
No, I do not think it will change anything.
I do not say it is necessarily the reason for your problem, but as you also did not show any real information, there is nothing that would allow anyone to guess.
I do not know what other information to provide. I tried to be complete. If you can suggest some other information I will be happy to provide it.
Which is why I usually ask customers to generate supportconfig - because I also do not know in advance what may be relevant :)
More seriously: show the actual mount units, mounted filesystems, device links and udev database before you insert the disk. Enable systemd debug logging (see man systemd, SIGRTMIN+22 IIRC). Insert the disk. Collect the same information again to show the difference. Collect the full journalctl output (journalctl -b) that includes the debug information. Tell us exactly which disk you inserted, to give us something to look for.
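One part of that before/after comparison, the device links, could be captured with a small script. This is only a sketch in Python; the function names are made up for illustration, and it covers the symlinks only, not the mount units or the udev database.

```python
import os


def snapshot_links(directory):
    """Map each symlink name in `directory` to the real path it resolves to."""
    links = {}
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.islink(full):
            links[name] = os.path.realpath(full)
    return links


def diff_snapshots(before, after):
    """Return (added, removed, changed) between two snapshots."""
    added = {k: after[k] for k in after if k not in before}
    removed = {k: before[k] for k in before if k not in after}
    changed = {
        k: (before[k], after[k])
        for k in before
        if k in after and before[k] != after[k]
    }
    return added, removed, changed


# Intended use (run the first line before inserting the disk, the rest after):
#   before = snapshot_links("/dev/disk/by-path")
#   after = snapshot_links("/dev/disk/by-path")
#   added, removed, changed = diff_snapshots(before, after)
```

An entry in `changed` would mean that a by-path name now resolves to a different sdX node than it did before the swap, which is the case of interest here.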
As an example, this is the current state:
Filesystem  Size  Used Avail Use% Mounted on
devtmpfs     47G     0   47G   0% /dev
tmpfs        47G  184K   47G   1% /dev/shm
tmpfs        47G   19M   47G   1% /run
tmpfs        47G     0   47G   0% /sys/fs/cgroup
/dev/sdb2   217G  173G   34G  84% /
/dev/sdt1   1.8T   22G  1.8T   2% /array/d6
/dev/sdk1   1.8T  1.4T  452G  76% /array/b3
/dev/sdv1   1.8T  1.4T  417G  78% /array/d3
/dev/sdg1   1.8T 1020G  813G  56% /array/a3
/dev/sdq1   1.8T  1.1T  750G  60% /array/c3
/dev/sdd1   1.8T  1.5T  372G  80% /array/a6
/dev/sdp1   1.8T  1.1T  749G  60% /array/c4
/dev/sdx1   917G  158G  759G  18% /array/d1
/dev/sdu1   1.8T  1.2T  611G  67% /array/d4
/dev/sds1   917G  181G  735G  20% /array/c1
/dev/sdj1   1.8T  1.4T  439G  77% /array/b4
/dev/sdm1   917G  160G  757G  18% /array/b1
/dev/sdi1   917G  163G  754G  18% /array/a1
/dev/sdr1   1.9T  1.1T  747G  60% /array/c2
/dev/sdo1   1.9T   17G  1.9T   1% /array/c5
/dev/sdw1   1.9T  1.3T  601G  68% /array/d2
/dev/sdf1   1.8T  965G  867G  53% /array/a4
/dev/sdl1   1.9T  1.4T  492G  74% /array/b2
/dev/sdh1   1.9T  985G  879G  53% /array/a2
/dev/sde1   917G  322G  595G  36% /array/a5
/dev/sdn1   3.7T  181M  3.7T   1% /array/c6
/dev/sda1   3.6T  1.7T  1.8T  49% /rinex
/dev/sdy1   1.8T  1.5T  364G  81% /array/b5
tmpfs       9.3G     0  9.3G   0% /run/user/1000
/dev/sdc1   932G  157G  776G  17% /cal
If we look at /array/a1, mount reports:
mount | grep a1
/dev/sdi1 on /array/a1 type ext4 (rw,relatime,stripe=8191,data=ordered)
If I look at the systemd files created in /run/systemd/transient, they look like this (for /array/a1):
# This is a transient unit file, created programmatically via the systemd API. Do not edit.
[Unit]
After=dev-sdi1.device

[Unit]
BindsTo=dev-sdi1.device

[Mount]
What=/dev/sdi1

[Unit]
Requires=systemd-fsck@dev-sdi1.service

[Unit]
After=systemd-fsck@dev-sdi1.service
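Since the transient unit records the resolved sdX name in What=, one way to spot a stale unit would be to compare that value with what the bay's by-path link resolves to right now. Below is a minimal Python sketch of that check. The function names are made up for illustration, and the unit file name and by-path name in the usage comment are placeholders, not values taken from this system.

```python
import os


def unit_what(unit_text):
    """Return the What= value from mount unit text, or None if absent."""
    for line in unit_text.splitlines():
        line = line.strip()
        if line.startswith("What="):
            return line[len("What="):]
    return None


def is_stale(unit_text, by_path_link):
    """True if the unit's What= differs from the link's current target."""
    what = unit_what(unit_text)
    current = os.path.realpath(by_path_link)
    return what is not None and what != current


# Intended use (both paths are placeholders to be adapted):
#   with open("/run/systemd/transient/<unit>.mount") as f:
#       text = f.read()
#   print(is_stale(text, "/dev/disk/by-path/<name-for-this-bay>"))
```

If this reports True for a bay right after a swap, the unit still refers to the old sdX name even though the link in /dev is already correct.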
I can verify that there is no entry in by-path when a drive bay is empty. It is of course still there after it is umounted. But when it is physically removed, the by-path entry goes away.
Next time I can swap some disks (the system is a production data processing system, so I cannot just play with it), I can look a bit closer at the files created in /run/systemd/transient.
Anything else to check?
-- Roger Oberholtzer