On Fri, Sep 4, 2020 at 10:31 AM Andrei Borzenkov
On Fri, Sep 4, 2020 at 11:10 AM Roger Oberholtzer wrote:
On Fri, Sep 4, 2020 at 7:41 AM Andrei Borzenkov wrote:
I see in /tmp/damount.log that the by-path is pointing to a new sdx1 value. But it is some other sdx1 that is mounted.
Using systemd-mount from within a udev RUN rule creates a race condition: systemd-mount tells systemd to mount this device, but the device becomes known to systemd only after the RUN commands have been executed.
I guess. That is why I listed where the by-path was pointing when the RUN command was executing. I suspected that it was not correct yet when the RUN command was executing. But I saw that it was in fact correct.
So if the by-path entry points to the correct sdX1 when the RUN command runs, what else could systemd-mount or its minions be looking at? Some information other than the current definition of the device as it is in /dev?
Actually I think I was wrong. systemd-mount passes all parameters directly to systemd, which creates a mount unit and queues a start job for it. By default a mount unit has a dependency on its device, so the start job will wait until the device becomes known to systemd.
But it also means the actual mount happens later. Are you absolutely sure these path names are unique and persistent at every point in time?
Yes. That was my original thought. It is a SuperMicro computer with a 24-disk array. We used udevadm to see what happens when a disk is inserted, and recorded the by-path value for activity on each drive bay. Those values were used to make udev rules like the one in the original message. There are 24 rules, one for each drive bay. In our RUN script, we print the by-path value when we insert a disk. It is always the expected one for that physical bay. Of course, the sdX that it points to changes each time, depending on which names are available.

I should add that we explicitly umount a disk before it is removed. One cannot have that in a udev rule, so there is nothing in our udev rules related to a remove action. I'm not sure what we could put there.
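A minimal sketch of the kind of logging described above. The helper name log_bay_target and its arguments are hypothetical; this is not the RUN script from this thread. It records both the by-path link and the device node the link resolves to at that moment, so the log can later be compared with what actually got mounted.

```shell
#!/bin/sh
# Hypothetical helper: record where a by-path link points right now.
# Usage: log_bay_target LINK LOGFILE
log_bay_target() {
    link=$1
    logfile=$2
    # readlink -f follows the symlink to the real device node (e.g. /dev/sdi1).
    target=$(readlink -f -- "$link") || target="unresolved"
    printf '%s %s -> %s\n' "$(date '+%F %T')" "$link" "$target" >> "$logfile"
}
```

Called from a RUN script, it would be given the by-path link for the bay and a log file such as /tmp/damount.log.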
You may consider using SYSTEMD_WANTS in your rule instead, pointing to a service template; pass the device name as a parameter.
I do not know anything about this. I will explore. Thanks for the pointer.
No, I do not think it will change anything.
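For reference, a rough sketch of what the SYSTEMD_WANTS variant mentioned above could look like. Everything named here is an assumption for illustration: the rules file name, the unit name array-mount@.service, the helper /usr/local/sbin/array-mount, and the ID_PATH placeholder, which stands for the by-path value recorded for one bay. The rule must sort after 60-persistent-storage.rules, where ID_PATH is set.

```
# /etc/udev/rules.d/99-array-bays.rules (hypothetical file name)
# One rule per bay. %k expands to the kernel name of the partition (e.g. sdi1)
# and becomes the instance name of the template service.
ACTION=="add", SUBSYSTEM=="block", ENV{DEVTYPE}=="partition", ENV{ID_PATH}=="<by-path value for bay a1>", TAG+="systemd", ENV{SYSTEMD_WANTS}+="array-mount@%k.service"
```

```
# /etc/systemd/system/array-mount@.service (hypothetical unit name)
[Unit]
Description=Mount array disk %I

[Service]
Type=oneshot
# array-mount is a placeholder for a site-specific script that decides
# the mount point for the device it is given.
ExecStart=/usr/local/sbin/array-mount /dev/%I
```

With this arrangement systemd starts the service only once the device unit is known to it, instead of udev running the mount command itself.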
I am not saying it is necessarily the reason for your problem, but as you also did not show any real information, there is nothing that would allow anyone to guess.
I do not know what other information to provide. I tried to be complete. If you can suggest some other information I will be happy to provide it.
Which is why I usually ask customers to generate supportconfig - because I also do not know in advance what may be relevant :)
More seriously: show the actual mount units, mounted filesystems, device links and udev database before you insert the disk. Enable systemd debug logging (see man systemd, SIGRTMIN+22 IIRC). Insert the disk. Collect the same information again to show the difference. Collect the full journalctl output (journalctl -b) that includes the debug information. Tell us exactly which disk you inserted, to give us something to look for.
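The collection steps above can be sketched as a small shell function. The function name snapshot_state and the output file names are made up for illustration; the commands themselves (systemctl list-units, udevadm info --export-db, journalctl -b) are standard. As an alternative to sending a signal to PID 1, systemd-analyze set-log-level debug raises the log level; on recent systemd versions the verb is spelled log-level.

```shell
#!/bin/sh
# Sketch: capture mount units, mounted filesystems, device links and the
# udev database into one directory per snapshot.
# Usage: snapshot_state OUTDIR
snapshot_state() {
    out=$1
    mkdir -p "$out" || return 1
    # Each command may fail without aborting the snapshot; errors end up
    # in the corresponding output file.
    systemctl list-units --type=mount --all --no-pager > "$out/mount-units.txt" 2>&1 || true
    cat /proc/self/mounts > "$out/mounts.txt" 2>&1 || true
    ls -l /dev/disk/by-path > "$out/by-path.txt" 2>&1 || true
    udevadm info --export-db > "$out/udev-db.txt" 2>&1 || true
}

# Typical sequence (run as root; not executed here):
#   snapshot_state /tmp/before
#   systemd-analyze set-log-level debug
#   ... insert the disk and note which bay it went into ...
#   snapshot_state /tmp/after
#   journalctl -b --no-pager > /tmp/after/journal.txt
#   diff -ru /tmp/before /tmp/after
```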
As an example, this is the current state:

Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         47G     0   47G   0% /dev
tmpfs            47G  184K   47G   1% /dev/shm
tmpfs            47G   19M   47G   1% /run
tmpfs            47G     0   47G   0% /sys/fs/cgroup
/dev/sdb2       217G  173G   34G  84% /
/dev/sdt1       1.8T   22G  1.8T   2% /array/d6
/dev/sdk1       1.8T  1.4T  452G  76% /array/b3
/dev/sdv1       1.8T  1.4T  417G  78% /array/d3
/dev/sdg1       1.8T 1020G  813G  56% /array/a3
/dev/sdq1       1.8T  1.1T  750G  60% /array/c3
/dev/sdd1       1.8T  1.5T  372G  80% /array/a6
/dev/sdp1       1.8T  1.1T  749G  60% /array/c4
/dev/sdx1       917G  158G  759G  18% /array/d1
/dev/sdu1       1.8T  1.2T  611G  67% /array/d4
/dev/sds1       917G  181G  735G  20% /array/c1
/dev/sdj1       1.8T  1.4T  439G  77% /array/b4
/dev/sdm1       917G  160G  757G  18% /array/b1
/dev/sdi1       917G  163G  754G  18% /array/a1
/dev/sdr1       1.9T  1.1T  747G  60% /array/c2
/dev/sdo1       1.9T   17G  1.9T   1% /array/c5
/dev/sdw1       1.9T  1.3T  601G  68% /array/d2
/dev/sdf1       1.8T  965G  867G  53% /array/a4
/dev/sdl1       1.9T  1.4T  492G  74% /array/b2
/dev/sdh1       1.9T  985G  879G  53% /array/a2
/dev/sde1       917G  322G  595G  36% /array/a5
/dev/sdn1       3.7T  181M  3.7T   1% /array/c6
/dev/sda1       3.6T  1.7T  1.8T  49% /rinex
/dev/sdy1       1.8T  1.5T  364G  81% /array/b5
tmpfs           9.3G     0  9.3G   0% /run/user/1000
/dev/sdc1       932G  157G  776G  17% /cal

If we look at /array/a1, mount reports:

mount | grep a1
/dev/sdi1 on /array/a1 type ext4 (rw,relatime,stripe=8191,data=ordered)

If I look at the systemd files created in /run/systemd/transient, they look like this (for /array/a1):

# This is a transient unit file, created programmatically via the systemd API. Do not edit.
[Unit]
After=dev-sdi1.device

[Unit]
BindsTo=dev-sdi1.device

[Mount]
What=/dev/sdi1

[Unit]
Requires=systemd-fsck@dev-sdi1.service

[Unit]
After=systemd-fsck@dev-sdi1.service

I can verify that there is no entry in by-path when a drive bay is empty. It is of course still there after it is umounted. But when it is physically removed, the by-path entry goes away.
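One way to check for the mismatch discussed in this thread is to compare the device a by-path link resolves to with the device actually mounted on the bay's mount point. The sketch below is only an illustration: the function names mounted_source and check_bay are made up, and it assumes mount points without spaces, since /proc/self/mounts escapes those.

```shell
#!/bin/sh
# Print the device mounted at MOUNTPOINT, read from a mounts table in
# /proc/self/mounts format (second argument, defaults to the live table).
# If several entries match, the last one (the visible mount) wins.
mounted_source() {
    awk -v mp="$1" '$2 == mp { src = $1 } END { if (src != "") print src }' \
        "${2:-/proc/self/mounts}"
}

# Compare the device a by-path link resolves to with what is mounted.
# Usage: check_bay LINK MOUNTPOINT [MOUNTS_FILE]
check_bay() {
    expected=$(readlink -f -- "$1")
    actual=$(mounted_source "$2" "$3")
    if [ "$expected" = "$actual" ]; then
        echo "OK $2: $actual"
    else
        echo "MISMATCH $2: link -> $expected, mounted: ${actual:-nothing}"
        return 1
    fi
}
```

For bay a1 this would be called with the partition's by-path link (the one ending in -part1) and /array/a1; a MISMATCH line means the mount unit was created for a different sdX1 than the link currently points to.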
Next time I can swap some disks (the system is a production data processing system so I cannot just play with it), I can look a bit closer at the file created in /run/systemd/transient.

Anything else to check?

--
Roger Oberholtzer