Neil Brown changed bug 1033098
What    Removed    Added
CC                 zlliu@suse.com
Flags              needinfo?(zlliu@suse.com)

Comment # 8 on bug 1033098 from
(sorry for the 6 month delay)

The "-15" error is -EBUSY.  That means md is trying to get exclusive access to
the device, but something else already has that access.

mdadm opens the device with O_EXCL to get exclusive access, but closes it just
before adding the device to the array:

    close(nfd);

    if (ioctl(fd, ADD_NEW_DISK, &info.disk) != 0) {
        pr_err("Cannot add new disk to this array\n");

As nfd was open for write access, the 'close' will cause udev to trigger a
'change' event. That could cause some process to run and open the device,
possibly with O_EXCL.  If this process races with the kernel acting on
ADD_NEW_DISK, one of them will lose the race.  The error you see is the kernel
losing the race.
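
To check whether something else is holding the device at a given moment, a
small probe along these lines may help.  This is only a sketch:
try_exclusive_open() is a made-up helper, not part of mdadm, and it relies on
the Linux behaviour that O_EXCL on a block device (without O_CREAT) requests
an exclusive open.  It opens read-only so that closing it should not itself
generate another 'change' event.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of mdadm): try to claim a block device
 * exclusively, the way a competing opener would.
 *
 * Returns 0 if the exclusive open succeeded, or a negative errno value,
 * e.g. -EBUSY when another holder already has exclusive access.
 */
int try_exclusive_open(const char *devname)
{
    /* On Linux, O_EXCL without O_CREAT on a block device asks for
     * exclusive access.  Read-only, so close() is not a write-close. */
    int fd = open(devname, O_RDONLY | O_EXCL);

    if (fd < 0)
        return -errno;
    close(fd);
    return 0;
}
```

A return of -EBUSY from this probe right after the close(nfd) would suggest
that some other process has grabbed the device in the window.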

We could possibly call

    system("udevadm control --stop-exec-queue");

before closing the fd, and

    system("udevadm control --start-exec-queue");

after ADD_NEW_DISK has completed, but that feels rather clumsy.

It is almost certainly "mdadm -I" which gets run by udev when the fd is closed.
We can prevent that using map_lock(), i.e. declare

    struct map_ent *map = NULL;

Then before close(nfd); run

    map_lock(&map);

and after the ioctl, run

    map_unlock(&map);

Can you try making that change by hand (assuming you can still reproduce the
problem) or would you like me to provide a patch?

