Thanks a LOT for your detailed analysis, Andrei!

Andrei Borzenkov wrote:
24.10.2017 12:31, Peter Suetterlin wrote:
Andrei Borzenkov wrote:
23.10.2017 20:16, Peter Suetterlin wrote:
So to sum up again:
If you want any useful help, show the full log (journalctl -b).
Oct 23 07:44:09 royac6 systemd[1]: Started Timer to wait for more drives before activating degraded array..
Oct 23 07:44:09 royac6 systemd[1]: Stopped Timer to wait for more drives before activating degraded array..
This was in the initrd and is correct: the timer is started, the array is assembled, the device appears, and the timer is stopped.
Oct 23 07:44:13 royac6 systemd[1]: Started Timer to wait for more drives before activating degraded array..
And that is unexpected. There should be no reason to start the timer at this point, unless there is some race condition or a bug in mdadm. What happens next is probably a direct consequence of this.
Ah. I had also assumed there was no issue with md0, because it contains the system and is therefore already handled during the initrd phase. I assumed it was now starting that timer for the second RAID, md1, which is /home. But the timers are a systemd thing, aren't they?
Oct 23 07:44:43 royac6 systemd[1]: Starting Activate md array even though degraded...
In Leap 42.2 this service has a Conflicts= against the device node. When systemd attempts to start this service, it also attempts to "stop" the device node. While that is not possible, it causes anything depending on this device node (the mount point in the first place) to be stopped as well.
Ouch. That is nasty.
So you get
Oct 23 07:44:43 royac6 systemd[1]: Started Activate md array even though degraded.
Oct 23 07:44:43 royac6 systemd[1]: Stopped target Local File Systems.
Oct 23 07:44:43 royac6 systemd[1]: Unmounting /home...
Oct 23 07:44:43 royac6 systemd[1]: Stopped (with error) /dev/md1.
Oct 23 07:44:43 royac6 systemd[1]: Unmounted /home.
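For anyone following the timestamps, here is a minimal Python sketch that measures the gaps in the journal lines quoted above. It only uses lines copied from this thread. The year 2017 is taken from the mail dates, and the helper names (parse, first_time) are made up for illustration. The thread does not say what delay the timer is configured with, so the computed gap is an observation, not a statement about the unit file.

```python
from datetime import datetime

# Journal lines copied verbatim from the excerpts quoted in this thread.
LOG = """\
Oct 23 07:44:13 royac6 systemd[1]: Started Timer to wait for more drives before activating degraded array..
Oct 23 07:44:43 royac6 systemd[1]: Starting Activate md array even though degraded...
Oct 23 07:44:43 royac6 systemd[1]: Unmounting /home...
"""


def parse(line, year=2017):
    """Split a syslog-style line into (timestamp, message).

    The year is an assumption taken from the mail dates; syslog-style
    lines do not carry one.
    """
    stamp = line[:15]                   # e.g. "Oct 23 07:44:13"
    message = line.split(": ", 1)[1]    # text after "systemd[1]: "
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return when, message


def first_time(lines, prefix):
    """Return the timestamp of the first message starting with prefix."""
    for line in lines:
        when, message = parse(line)
        if message.startswith(prefix):
            return when
    raise ValueError(f"no log line starts with {prefix!r}")


lines = LOG.splitlines()
timer_started = first_time(lines, "Started Timer to wait")
service_starting = first_time(lines, "Starting Activate md array")
unmounting = first_time(lines, "Unmounting /home")

# Seconds between the unexpected timer start and the service start.
timer_to_service = (service_starting - timer_started).total_seconds()
# Seconds between the service start and the unmount of /home.
service_to_unmount = (unmounting - service_starting).total_seconds()

print(f"timer -> service:   {timer_to_service:.0f} s")
print(f"service -> unmount: {service_to_unmount:.0f} s")
```

The unmount of /home falls in the same second as the service start, which fits the explanation that stopping the device node pulls the mount down with it.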
As for Postfix, I suspect it has a dependency on /home.
Yes, it needs $HOME/Maildir for mail delivery. So stopping both services is 'reasonable' if you umount /home. The questions are why it's unmounted, and why the services are not started after the second mount.
Given that it is likely a one-off condition, rebooting once a year is unlikely to provide much useful information.
Well, I can of course reboot for a test, just have to time it nicely with the users. I'll probably have a try later this evening.
Besides, in 42.3 the service was changed so that it no longer conflicts; instead it uses a condition to skip starting.
Partially good news then - an update of the system is already planned.
It would be interesting to (try to) understand how timer gets enabled. Could you provide output of "udevadm info --export-db" (not as compressed file, please)?
I tried paste.opensuse.org, but it didn't like the size (7k lines...). So again here:
http://www.royac.iac.es/~pit/Stuff/udev_info.txt

I did create a bug report, too:
https://bugzilla.opensuse.org/show_bug.cgi?id=1064887

Thanks again,

Pit