On 2017-10-23 19:16, Peter Suetterlin wrote:
Sorry - I was busy with other things...
Carlos E. R. wrote:
On 2017-10-23 13:05, Peter Suetterlin wrote:
Yes, as I said, it's the server for home directories, mounts them from the RAID and then exports via NFS.
Yes, Per clarified this part.
If you look at the timestamps, this is 30 seconds *before* it stops/unmounts /home and claims the disk is missing. Sorry for posting them out-of-sync.
Ah!! Uff... Please, don't ever do that. Or if you do, please say the logs are not in order, clearly.

Please, can you repost a longer part of the log, all in correct order? Even better, the full (minutes) log, from boot till minutes later when you login? I can try have a go at it after reordering.

Oct 23 07:44:09 royac6 kernel: sdb: sdb1
Oct 23 07:44:09 royac6 kernel: sda: sda1
Oct 23 07:44:12 royac6 kernel: md: bind<sda1>
Oct 23 07:44:13 royac6 kernel: md: bind<sdb1>
Oct 23 07:44:13 royac6 kernel: md/raid1:md1: active with 2 out of 2 mirrors
Oct 23 07:44:13 royac6 kernel: created bitmap (8 pages) for device md1
Oct 23 07:44:13 royac6 kernel: md1: bitmap initialized from disk: read 1 pages, set 11 of 15260 bits
Oct 23 07:44:13 royac6 kernel: md1: detected capacity change from 0 to 1024061145088

The raid is assembled. The "capacity change" I don't understand, maybe the disks are external?

Oct 23 07:44:13 royac6 systemd[1]: Found device /dev/disk/by-uuid/133b616a-1100-4278-86a7-9eb677783e9b.

Which disk is this one?

Oct 23 07:44:13 royac6 systemd[1]: Mounting /home...
Oct 23 07:44:13 royac6 kernel: EXT4-fs (md1): 1 orphan inode deleted
Oct 23 07:44:13 royac6 kernel: EXT4-fs (md1): recovery complete
Oct 23 07:44:13 royac6 kernel: EXT4-fs (md1): mounted filesystem with ordered data mode. Opts: discard
Oct 23 07:44:13 royac6 systemd[1]: Mounted /home.

Apparently it detects an error on the home filesystem and does recovery on it, before actually mounting it.

Shouldn't there be more log entries here? There is a hole, half a minute.

Oct 23 07:44:43 royac6 systemd[1]: Stopped Postfix Mail Transport Agent.
Oct 23 07:44:43 royac6 systemd[1]: Created slice system-mdadm\x2dlast\x2dresort.slice.
Oct 23 07:44:43 royac6 systemd[1]: Starting Activate md array even though degraded...
Oct 23 07:44:43 royac6 systemd[1]: Stopped NFS server and services.
Oct 23 07:44:43 royac6 systemd[1]: Stopping NFSv4 ID-name mapping service...
Oct 23 07:44:43 royac6 systemd[1]: Stopped NFS Mount Daemon.
Oct 23 07:44:43 royac6 systemd[1]: Stopped NFSv4 ID-name mapping service.
Oct 23 07:44:43 royac6 systemd[1]: Started Activate md array even though degraded.
Oct 23 07:44:43 royac6 systemd[1]: Stopped target Local File Systems.
Oct 23 07:44:43 royac6 systemd[1]: Unmounting /home...
Oct 23 07:44:43 royac6 systemd[1]: Stopped (with error) /dev/md1.
Oct 23 07:44:43 royac6 systemd[1]: Unmounted /home.

Yes, here it umounts /home because the array is degraded, but the reason of the degradation is missing.

Oct 23 07:44:44 royac6 systemd[1]: Stopped Timer to wait for more drives before activating degraded array..
Oct 23 07:44:44 royac6 systemd[1]: Found device /dev/disk/by-uuid/133b616a-1100-4278-86a7-9eb677783e9b.

Same device as before. What is it? And then mounts home again, possibly degraded.

Oct 23 07:44:44 royac6 systemd[1]: Mounting /home...
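
For what it's worth, putting out-of-order excerpts back in sequence and spotting holes like the half-minute one above can be done with a few lines of Python. This is only a rough sketch, not anything that was run on the real log. It assumes the classic syslog prefix "Mon DD HH:MM:SS" as seen in the lines above, an English/C locale for the month names, and a placeholder year (ASSUMED_YEAR), since the log carries none. The function names reorder and find_gaps and the 10-second threshold are made up for illustration.

```python
from datetime import datetime, timedelta

# Assumption: the log has no year, so a placeholder is supplied for parsing.
ASSUMED_YEAR = 2017


def parse_stamp(line):
    """Parse the 15-character 'Mon DD HH:MM:SS' prefix of a syslog line."""
    return datetime.strptime(f"{ASSUMED_YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")


def reorder(lines):
    """Sort lines by timestamp; sorted() is stable, so lines sharing
    the same second keep the relative order they were given in."""
    return sorted(lines, key=parse_stamp)


def find_gaps(lines, threshold=timedelta(seconds=10)):
    """Yield (previous_line, next_line, gap) wherever the log is
    silent for longer than the (hypothetical) threshold."""
    ordered = reorder(lines)
    for prev, cur in zip(ordered, ordered[1:]):
        gap = parse_stamp(cur) - parse_stamp(prev)
        if gap > threshold:
            yield prev, cur, gap


if __name__ == "__main__":
    # Three lines taken from the excerpt above, deliberately shuffled.
    sample = [
        "Oct 23 07:44:43 royac6 systemd[1]: Unmounting /home...",
        "Oct 23 07:44:13 royac6 systemd[1]: Mounted /home.",
        "Oct 23 07:44:09 royac6 kernel: sda: sda1",
    ]
    for line in reorder(sample):
        print(line)
    for prev, cur, gap in find_gaps(sample):
        print(f"gap of {gap} before: {cur}")
```

Sorting only by the one-second timestamp cannot recover the true order of messages logged within the same second, so a complete, untouched log is still the better input.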
So to sum up again: Kernel detects both disks/partitions, md properly fires up the RAID clean, mounts it and recovers from an orphaned inode. Then suddenly systemd decides that there is a disk missing and unmounts again, just to 'find' the RAID again directly after that.
We need to see the complete boot log, without "quiet", and also the list of disks.
lsblk --bytes --output NAME,KNAME,RA,RM,RO,SIZE,TYPE,FSTYPE,LABEL,PARTLABEL,MOUNTPOINT,UUID,PARTUUID,WWN,MODEL,ALIGNMENT /dev/sd?
You can post those to susepaste.org for a limited time, and post the link here.

-- 
Cheers / Saludos,
Carlos E. R.
(from 42.2 x86_64 "Malachite" at Telcontar)