Mailinglist Archive: opensuse (911 mails)

Re: [opensuse] strange mdraid problem
On 12/07/2015 11:52 AM, Istvan Gabor wrote:
OK. How do I know that only one device is visible? I have several arrays and
both devices are visible and assembled in the resynched group. What you're
saying is that in case of ~10 arrays (all have been resynched after the
failure) both devices are always visible (/dev/sda* and /dev/sdb*) but in case
of not synched (and only in not synched) either /dev/sda* or /dev/sdb* is
visible alternatively at different boots. How can I confirm that this causes
the problem?

Istvan,

I had a similar issue with a flaky disk controller. I still do not know exactly how it happened, but apparently on one boot the array came up in degraded mode without seeing the other disk, and it kept writing to the good disk as it normally would. On the next boot the other disk reappeared and mdraid was stuck: the metadata on both disks said they were fine, and when the event counts are not that far apart, it doesn't know which is the good disk. It should kick one out and continue on the one with the most recent event count.
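
One way to check for this situation (a minimal sketch, not something I ran at the time) is to compare the event counters stored in each member's superblock. `mdadm --examine` prints an `Events` line per device. The helper names `events_of` and `compare_events` are made up for illustration, and /dev/sda5 and /dev/sdb5 are only assumed to be the two halves of the mirror:

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of mdadm.

# Read `mdadm --examine` output on stdin and print the value of the
# "Events" line from the member's superblock.
events_of() {
    awk -F: '/^[ \t]*Events[ \t]*:/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Print the event counter of every member device given as an argument.
# Needs root, because mdadm reads the superblock from the raw device.
compare_events() {
    for dev in "$@"; do
        printf '%s %s\n' "$dev" "$(mdadm --examine "$dev" | events_of)"
    done
}

# Example (run as root; device names are assumptions):
#   compare_events /dev/sda5 /dev/sdb5
```

If the two counters differ, the member with the lower one is most likely the stale copy. Comparing the `Update Time` lines in the same `--examine` output should tell the same story.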

To recover, 'fail' and 'remove' the bad device (or the one mdraid thinks is bad). Make sure you fail the *correct* partition, e.g.:

# mdadm /dev/md1 --fail /dev/sdb5 --remove /dev/sdb5
mdadm: set device faulty failed for /dev/sdb5: No such device

*note:* since mdadm has already kicked the drive, you will receive the 'No such device' warning above (this is normal).

Then re-'add' the device:

# mdadm /dev/md1 --add /dev/sdb5
mdadm: re-added /dev/sdb5

That will start the resync. Good luck.
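
To watch the rebuild, the kernel reports progress in /proc/mdstat. The sketch below only filters that file for the progress line. `resync_status` is a hypothetical helper name, and the optional file argument is there only so it can be tried on a saved copy:

```shell
#!/bin/sh
# Hypothetical helper: print the resync/recovery progress lines from
# /proc/mdstat (or from a saved copy passed as the first argument).
resync_status() {
    grep -E 'resync|recovery' "${1:-/proc/mdstat}" || echo "no resync in progress"
}

# Example:
#   resync_status
#   mdadm --detail /dev/md1    # shows per-device state; needs root
```

Once the progress line disappears and /proc/mdstat shows both members for the array again, the resync should be complete.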

--
David C. Rankin, J.D.,P.E.