RE: [opensuse] possible raid failure?
James wrote:

Feb 4 04:49:44 server kernel: end_request: I/O error, dev sda, sector 90593092
Feb 4 04:49:44 server kernel: Operation continuing on 1 devices
Feb 4 04:49:44 server kernel: RAID1 conf printout:
Feb 4 04:49:44 server kernel:  --- wd:1 rd:2
Feb 4 04:49:44 server kernel:  disk 0, wo:1, o:0, dev:sda2
Feb 4 04:49:44 server kernel:  disk 1, wo:0, o:1, dev:sdb2
Feb 4 04:49:44 server kernel: RAID1 conf printout:
Feb 4 04:49:44 server kernel:  --- wd:1 rd:2
Feb 4 04:49:44 server kernel:  disk 1, wo:0, o:1, dev:sdb2
<snip>
Is the raid failing? This is a new system.
Hi James, your RAID has failed. Check the output of 'cat /proc/mdstat' to make sure, but it seems pretty obvious.

Any way to mark the sector as bad and restart the raid? This is a single drive mirrored (RAID1), and when the RAID failed the system froze. I don't want that to happen again. What can I do to ensure that the system can ride through a RAID failure, and how can I have md send me an e-mail notifying me of the failure?

After a reboot mdstat shows:

# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid5] [raid4] [linear]
md0 : active raid1 sda1[0] sdb1[1]
      160512 blocks [2/2] [UU]
md1 : active raid1 sdb2[1]
      292872896 blocks [2/1] [_U]
<snip>

I don't see sda2 on md1, so I guess it's no longer mirrored. Any way to reactivate it while the RAID is still up?

Thank you,

James
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
Hi James, On Wed, 04 Feb 2009, 19:35:36 +0100, James D. Parra wrote:
[...] Any way to mark the sector as bad and restart the raid?
You should *immediately* run an intensive check on the drive. As root, run "smartctl -t long /dev/sda"; the test will take anywhere from several minutes to hours, depending on the size of the drive. Once it has finished, you can see the results by running "smartctl -a /dev/sda", which may also show the LBA of the failing block(s); there are references (use Google) explaining how to mark (and remap) the sector.
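For reference, a failed entry in the self-test log printed by "smartctl -a" carries the failing LBA in its last column. A minimal sketch of pulling that value out with awk — the log line below is illustrative sample output, not taken from James's actual drive:

```shell
# One sample line from a "smartctl -a" SMART Self-test log (illustrative only):
sample_log='# 1  Extended offline    Completed: read failure       90%      1254         90593092'

# The LBA of the first failing block is the last whitespace-separated field.
lba=$(printf '%s\n' "$sample_log" | awk '{print $NF}')
echo "first failing LBA: $lba"
# → first failing LBA: 90593092
```

That LBA is the number you would then feed to the bad-block remapping recipes mentioned above.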
This is a single drive mirrored (RAID1), and when the raid failed the system froze. I don't want that to happen again. What can I do to ensure that the system can ride through a RAID failure, and how can I have md send me an e-mail notifying me of the failure?
That's pretty easy: run "insserv /etc/init.d/mdadm" to enable the mdadm monitoring service at boot, and make sure that e-mail sent to "root@localhost" is delivered to an account someone actually reads.
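The monitoring service is driven by mdadm's monitor mode, which mails a configured address when an array degrades. A minimal sketch of the relevant configuration — the address is an example, and the exact file path and service name can differ between distributions:

```
# /etc/mdadm.conf -- monitoring-related entry (example value)
MAILADDR root@localhost    # where "mdadm --monitor" sends failure alerts
```

To confirm mail delivery actually works, you can run "mdadm --monitor --scan --test --oneshot" once as root; it sends a test message for each array and exits.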
After a reboot mdstat shows;
# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid5] [raid4] [linear]
md0 : active raid1 sda1[0] sdb1[1]
      160512 blocks [2/2] [UU]
md1 : active raid1 sdb2[1]
      292872896 blocks [2/1] [_U]
<snip>
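In that output, the "_" in "[_U]" marks the missing mirror half, and "[2/1]" means only one of two devices is active. A small self-contained sketch of spotting degraded arrays this way — here it parses a copy of James's output, whereas on a real system you would read /proc/mdstat directly:

```shell
# Copy of the mdstat output from the post (on a real system: cat /proc/mdstat).
mdstat='md0 : active raid1 sda1[0] sdb1[1]
      160512 blocks [2/2] [UU]
md1 : active raid1 sdb2[1]
      292872896 blocks [2/1] [_U]'

# An underscore in the device map (e.g. [_U]) marks a missing mirror half.
# Print the array name from the preceding "mdN : ..." line when one is found.
degraded=$(printf '%s\n' "$mdstat" | awk '/blocks/ && /_/ {print prev} {prev=$1}')
echo "degraded arrays: $degraded"
# → degraded arrays: md1
```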
I don't see sda2 on md1, so I guess it's no longer mirrored. Any way to reactivate it while the RAID is still up?
I'd not (yet) re-add it to the RAID; better to check the drive first as I wrote above. Afterwards you can re-add it with "mdadm /dev/md1 --add /dev/sda2".
Thank you,
James
HTH, cheers.

l8er
manfred
James D. Parra wrote:
Any way to mark the sector as bad and restart the raid?
It's more than a bad sector; the drive will take care of those itself. Without having examined the errors you posted in any detail, they look like some sort of interface error.
This is a single drive mirrored (RAID1) and when the raid failed the system froze.
That's odd.
I don't want that to happen again. What can I do to ensure that the system can ride through a RAID failure, and how can I have md send me an e-mail notifying me of the failure?
Look up mdadm. That's exactly what it does.
After a reboot mdstat shows;
# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid5] [raid4] [linear]
md0 : active raid1 sda1[0] sdb1[1]
      160512 blocks [2/2] [UU]
md1 : active raid1 sdb2[1]
      292872896 blocks [2/1] [_U]
<snip>
I don't see sda2 on md1 so I guess its no longer mirrored. Any way to reactivate it while the raid is still up?
mdadm --manage /dev/md1 --add /dev/sda2

It's odd that your RAID1 setup doesn't list sda2 as failed.
--
Per Jessen, Zürich (4.50°C)
participants (3)
- James D. Parra
- Manfred Hollstein
- Per Jessen