Mailinglist Archive: opensuse-support (97 mails)

Re: [opensuse-support] RAID1 disk pending sectors
On Thu, 8 Nov 2018 11:14:05 +0100
"Carlos E. R." <robin.listas@xxxxxxxxxxxxxx> wrote:
On 08/11/2018 06.07, Felix Miata wrote:
Felix Miata composed on 2018-11-07 06:39 (UTC-0500):

Carlos E. R. composed on 2018-11-07 10:35 (UTC+0100):

On 07/11/2018 04.48, Felix Miata wrote:

# journalctl -b -e
...
Nov 06 21:54:55 00srv smartd[937]: Device: /dev/sdc [SAT], 8 Currently unreadable (pending) sectors
# fdisk -l /dev/sdc; smartctl -x /dev/sdc
http://fm.no-ip.com/Tmp/Hardware/Disk/smartctlx-msi85-hgst1000.txt
shows Current_Pending_Sector raw value is 8 after only 1660
power on hours. :-(
...
Basically, your disk fails consistently on LBA 142446713, read
error. You have to find out what is there and rewrite that
sector.

# smartctl -t long /dev/sdc
is now running

I missed the section pointing to the LBA. 142446713 is on sdc8,
which happens to be md3, which is on /home. There remains to
identify the file or structure that uses 142446713.

That same section has this puzzling line:
#10 Short offline Completed without error 00% 43301 -
That's failure at a lifetime of 43301 hours on a disk that appears
to have only 1660 power on hours.

# cat /proc/mdstat
...
md3 : active raid1 sdb8[0] sdc8[1]
73727872 blocks super 1.0 [2/2] [UU]
bitmap: 1/1 pages [4KB], 65536KB chunk
...
# fdisk -l /dev/sdc
...
Device Start End Sectors Size Type
...
/dev/sdc8 61171712 208627711 147456000 70.3G Linux RAID
...

Shouldn't the following process force LBA 142446713 to be
reallocated?

fail sdc8 from md3
remove sdc8 from md3
dd if=/dev/zero of=/dev/sdc8 bs=32768
add sdc8 to md3

It seems it would be simpler than trying to figure out which file
or inode uses it.
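
The four steps listed above could look roughly like this with mdadm. This is a dry-run sketch that only prints the commands unless RUN=1 is set; the run and cycle_member helpers are hypothetical names, and the device names are the ones from this thread. Zeroing the member also wipes its md superblock, so the add is expected to trigger a full resync from sdb8.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands by default. Set RUN=1 to execute them.
# Zeroing the member destroys its copy of the data; the array then relies on
# sdb8 alone until the resync completes.
MD=/dev/md3
MEMBER=/dev/sdc8
RUN=${RUN:-0}

# Hypothetical helper: execute the command when RUN=1, otherwise print it.
run() {
    if [ "$RUN" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

cycle_member() {
    run mdadm "$MD" --fail "$MEMBER"
    run mdadm "$MD" --remove "$MEMBER"
    # dd stops by itself once the partition is full.
    run dd if=/dev/zero of="$MEMBER" bs=32768
    run mdadm "$MD" --add "$MEMBER"
}

cycle_member
```

While the resync runs, sdb8 holds the only good copy, so a read error on sdb8 during that window would not be recoverable from the array.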

You do not need the file or inode, you already have the LBA. Read that
sector to a file. If it fails, use dd-rescue. Then write back that
LBA.

Being a raid, you may get the sector from sdb8, if the partition
layout on sdb is the same.

dd if=/dev/sdc of=lbasectorbackup bs=512 count=1 skip=142446713
dd if=lbasectorbackup of=/dev/sdc bs=512 count=1 seek=142446713

Err, shouldn't that be

dd if=/dev/sdb of=lbasectorbackup bs=512 count=1 skip=142446713
dd if=lbasectorbackup of=/dev/sdc bs=512 count=1 seek=142446713

But I agree with the principle. There's no point in replacing a block
in one copy with zeros. It's then Russian roulette for ever more
whether that file is corrupt or not when you read it. (Unless the
filesystem has checksums or suchlike, of course). The RAID won't help.
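
The single-sector copy from the good mirror to the bad one can be tried safely on scratch image files first. The sketch below assumes nothing about the real disks; copy_sector is a hypothetical helper and the two .img files stand in for sdb and sdc. Note that dd positions the read with skip= and the write with seek=.

```shell
#!/bin/sh
# copy_sector SRC DST LBA: copy one 512-byte sector from SRC to DST at the
# same LBA. skip= positions the read, seek= positions the write, and
# conv=notrunc keeps dd from truncating DST when it is a regular file.
copy_sector() {
    dd if="$1" of="$2" bs=512 count=1 skip="$3" seek="$3" conv=notrunc 2>/dev/null
}

# Demonstration on two scratch image files, not on real disks.
tmp=$(mktemp -d)
good=$tmp/good.img
bad=$tmp/bad.img
dd if=/dev/urandom of="$good" bs=512 count=16 2>/dev/null
cp "$good" "$bad"
# Damage sector 5 of the "bad" copy by overwriting it with zeros.
dd if=/dev/zero of="$bad" bs=512 count=1 seek=5 conv=notrunc 2>/dev/null

copy_sector "$good" "$bad" 5
if cmp -s "$good" "$bad"; then
    echo "sector 5 restored, images identical"
fi
# Remove the scratch directory "$tmp" when done.
```

On the real devices the same LBA is only the right source sector if sdb and sdc have identical partition layouts, as noted earlier in the thread.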

I think it is that, verify.

Yes, being a raid you can also overwrite the entire disk.