Mailinglist Archive: opensuse-support (97 mails)

Re: [opensuse-support] RAID1 disk pending sectors
On 08/11/2018 12.41, Dave Howorth wrote:
On Thu, 8 Nov 2018 11:14:05 +0100
"Carlos E. R." <robin.listas@xxxxxxxxxxxxxx> wrote:
On 08/11/2018 06.07, Felix Miata wrote:
Felix Miata composed on 2018-11-07 06:39 (UTC-0500):

Carlos E. R. composed on 2018-11-07 10:35 (UTC+0100):

On 07/11/2018 04.48, Felix Miata wrote:

# journalctl -b -e
Nov 06 21:54:55 00srv smartd[937]: Device: /dev/sdc [SAT], 8 Currently unreadable (pending) sectors
# fdisk -l /dev/sdc; smartctl -x /dev/sdc
shows Current_Pending_Sector raw value is 8 after only 1660 power on hours. :-(
Basically, your disk fails consistently on LBA 142446713, read
error. You have to find out what is there and rewrite that

# smartctl -t long /dev/sdc
is now running

I missed the section pointing to the LBA. 142446713 is on sdc8,
which happens to be md3, which is on /home. There remains to
identify the file or structure that uses 142446713.

That same section has this puzzling line:
#10 Short offline Completed without error 00% 43301 -
That's failure at a lifetime of 43301 hours on a disk that appears
to have only 1660 power on hours.

# cat /proc/mdstat
md3 : active raid1 sdb8[0] sdc8[1]
73727872 blocks super 1.0 [2/2] [UU]
bitmap: 1/1 pages [4KB], 65536KB chunk
# fdisk -l /dev/sdc
Device Start End Sectors Size Type
/dev/sdc8 61171712 208627711 147456000 70.3G Linux RAID
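The fdisk figures above are enough to place the failing LBA inside the partition. A minimal sketch of that arithmetic follows; the variable names are only illustrative, and the values are copied from the smartctl and fdisk output quoted in this thread.

```shell
#!/bin/sh
lba=142446713          # failing LBA reported by smartctl
part_start=61171712    # Start of /dev/sdc8 from fdisk -l
part_end=208627711     # End of /dev/sdc8 from fdisk -l

# The LBA belongs to the partition only if it lies between Start and End.
if [ "$lba" -ge "$part_start" ] && [ "$lba" -le "$part_end" ]; then
    offset=$((lba - part_start))   # sector number relative to the partition
    echo "LBA $lba is sector $offset of sdc8"
else
    offset=-1
    echo "LBA $lba is outside sdc8"
fi
```

The offset is counted in 512-byte sectors from the start of sdc8, so it can be used directly with dd against the partition device instead of the whole disk.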

Shouldn't the following process force LBA 142446713 to be

fail sdc8 from md3
remove sdc8 from md3
dd if=/dev/zero of=/dev/sdc8 bs=32768
add sdc8 to md3
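Spelled out with mdadm, the four steps above might look like the sketch below. It is a dry run that only prints the commands; the `run` variable and the `plan` function are hypothetical helpers, and the device names are the ones used in this thread.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# Set run="" only after double-checking the device names, because
# the dd step overwrites the whole member partition.
md=/dev/md3
member=/dev/sdc8
run=echo

plan() {
    $run mdadm "$md" --fail "$member"            # mark the member faulty
    $run mdadm "$md" --remove "$member"          # detach it from the array
    $run dd if=/dev/zero of="$member" bs=32768   # rewrite every sector
    $run mdadm "$md" --add "$member"             # re-add, triggering a resync
}

plan
```

When run for real, dd stops with a "No space left on device" message once it reaches the end of the partition, which is expected. Because the zeroing also wipes the md superblock on sdc8, the re-add should result in a full resync from sdb8.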

It seems it would be simpler than trying to figure out which file
or inode uses it.

You do not need the file or inode; you already have the LBA. Read that
sector to a file. If it fails, use dd-rescue. Then write back that

Being a raid, you may get the sector from sdb8, if the partition
layout on sdb is the same.

dd if=/dev/sdc of=lbasectorbackup bs=512 count=1 skip=142446713
dd if=lbasectorbackup of=/dev/sdc bs=512 count=1 seek=142446713

Err, shouldn't that be

dd if=/dev/sdb of=lbasectorbackup bs=512 count=1 skip=142446713
dd if=lbasectorbackup of=/dev/sdc bs=512 count=1 seek=142446713
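One detail worth checking in any such pair of commands: dd positions the input with skip= and the output with seek=, so the read needs skip= and the write needs seek=. The sketch below tries this on two small scratch files standing in for the good and bad disks; the file contents and the small LBA are invented for the demonstration.

```shell
#!/bin/sh
# Scratch files stand in for the two mirror halves; no real disk is touched.
good=$(mktemp); bad=$(mktemp); sector_file=$(mktemp)
lba=5   # hypothetical sector number for the demo

# 16 sectors of 'A' in the good image, 16 sectors of 'B' in the bad one.
head -c 8192 /dev/zero | tr '\0' 'A' > "$good"
head -c 8192 /dev/zero | tr '\0' 'B' > "$bad"

# Read one sector: skip= counts blocks on the INPUT side.
dd if="$good" of="$sector_file" bs=512 count=1 skip="$lba" 2>/dev/null
# Write it back: seek= counts blocks on the OUTPUT side.
# conv=notrunc stops dd from truncating a regular file after the write.
dd if="$sector_file" of="$bad" bs=512 count=1 seek="$lba" conv=notrunc 2>/dev/null

# Count how many bytes of the bad image now come from the good one.
patched=$(tr -d 'B' < "$bad" | wc -c | tr -d ' ')
size=$(wc -c < "$bad" | tr -d ' ')
first=$(head -c 2560 "$bad" | tr -d 'B' | wc -c | tr -d ' ')
echo "patched=$patched size=$size bytes_before_lba_changed=$first"

rm -f "$good" "$bad" "$sector_file"
```

Exactly one 512-byte sector changes and the file keeps its size. Using seek= on the read instead would leave the input at sector 0 and copy the wrong sector.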

Yes, if sdc cannot be read. I mentioned using ddrescue.

But I agree with the principle. There's no point in replacing a block
in one copy with zeros. It's then Russian roulette for ever more
whether that file is corrupt or not when you read it. (Unless the
filesystem has checksums or suchlike, of course). The RAID won't help.

In this case the raid does protect the data. The sdb side should be
correct, only the sdc side has an error. Thus removing sdc, erasing it
whole, and adding it back to the raid should work, as it will restore
from sdb.

Alternatively, just copy the data sector from sdb into sdc, assuming
there is no raid metadata involved. I did not think of that part.
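Whether raid metadata is involved can be estimated from figures already quoted in this thread. The sketch below assumes that with a 1.0 superblock the data area starts at the beginning of the member partition and the metadata sits at the end, and that the "blocks" figure in /proc/mdstat is in 1 KiB units; the variable names are illustrative.

```shell
#!/bin/sh
md_blocks_kib=73727872     # "blocks" for md3 in /proc/mdstat (1 KiB units)
part_sectors=147456000     # Sectors of /dev/sdc8 from fdisk -l
part_start=61171712        # Start of /dev/sdc8 from fdisk -l
lba=142446713              # failing LBA reported by smartctl

data_sectors=$((md_blocks_kib * 2))          # data area in 512-byte sectors
reserved=$((part_sectors - data_sectors))    # tail left for md metadata
offset=$((lba - part_start))                 # bad sector relative to sdc8

if [ "$offset" -lt "$data_sectors" ]; then
    where=data
else
    where=metadata
fi
echo "data_sectors=$data_sectors reserved=$reserved offset=$offset region=$where"
```

Under those assumptions the bad sector lies well inside the mirrored data area, so the same partition-relative offset on sdb8 should hold the good copy. On the whole-disk device the matching sector is the start of sdb8 plus that offset, which equals the sdc LBA only if both partitions start at the same sector.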

Cheers / Saludos,

Carlos E. R.

(from openSUSE 15.0 (Legolas))
