On 2017-07-11 23:16, Greg Freemyer wrote:
On Tue, Jul 11, 2017 at 4:45 PM, Carlos E. R. <> wrote:
On 2017-07-11 22:35, Dave Howorth wrote:
On Tue, 11 Jul 2017 22:18:03 +0200 "Carlos E. R." <> wrote:
For those who don't know, a desktop drive is "within spec" if it returns one soft read error per 10^14 bits read, which is roughly 12 TB. In other words, read a 6TB drive end-to-end twice, and the manufacturer says "if you get a read error, that's normal". But it will cause an array to fail if you haven't set it up properly ...
What would one do to set them up properly? :-?
You need to set up the timeouts in Linux to be longer than the ones imposed by the firmware on the drive. I'm sorry, but I don't remember whether it's a kernel thing or an mdadm thing. I expect the Linux raid wiki knows.
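If I remember right, the kernel side of it is the per-device SCSI command timer in sysfs, something like this (sdb here is only an example device name):

   # the default command timer is 30 seconds
   cat /sys/block/sdb/device/timeout
   # raise it so the kernel outlasts a desktop drive's internal retries,
   # e.g. to 180 seconds
   echo 180 > /sys/block/sdb/device/timeout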
Wow :-(
That can be minutes.
That's actually a big difference between "set up properly" and not.
A drive used in a raid-1/5/6 should be set to fail fast instead of retrying for a minute or two.
Drives designed for use in a raid array will come from the factory that way.
If you're using a desktop drive, you really need to try to set the retry time low.
Where do you do that? "hdparm", perhaps? I just had a look at the man page but didn't locate a setting for it. I don't use raid in production, so to speak; I prefer backups for my use case. But I like learning :-)
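If I remember right it's smartctl rather than hdparm: drives that support SCT Error Recovery Control let you cap the retry time, roughly like this (sdb is only an example, and the 7-second value is just a common choice):

   # show the current setting, if the drive supports SCT ERC
   smartctl -l scterc /dev/sdb
   # cap read and write error recovery at 7.0 seconds (values are in tenths)
   smartctl -l scterc,70,70 /dev/sdb

On most drives the setting doesn't survive a power cycle, so it has to be reapplied at boot.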
Then, in addition, you should be running a scrub routinely. That reads all the sectors on the physical media, looking for bad ones. If it finds any, it recreates the data from the other drives and rewrites the sector. Hopefully that fixes it.
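With Linux md the scrub is driven through sysfs, roughly like this (md0 is only an example array name):

   # "check" reads every sector; unreadable blocks are rebuilt from the
   # other devices and rewritten, data mismatches are only counted
   echo check > /sys/block/md0/md/sync_action
   cat /proc/mdstat                       # progress
   cat /sys/block/md0/md/mismatch_cnt     # mismatches found by the last check
   # "repair" additionally rewrites mismatched blocks:
   # echo repair > /sys/block/md0/md/sync_action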
Well, a surface test like the one triggered by "smartctl" could do that. If not, I could use "badblocks" on the entire disk: I have reason to believe the firmware relocates sectors when running it. More than once I have seen errors listed by SMART (bad sectors); I try to locate the sectors with badblocks, and they disappear.
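What I run is roughly this (sdb stands in for the real device; the plain badblocks run is read-only, while the -n run writes test patterns and then restores the original data):

   # long SMART self-test, run by the drive itself
   smartctl -t long /dev/sdb
   # check the result and the sector counters afterwards
   smartctl -a /dev/sdb | grep -i -e self-test -e reallocated -e pending
   # read-only surface scan of the whole disk
   badblocks -sv /dev/sdb
   # non-destructive read-write pass; writing is what lets the firmware
   # actually remap a pending sector
   badblocks -nsv /dev/sdb

--
Cheers / Saludos,
Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)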