Linda Walsh <suse@tlinx.org> wrote:
Greg Freemyer wrote:
Linda Walsh <suse@tlinx.org> wrote:
Somehow this topic seems to have migrated from how to do a disk-to-disk copy without using the command line (and so many of us tried to tell him the command line is by far the best tool for something this simple), to dealing with bad sectors on a source disk... which, fortunately for me, is a rare situation.
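(For reference, the sort of command-line copy being talked about looks something like the sketch below; the device names are placeholders, and conv=noerror,sync lets dd keep going past unreadable blocks, padding them with zeros:

    # copy an entire source disk onto a same-size-or-larger target disk
    # /dev/sdX = source, /dev/sdY = target -- double-check with "lsblk" first!
    dd if=/dev/sdX of=/dev/sdY bs=64K conv=noerror,sync
)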
Linda,
I do this procedure as part of my day job.
My condolences! ;-)
(That's why I packaged ewfacquire; I use it routinely.)
I am sure it is better than 'dd' for dealing with bad disks.
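(For anyone who hasn't used it, a minimal ewfacquire run is just pointed at the source device and then you answer its interactive prompts; the exact prompts and options vary by libewf version, so treat this as a sketch:

    # image /dev/sdX into an EWF (E01) evidence file set,
    # answering the interactive prompts for case/evidence details
    ewfacquire /dev/sdX
)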
The "subject drives" which I read from are a random collection of customer drives. They can be a almost new drive in a new machine, all the way to a 10 year old drive in a computer sitting out in a shed that was almost forgotten about. Most are from desktop/laptop PCs a few years old and routinely in use.
I don't keep stats, but I would guess between 5 and 10% of them have at least one bad sector. Having a significant number of bad sectors I agree is rare, but having one or two I would say is almost routine.
---- Um, are you saying a typical user should expect to see 1-2 disk errors on a disk->disk copy?
I'm saying 90% of the time they should see 0 bad sectors, but 1 or 2 the other 10% of the time is expected and normal.
Isn't it fair to say that many consumer-level drives not only have 1-2 disk errors, but already have such sectors remapped to per-track spares when new? For that matter, if a drive isn't nearing its "end of useful life", would you expect users to actually see or notice such an error -- or wouldn't it be handled by the drive's internal firmware, with recovery via internal ECC and remapping all handled on the fly? Isn't, by 'SMART' standards, a drive at the end of its useful lifespan when it can no longer relocate such data automatically?
You don't seem to understand the bad sector life cycle. Here you go:
- The sector's magnetism becomes degraded.
- Time passes (hours, days, years).
- A read of that specific sector occurs.
- A read error is returned to the userspace app and SMART marks the sector bad (a "pending" sector).
- All subsequent reads keep attempting the same bad sector; they typically fail as well.
- Time passes (seconds, days, years).
- A write to the bad sector occurs.
- The drive controller notes this sector is bad and decides it is time to remap it.
- The sector is remapped and the new data is written.
- Reads now succeed.
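You can watch both halves of that life cycle from the command line: SMART attribute 197 (Current_Pending_Sector) counts sectors stuck in the "read fails" stage, and attribute 5 (Reallocated_Sector_Ct) counts the ones that have been remapped after a write. A rough sketch, with a placeholder device and sector number (and note the write destroys whatever was in that sector):

    # show the pending vs. reallocated counters
    smartctl -A /dev/sdX | grep -E 'Reallocated_Sector|Current_Pending'

    # try to read a suspect sector directly (fails while it is pending)
    hdparm --read-sector 123456789 /dev/sdX

    # overwrite that one sector; the drive remaps it if it is bad
    hdparm --write-sector 123456789 --yes-i-know-what-i-am-doing /dev/sdX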
Isn't it *normally* the case that a user will only see disk errors on a drive that can no longer remap sectors?
No, see the life cycle and the two different places an error can sit for years.

Most RAID1/5/6 solutions run scrubbers once a month or so to detect and correct these failures without degrading the RAID array. I.e., they force a read of every sector of the array. When an error is found they use the RAID correction logic to calculate what the data should have been and write the valid data back out. It is the write that triggers the remap. Thus a freshly scrubbed RAID array should have 0 bad sectors, but a failure may occur before the next scrub. That is why you want to run a scrub every month or so.

Note that for the RAID scenario you don't want the drive to automatically retry failed reads. Instead you want it to try exactly once and fail immediately if there is a problem. That lets the RAID solution handle it. Most will immediately write out the calculated data and thus fix the bad sector as soon as it is detected. Therefore a RAID-edition SATA drive is actually less reliable on its own than a consumer-edition drive; the only difference is the firmware. It is the RAID solution that makes the overall system more robust. That means if you are buying drives for a RAID, then definitely buy a RAID-edition drive, but if it is for standalone use avoid RAID-edition drives: they don't have the auto-retry logic built in.

Back to consumer systems. If a sector holds changing data, then the error will be found and corrected quickly, but assume it is part of unallocated space near the end of the drive. With a modern 500 GB disk, many of the sectors at the end of the drive will NEVER be read or written in the life of the drive under normal conditions. Bad sectors there just sit in a failed state forever.
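With Linux md RAID, that kind of scrub can also be kicked off by hand; a minimal sketch, assuming an array named md0:

    # scrub the array: force a read of every sector
    echo check > /sys/block/md0/md/sync_action

    # watch progress; unreadable sectors found during the scrub are
    # rebuilt from the redundant copy and rewritten (forcing a remap)
    cat /proc/mdstat

    # count of parity/mirror mismatches seen by the last check
    cat /sys/block/md0/md/mismatch_cnt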
In fact, I think the spec for new drives is no more than one bad sector in 10^10. 500 GB drives have about a billion sectors, so even brand new drives having a bad sector 1 time in 10 would be in spec.
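Spelling that arithmetic out (assuming 512-byte sectors and reading the spec as one bad sector per 10^10 sectors):

    500 GB / 512 bytes per sector            ~= 1e9 sectors per drive
    1e9 sectors * (1 bad sector / 1e10)       = 0.1 expected bad sectors per drive
    => roughly 1 drive in 10 could ship with a bad sector and still meet the spec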
---- But you are talking raw sectors -- not formatted capacity, no? Wouldn't the MTBF of, say, a new 5-year-warranty Hitachi 4 TB drive rated at 1-2 million hours (for the DeskStar vs. Ultrastar models) sort of imply that most users will never see a disk error during the useful life of that disk?
No. Most new drives today do have zero bad sectors, but run SMART on a several-year-old drive and you will rarely find one with zero remapped sectors. For a sector to be remapped, it had to report bad at some point. That means a media error was reported at least once for every remapped sector reported by SMART. (Remember the drive and kernel have retry logic, so the error may not propagate to user space; even a tool like dd won't see all the bad sectors.)

Greg