Mailinglist Archive: opensuse (626 mails)

Re: [opensuse] Login weirdness
On 02/11/2018 18.40, Liam Proven wrote:
On 02/11/2018 15:57, Carlos E. R. wrote:
All hard disks develop errors. Operating systems know that, and can mark
the bad sectors in order to just not use them. Modern disks (for years
now) can remap bad sectors to spare sectors that have been reserved for
that purpose since manufacture. The firmware does this automatically
when the bad sector is written to.

This is true. All SATA hard disks do this, and all later *E*IDE hard
disks did too.

(SSDs are different and weird.)

This parameter says that there
are a number of sectors that have not been remapped.

To me, that is a danger sign. I don't know exactly what it means or why
but it's worrying.


Along with this, notice that there are several "extended offline"
tests that did not complete, all at the same LBA. I would rewrite that LBA.
[...]

I am afraid I must disagree.

You can also run "badblocks" on that disk[...]

OK, I must disagree more.

The first hard disk I bought came with a bad sector list printed on
paper by the manufacturer. All disks came with that at that time. It was
a 30 megabyte disk, huge at that time.

Well, the first one I used at work, yes. 20 MB and I added a 2nd 15MB
drive to the machine for SCO Xenix 286.

But times have changed a lot. The last new disk I bought was a 1TB
notebook hard disk, a quarter the size of a deck of cards.

To store an equivalent amount of data, you would need a pile of those
Conner 225 ST-506 interface drives that I mentioned just now, a pile the
size of (and a *lot* heavier than) a Space Shuttle.

This shook me at the time.

All hard drives have some bad sectors, it's true. Most develop more
during their operational life, also true.

But they have a pretty large reserved area (maybe 10-15%; it varies a
lot with model, and makers don't like to disclose it. More than 1% of
blocks, anyway), and failed blocks are replaced from the spare blocks.

This remapping is normal and invisible. The OS never knows there was a
read error, it's just switched on the fly.

This is what I have been saying :-)

If the OS can see errors, that means that either [a] the disk's
replacement blocks are used up, meaning it has millions of bad blocks,
or [b] the disk is defective in some other way.

Well, no. :-)

This parameter:

  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0

is precisely how much of that area has been used. A value of "100" means
none of it; the threshold, "010", means "emergency", replace ASAP. The
normalized numbers in SMART count down. The "raw" number is "0", which
is, we suppose, the plain count that humans would use.
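
To make the columns concrete, here is a minimal Python sketch that splits
such a line into its fields and compares the normalized value against the
threshold. It assumes the usual "smartctl -A" column order (ID, name,
flag, value, worst, thresh, type, updated, when_failed, raw); the names
SmartAttribute, parse_smart_line and at_or_below_threshold are invented
here for illustration and are not part of any tool.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int    # normalized value, counts down as the drive degrades
    worst: int    # lowest normalized value seen so far
    thresh: int   # vendor threshold for this attribute
    raw: str      # vendor-specific raw value, kept as text


def parse_smart_line(line: str) -> SmartAttribute:
    """Parse one attribute row, assuming the 10-column smartctl -A layout."""
    fields = line.split(None, 9)  # the raw column may itself contain spaces
    if len(fields) < 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}")
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        thresh=int(fields[5]),
        raw=fields[9].strip(),
    )


def at_or_below_threshold(attr: SmartAttribute) -> bool:
    """True when the normalized value has dropped to the threshold."""
    return attr.value <= attr.thresh


line = "  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0"
attr = parse_smart_line(line)
print(attr.name, attr.value, attr.thresh, attr.raw, at_or_below_threshold(attr))
```

For the line quoted above, the normalized value (100) is well above the
threshold (10) and the raw count is 0, so nothing has been reallocated yet.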

*The remapping only happens during sector write.*

If the sector belongs to a file which is never written, the error
remains forever, not mapped out. That is the main reason that the count
"Current_Pending_Sector" goes up.


Which is why the user must force a write to the bad LBA to get it
mapped out.
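
As a rough illustration of what "force a write to the bad LBA" amounts
to, the Python sketch below turns an LBA into a byte offset and overwrites
exactly one sector with zeros. It is only a sketch under assumptions: a
512-byte logical sector size, a helper name (overwrite_sector) made up
for this example, and a scratch file standing in for the disk. Pointing
it at a real block device would need root and would destroy whatever
data was in that sector; on drives with larger physical sectors the
whole physical sector may need to be rewritten instead.

```python
import os
import tempfile


def overwrite_sector(path: str, lba: int, sector_size: int = 512) -> int:
    """Write one sector of zeros at the given LBA; returns bytes written.

    sector_size=512 is an assumption; check the device's logical sector size.
    """
    offset = lba * sector_size
    fd = os.open(path, os.O_WRONLY)
    try:
        written = os.pwrite(fd, bytes(sector_size), offset)
        os.fsync(fd)  # push the write out of the page cache
    finally:
        os.close(fd)
    return written


# Demonstration on a scratch file, NOT on a real disk.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xff" * 4096)  # eight pretend sectors filled with 0xff
    demo_path = tmp.name

written = overwrite_sector(demo_path, lba=3)

with open(demo_path, "rb") as f:
    data = f.read()
os.remove(demo_path)

print(written, data[3 * 512:4 * 512] == bytes(512))
```

Only the sector at LBA 3 is touched; its neighbours keep their contents.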

--
Cheers / Saludos,

Carlos E. R.
(from 42.3 x86_64 "Malachite" at Telcontar)
