Re: [SLE] Preventing fs errors -with the e2fsck command? and SMART, e2fsck confusion

13 Jan 2005

      The Thursday 2005-01-13 at 15:14 +0200, Hylton Conacher (ZR1HPC) wrote:
...
...
...
My understanding is that although SMART will check the physical disk,
e2fsck will check that the data is able to be written to the physical
hdd, and of course according to the fs. So therefore even though I have
SMART enabled, there might still be a case where data cannot be written
to the hdd, resulting in a failed fsck on that partition on bootup.
fsck tests the partition logically, not physically. It can also run a
bad block check (on some filesystem types), but it is certainly not as
complete in that respect as SMART.
...
Just to clarify, would a bad block be a physical defect or a logic error
ie the fs thinks the physical media is bad but it isn't?
Does SMART technology take care of looking after the physical state of
the disk, bad blocks included?
Ok, it goes like this: 1) There is a physical error in the media, meaning
that what the kernel tries to write is not the same as what it reads back. 
2) "Something" marks the block(s) containing that error as bad.

	Note: the reiserfsck program since SuSE 9.? can also mark
	badblocks. Version 8.2 could not.

By logical error I mean some kind of indexing error, data that is not
where it should be, etc. A software error, like the kernel failing to write
something to disk. These errors are detected and corrected only by fsck.
...
IF SMART doesn't check for bad blocks,
The program smartctl triggers a self-test routine that resides in the
drive's own firmware, and that does detect bad blocks (depending on the
manufacturer). It does not repair them. However, a modern hard disk can
remap bad blocks to spare sectors reserved by the manufacturer. This is
transparent to the OS, but it is triggered only when writing to a sector
that is detected at that moment as not reliably writable.
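
As a rough illustration of the above: the drive's self-tests are started
and read back with smartctl, and the number of sectors the drive has
already remapped is usually reported as SMART attribute 5
(Reallocated_Sector_Ct). The device name /dev/hda and the helper function
name below are assumptions made for the example, and not every
manufacturer reports this attribute.

```shell
# Start the drive's own long self-test, then read the results later
# (run as root; /dev/hda is a placeholder for your disk):
#   smartctl -t long /dev/hda
#   smartctl -l selftest /dev/hda

# Hypothetical helper: print the raw Reallocated_Sector_Ct value
# from "smartctl -A" output supplied on stdin.
reallocated_count() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}

# Usage:
#   smartctl -A /dev/hda | reallocated_count
```

A raw value that keeps growing suggests the disk is using up its spare
sectors, which is a good moment to check the backups.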
...
then in theory fsck should check
for bad blocks
By default, it does not.
...
as logic would say writing data to those bad blocks will
result in data loss? Bad block checking can be implemented on an ext3 fs
with e2fsck but I wonder why the bootup fsck doesn't do a bad blocks check?
Because there is no need, and because it is terribly time consuming, a
matter of several hours. You are too paranoid about them, I think :-)
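
For the rare occasion when a surface scan is really wanted, a minimal
sketch of a manual run follows. The -c option makes e2fsck call
badblocks for a read-only scan and add anything it finds to the bad block
list; -f forces the check even if the filesystem looks clean. The function
name and the MOUNTS override are hypothetical, and the partition must not
be mounted.

```shell
# Hypothetical wrapper: forced check plus read-only bad block scan
# of an UNMOUNTED ext2/ext3 partition. Run as root.
scan_badblocks() {
    dev=$1
    # MOUNTS is an assumed override so the guard can be tried against a
    # scratch file; by default the kernel's mount table is consulted.
    if grep -q "^$dev " "${MOUNTS:-/proc/mounts}"; then
        echo "$dev is mounted, refusing to scan" >&2
        return 1
    fi
    e2fsck -f -c "$dev"
}

# Usage (placeholder device name):
#   scan_badblocks /dev/hda2
```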
...
mmmm, Running the following: man fsck.ext3 brings up the e2fsck man page
Yes.
...
...
...
I would like to run the e2fsck command to prevent the failure of
partition checking by fsck on bootup as reading the man page on fsck it
does not seem up to working on an ext3 fs.
Then just force a check during boot, by creating the file "/forcefsck". An
ext3 partition will be checked to the needed level, not more. Doing a
bad block check every time is overkill, and will not really protect your
data.
That '/forcefsck' option is a little strong
Why?

It just tells the '/etc/init.d/boot.localfs' script that you want to check
the filesystems regardless of whether it is needed or not. It doesn't do
anything "drastic".
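
In practice that amounts to creating an empty file and rebooting. A small
sketch, where the function name and the optional directory argument are
assumptions added only so it can be tried against a scratch directory
first:

```shell
# Hypothetical helper: create the flag file that asks the boot scripts
# for a full filesystem check. With no argument it creates /forcefsck.
request_boot_fsck() {
    touch "${1:-}/forcefsck"
}

# On the real system, as root:
#   request_boot_fsck
#   shutdown -r now
```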
...
but see the next paragraph
for my suggestion. Why have the 'bad block' option and why will it not
protect my data? Surely it will make sure that data is not lost because
the block has been marked as bad and therefore the data will be written
to a good block?
It only detects which sectors are bad at that moment. What if the error
develops later, while the system is running? In fact, the HD is more
vulnerable while running, because the heads are not parked, but flying at
a very small distance from the HD surface.
...
...
For a somewhat more complete check, boot from the rescue CD and test from
there.
I was thinking more along the lines of possibly aliasing the boot fsck
to e2fsck and having it run e2fsck each time the fsck is supposed to run
on a partition set with the tune2fs cmd ie every 3rd mount or 15 days etc.
fsck calls e2fsck for you. Playing with that is dangerous, because at some
point you may have a differently formatted partition and apply the wrong
program.
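
The check schedule mentioned in the question lives in the filesystem's
superblock, so it can be inspected and changed with tune2fs without
touching the boot scripts. A sketch, with /dev/hda2 as a placeholder
device and check_schedule as a hypothetical helper name:

```shell
# Hypothetical helper: keep only the schedule-related lines of
# "tune2fs -l" output supplied on stdin.
check_schedule() {
    grep -E '^(Mount count|Maximum mount count|Last checked|Check interval|Next check after):'
}

# Usage (as root, placeholder device name):
#   tune2fs -l /dev/hda2 | check_schedule
#
# To check every 3rd mount or every 15 days, whichever comes first:
#   tune2fs -c 3 -i 15d /dev/hda2
```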

Look: SuSE people are quite expert and wise, and they have designed those 
scripts with a lot of thought and care. You really do not have to modify 
them.

If you are worried about bad blocks, do: 1) configure smartd (the
smartmontools daemon) to run tests periodically in the background. 2) Keep
your backups current. 3) If you really need it, use RAID setups.
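
For point 1, periodic self-tests are scheduled by the smartmontools
daemon, smartd, through /etc/smartd.conf. A possible entry, assuming the
disk is /dev/hda and that these test times suit the machine:

```
# Monitor all attributes; run a short self-test every day between
# 2 and 3 am, and a long self-test on Saturdays between 3 and 4 am.
/dev/hda -a -s (S/../.././02|L/../../6/03)
```

The daemon has to be restarted (or told to reload its configuration)
before the new schedule takes effect.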

-- 
Cheers,
       Carlos Robinson

Carlos E. R.