On Thursday 2005-01-13 at 15:14 +0200, Hylton Conacher (ZR1HPC) wrote:
My understanding is that although SMART will check the physical disk, e2fsck will check that the data can be written to the physical HDD, and of course according to the fs. So even though I have SMART enabled, there might still be a case where data cannot be written to the HDD, resulting in a failed fsck on that partition at bootup.
fsck tests the partition logically, not physically. It can also run a bad-block check (for some filesystem types), but it is certainly not as complete in that respect as SMART.
Just to clarify, would a bad block be a physical defect or a logic error, i.e. the fs thinks the physical media is bad but it isn't? Does SMART technology take care of looking after the physical state of the disk, bad blocks included?
Ok, it goes like this:

1) There is a physical error in the media, meaning that what the kernel tries to write is not the same as what it reads back.
2) "Something" marks the block(s) containing that error as bad.

Note: the reiserfsck program since SuSE 9.? can also mark bad blocks. Version 8.2 could not.

By logical error I mean some kind of indexing error, data that is not where it should be, etc. A software error, like the kernel failing to write something to disk. These errors are detected and corrected only by fsck.
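To make the "write, then read back and compare" idea concrete, here is a minimal Python sketch. It does not touch a real disk: `FakeDisk` is a hypothetical in-memory stand-in invented for this illustration, and the way it corrupts data is an assumption, not a model of any real drive. The loop is only in the spirit of a destructive write-mode surface test.

```python
class FakeDisk:
    """Hypothetical in-memory stand-in for a disk (illustration only)."""

    def __init__(self, num_blocks, block_size=512, faulty=()):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.faulty = set(faulty)

    def write(self, index, data):
        if index in self.faulty:
            # Simulate a media defect: the first byte is stored corrupted.
            data = bytes([data[0] ^ 0xFF]) + data[1:]
        self.blocks[index] = data

    def read(self, index):
        return self.blocks[index]


def find_bad_blocks(disk, pattern=0xAA):
    """Write a test pattern to every block and report blocks that read back differently."""
    payload = bytes([pattern]) * disk.block_size
    bad = []
    for index in range(len(disk.blocks)):
        disk.write(index, payload)          # destructive: overwrites the block
        if disk.read(index) != payload:     # what was written is not what is read back
            bad.append(index)
    return bad


disk = FakeDisk(num_blocks=8, faulty={2, 5})
bad_blocks = find_bad_blocks(disk)
print(bad_blocks)
```

Note that such a scan only reports blocks that are bad at the moment it runs, and it says nothing about the logical consistency of the filesystem.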
If SMART doesn't check for bad blocks,
The smartctl program starts a self-test routine that resides in the drive's own firmware, and that does detect bad blocks (depending on the manufacturer). It does not repair them. However, a modern hard disk can remap bad blocks to spare sectors reserved by the manufacturer. This is transparent to the OS, but it is triggered only when writing to a sector that is found, at that moment, not to be reliably writable.
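As a rough illustration of how the remapping shows up from the OS side, the sketch below parses an attribute table in the layout printed by `smartctl -A` and pulls out two counters. The sample text is made up for this example (it is not output from a real drive), `parse_attributes` is a hypothetical helper, and which attributes a drive reports depends on the manufacturer.

```python
# Made-up sample in the column layout of `smartctl -A`; not real drive output.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       3
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
"""


def parse_attributes(text):
    """Return {attribute_name: raw_value} for rows with a plain integer raw value."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


attrs = parse_attributes(SAMPLE)
remapped = attrs.get("Reallocated_Sector_Ct", 0)   # sectors already remapped to spares
pending = attrs.get("Current_Pending_Sector", 0)   # sectors waiting for a write to be remapped
print(f"remapped={remapped} pending={pending}")
```

A non-zero and growing count is a hint to check the backups, not something fsck would ever notice.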
then in theory fsck should check for bad blocks
By default, it does not.
as logic would say writing data to those bad blocks will result in data loss? Bad-block checking can be enabled on an ext3 fs with e2fsck, but I wonder why the bootup fsck doesn't do a bad-block check?
Because there is no need, and because it is terribly time consuming, a matter of several hours. You are too paranoid about them, I think :-)
mmmm, Running the following: man fsck.ext3 brings up the e2fsck man page
Yes.
I would like to run the e2fsck command to prevent the failure of partition checking by fsck on bootup, as from reading the fsck man page it does not seem up to working on an ext3 fs.
Then just force a check during boot by creating the file "/forcefsck". An ext3 partition will be checked to the needed level, not more. Doing a bad-block check every time is overkill, and will not really protect your data.

That '/forcefsck' option is a little strong
Why? It just tells the '/etc/init.d/boot.localfs' that you want to check the filesystems regardless of whether it is needed or not. It doesn't do anything "drastic".
but see the next paragraph for my suggestion. Why have the 'bad block' option and why will it not protect my data? Surely it will make sure that data is not lost because the block has been marked as bad and therefore the data will be written to a good block?
It only detects which sectors are bad at that moment. What if the error develops later, while the system is running? In fact, the HD is more vulnerable while running, because the heads are not parked, but flying at a very small distance from the HD surface.
For a somewhat more complete check, boot from the rescue CD and test from there.

I was thinking more along the lines of possibly aliasing the boot fsck to e2fsck and having it run e2fsck each time the fsck is supposed to run on a partition, as set with the tune2fs command, i.e. every 3rd mount or 15 days etc.
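The "every 3rd mount or 15 days" schedule can be modelled in a few lines of Python. This is a simplified sketch of the idea only: `check_is_due` and its parameters are hypothetical names chosen for this illustration, and the real e2fsck decision takes more conditions into account.

```python
from datetime import datetime, timedelta


def check_is_due(mount_count, max_mount_count, last_check, interval, now):
    """Simplified model: a check is due after N mounts or after a time interval."""
    # A limit of zero (or less) is treated here as "this criterion is disabled".
    if max_mount_count > 0 and mount_count >= max_mount_count:
        return True
    if interval > timedelta(0) and now - last_check >= interval:
        return True
    return False


now = datetime(2005, 1, 13, 15, 14)
every_3rd_mount = 3
every_15_days = timedelta(days=15)

# Third mount since the last check: due because of the mount count.
due_by_mounts = check_is_due(3, every_3rd_mount, now - timedelta(days=2), every_15_days, now)
# Only one mount, but 16 days since the last check: due because of the interval.
due_by_time = check_is_due(1, every_3rd_mount, now - timedelta(days=16), every_15_days, now)
# One mount, two days: nothing to do.
not_due = check_is_due(1, every_3rd_mount, now - timedelta(days=2), every_15_days, now)
print(due_by_mounts, due_by_time, not_due)
```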
fsck calls e2fsck for you. Playing with that is dangerous, because at some point you may have a differently formatted partition and apply the wrong program. Look: SuSE people are quite expert and wise, and they have designed those scripts with a lot of thought and care. You really do not have to modify them.

If you are worried about bad blocks, do:

1) Configure smartd to run tests periodically in the background.
2) Keep your backups current.
3) If you really need it, use RAID setups.

-- 
Cheers,
Carlos Robinson