How to set a reiserfs partition to fsck occasionally?
While the new logging filesystems are a great improvement, my experience is that they can't survive forever in the real world without an occasional rebuild or fsck.
This list has had warnings from people burnt by reiserfs. I haven't (yet) lost any data but have had some scary times.
These weren't bugs in reiserfs (3.6) itself, as most of the instability was tracked to (very marginally) flaky RAM.
However, while the glitches were caused by corrupt RAM, they left me with faults in the filesystem, faults that persisted across reboots.
These included un-list-able and un-cat-able files, i.e. read or ask the size of that file and it's bye-bye to that terminal. It made the whole system unusable as processes "trod on the cracks" and hung. Backups? Hah, not with that file in the partition.
So I think a lot of bad press stems from the misconception that any filesystem can avoid bitrot forever without an fsck. But this is painful to do by hand: I have to boot a rescue system and run reiserfsck manually to do the root and system partitions.
How can I get back the old behaviour of an fsck happening during reboot every x reboots or y days?
Or, how can I trigger an "fsck reboot"?
TIA, michaelj
PS: I've just realized I can do it by adding an fsck to the linuxrc script of a cooked initrd image. That would give me an "fsck boot" option in grub. Comments?
-- Michael James michael.james@csiro.au System Administrator voice: 02 6246 5040 CSIRO Bioinformatics Facility fax: 02 6246 5166
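The linuxrc idea in the PS could look roughly like the fragment below. This is only a sketch under assumptions: the device name /dev/hda2 and the ROOT_DEV and FSCK_CMD variables are made up for illustration, the reiserfsck binary would have to be copied into the initrd, and the --check and --yes options should be confirmed against the installed reiserfsprogs version.

```shell
#!/bin/sh
# Sketch of a linuxrc fragment: check the root device before it is mounted.
# ROOT_DEV and FSCK_CMD are hypothetical names; adjust them for the real system.

ROOT_DEV=${ROOT_DEV:-/dev/hda2}
FSCK_CMD=${FSCK_CMD:-reiserfsck}

check_root() {
    echo "Checking $ROOT_DEV before mounting root..."
    # --check only reports problems; --yes skips the interactive confirmation.
    if "$FSCK_CMD" --check --yes "$ROOT_DEV"; then
        echo "No problems reported on $ROOT_DEV."
        return 0
    fi
    # Any non-zero exit is treated here as "needs attention".
    echo "Problems reported on $ROOT_DEV; repair it from a rescue system." >&2
    return 1
}
```

Calling check_root from linuxrc before the real root is mounted keeps the check on an unmounted filesystem. The sketch deliberately does not attempt repairs, since --rebuild-tree is better run by hand with a backup available.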
Hi Mike,
I believe when you use "shutdown -R now" you can force a fsck on reboot.
I'd be very interested in how you tracked your reiserfs problems to flaky RAM. I recently ran into some major problems when I replaced my DVD-ROM with a DVD burner on my system. I upgraded at the same time from SuSE 8.2 to 9.0. First of all, I ended up swapping hardware around a lot because my Abit BX2.0 motherboard didn't recognize the DVD burner unless it occupied the slave position on the first IDE channel. At some point in the process my trusted WD 12GB /dev/hda threw in the towel. Even a reiserfsck --rebuild-tree wouldn't rescue it and simply crashed somewhere in the middle. I ended up putting two new HDs into the system as well.
Anyway, to make a long story short, I ended up with corruption not only on the old HDs (the one that got toasted to begin with and an external SCSI drive) but also on the two new ones. Luckily all of those could be repaired by rebuilding the tree. The corruption may be tied to using the new DVD burner (Plextor 708A) to write to a CD-R. This in turn could be due to a maxed-out power supply or to the burner being faulty, which I am currently trying to find out. Of course I can't exclude flaky RAM either, although I haven't had any problems with it up till now.
So, please share your story and what you did to troubleshoot your problems.
Best regards, Alex.
On Fri, 30 Jan 2004, Michael James wrote:
How can I get back the old behaviour of an fsck happening during reboot every x reboots or y days?
Or, how can I trigger an "fsck reboot"?
-- Check the headers for your unsubscription address For additional commands send e-mail to suse-linux-e-help@suse.com Also check the archives at http://lists.suse.com Please read the FAQs: suse-linux-e-faq@suse.com
Adding a larger PSU (with a low noise level) is an extremely good idea. I recently added a 500W PSU to an old P2 system I use as a server and general-purpose PC, and boy did that help: all HD problems disappeared just like that (I have ReiserFS on all the HDs). And the system now actually shows all the RAM in the count on boot, so RAM is affected when your PSU is too small.
Hope you get the hint
Johan
On Friday 30 January 2004 16:00, Alex Angerhofer wrote:
It may be tied to trying to use the new DVD burner (Plextor 708A) to write to a CD-R when these corruptions occur. This in turn could be due to a maxed-out power supply or to the burner being faulty, which I am currently trying to find out.
On Saturday 31 January 2004 03:47, yep@osterbo-net.dk wrote:
RAM is affected when your PSU is too small. Hope you get the hint
I'm aware that a lot of things have to go right for a machine to work. Some other posters suggest using error-correcting (parity) RAM. Parity is a Good Thing: I remember a Sun that ran for a month patiently complaining about a dud stick, till we got the techs in.
On Friday 30 January 2004 16:00, Alex Angerhofer wrote:
I believe when you use "shutdown -R now" you can force a fsck on reboot.
I'll try it.
I'd be very interested in how you tracked your reiserfs problems to flakey ram.
Memtest86. First thing to try when a machine displays flakiness. Ran it all weekend and it reported 1 error. Started elimination: removed the second stick of RAM, ran it long enough, got an error. Swapped sticks, eventually caught an error. Always a different address, a different bit or 2. Put it down to a bad motherboard, or some subtle timing mismatch between the RAM (PC2700 DDR) and the motherboard (Asus P4 with SiS chipset).
Took both sticks to tech support and swapped them for another pair. Ran memtest all weekend, clean. (Yes, the week between those weekends was frustrating.) Machine rock solid again; mentally apologised to SuSE, Hans etc. Gratefully returned to productive work.
BTW, in those dark days I had a Grub entry to boot into memtest86, but lost it an upgrade ago. Could someone post the paragraph from /boot/grub/menu.lst that allows this? I have /boot/memtest.bin present but:
title memtest86
    image (hd0,0)/memtest.bin
doesn't do it?
-- Michael James michael.james@csiro.au System Administrator voice: 02 6246 5040 CSIRO Bioinformatics Facility fax: 02 6246 5166
On Monday 2004-02-02 at 12:40 +1100, Michael James wrote:
BTW, in those dark days I had a Grub entry to boot into memtest86, but lost it an upgrade ago. Could someone post the paragraph from /boot/grub/menu.lst that allows this? I have /boot/memtest.bin present but:
title memtest86
    image (hd0,0)/memtest.bin
doesn't do it?
I have (SuSE 8.2):
title memtest86
    kernel (hd1,1)/boot/memtest.bin
-- Cheers, Carlos Robinson
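For comparison, the two entries in this exchange differ in the keyword: the working one uses kernel, while image is LILO syntax, which may be why the first attempt fails. A small sketch that prints such a stanza and appends it to a menu file only once follows. The function names, the default device (hd0,0) and the default image path are assumptions for illustration. If /boot is its own partition, the path GRUB sees would be /memtest.bin rather than /boot/memtest.bin.

```shell
#!/bin/sh
# Sketch: emit a GRUB legacy stanza that boots memtest86.
# Device and image path are hypothetical defaults; adjust to the real layout.

memtest_stanza() {
    dev=${1:-"(hd0,0)"}            # GRUB device holding the image
    img=${2:-"/boot/memtest.bin"}  # path as seen from that device
    printf 'title memtest86\n'
    printf '    kernel %s%s\n' "$dev" "$img"
}

# Append the stanza to the given menu file unless a memtest title exists.
add_memtest_entry() {
    menu=$1
    shift
    if grep -q '^title[[:space:]]*memtest' "$menu" 2>/dev/null; then
        return 0
    fi
    { printf '\n'; memtest_stanza "$@"; } >> "$menu"
}
```

Usage might be: add_memtest_entry /boot/grub/menu.lst "(hd0,0)" /memtest.bin, run as root after backing up the menu file.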
On Friday 2004-01-30 at 18:32 +1100, Michael James wrote:
How can I get back the old behaviour of an fsck happening during reboot every x reboots or y days?
I think that still happens, but it is very fast. If there is something wrong on the root partition, and it is reiserfs, you have to check from a boot CD.
Or, how can I trigger an "fsck reboot"?
touch /forcefsck
PS: I've just realized I can do it by adding an fsck into the linuxrc script of a cooked initrd image. That would give me an "fsck boot" option in grub.
I prefer having a second system on another partition. It helps with many rescue problems.
-- Cheers, Carlos Robinson
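The touch /forcefsck suggestion above can be combined with a small counter to approximate the "every x reboots or y days" behaviour asked about. The following is a minimal sketch, not a tested recipe: the state directory, the thresholds and the schedule_fsck name are all hypothetical, it assumes the distribution's boot scripts honour /forcefsck for the filesystems in question, and it relies on date +%s, which is common but not strictly POSIX.

```shell
#!/bin/sh
# Sketch: request a boot-time fsck every MAX_BOOTS boots or MAX_DAYS days.
# Meant to be called once per boot, e.g. from a local boot script.
# All paths and thresholds are hypothetical defaults.

STATE_DIR=${STATE_DIR:-/var/lib/fsck-schedule}
FLAG_FILE=${FLAG_FILE:-/forcefsck}
MAX_BOOTS=${MAX_BOOTS:-20}
MAX_DAYS=${MAX_DAYS:-30}

schedule_fsck() {
    now=${1:-$(date +%s)}          # seconds since epoch; overridable for tests
    mkdir -p "$STATE_DIR" || return 1
    count_file=$STATE_DIR/boots
    stamp_file=$STATE_DIR/last_check

    # Missing state files mean "first run": zero boots, last check is now.
    count=$(cat "$count_file" 2>/dev/null || echo 0)
    last=$(cat "$stamp_file" 2>/dev/null || echo "$now")
    count=$((count + 1))
    age_days=$(( (now - last) / 86400 ))

    if [ "$count" -ge "$MAX_BOOTS" ] || [ "$age_days" -ge "$MAX_DAYS" ]; then
        touch "$FLAG_FILE"         # the check itself happens on the next boot
        count=0
        last=$now
    fi
    echo "$count" > "$count_file"
    echo "$last" > "$stamp_file"
}
```

For a one-off "fsck reboot", touching the flag file by hand and rebooting would be enough. The counter only matters for the periodic case.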
participants (4)
- Alex Angerhofer
- Carlos E. R.
- Michael James
- yep@osterbo-net.dk