Reiserfs corrupting on IDE HD with 9.3 (and not 9.2)
Hi All,
I've got a Reiserfs file system corruption problem happening here with 9.3 :-( I know it's related to 9.3 since 9.2 doesn't skip a beat on this hardware. To make sure, I reinstalled 9.2 and ran it again for three days without a symptom. However, after installing 9.3 (this is my 3rd time), within an hour of running YOU I start noticing little 'ticks' in the file system like this one I just found in /usr/bin:
drwxr-xr-x 3 root root 77872 2005-08-03 20:58 ./
drwxr-xr-x 12 root root 344 2005-08-03 17:08 ../
-rwxr-xr-x 1 root root 24916 2005-03-19 15:28 [*
-rwxr-xr-x 1 root root 8818 2005-03-22 06:12 3Ddiag*
<snip>
If it follows the same pattern this time around, the system won't be bootable in another hour or two and I'll need to reinstall 9.3 again. I'd appreciate some more enlightened ideas about the best approach for narrowing down the possibilities.
TIA & regards,
- Carl
On Wednesday 03 August 2005 10:02 pm, Carl E. Hartung - SuSE Mail List Account wrote:
If it follows the same pattern this time around, the system won't be bootable in another hour or two and I'll need to reinstall 9.3 again. I'd appreciate some more enlightened ideas about the best approach for narrowing down the possibilities.
Carl,
My answer was to get rid of the reiserfs and go to xfs, jfs or ext3. Since doing that I have had no more problems. On my various systems, which include a Dell Inspiron laptop, a Compaq laptop and 9 desktop type boxes with different mobos and drive types, my partition corruption has ceased.
While I'm not smart enough to figure out the cause of the problem, the solution is simple and it works.
Richard
Richard Atcheson wrote:
Carl, My answer was to get rid of the reiserfs and go to xfs, jfs or ext3. Since doing that I have had no more problems. On my various systems, which include a Dell Inspiron laptop, a Compaq laptop and 9 desktop type boxes with different mobos and drive types, my partition corruption has ceased.
While I'm not smart enough to figure out the cause of the problem the solution is simple and it works. Richard
Perhaps because the majority of systems are using reiserfs, it gets this sort of kneejerk response - proved false by a later update post.
I have had several "reiserfs" failures over many years; 100% of them have been caused by bad hardware: hard drives, IDE controllers, CPUs and memory. The latest was 3 weeks ago on my x86_64 laptop with a broken HD, the same problem seen with both reiserfs and ext3. Running machines here 24x7x365 1/4, the available IDE drives and other hardware WILL FAIL sooner or later, as the collected piles here will testify.
Reiserfs is in use with full confidence on 2x SuSE 9.3 x86, 1x SuSE 9.3 x86_64, 2x Mandriva LE 2005 and 1x gentoo x86, all boxes using the latest kernel.org kernels.
Regards
Sid.
--
Sid Boyce ... Hamradio License G3VBV, Keen licensed Private Pilot
Retired IBM Mainframes and Sun Servers Tech Support Specialist
Microsoft Windows Free Zone - Linux used for all Computing Tasks
Hi Carl,
Check that hard drive. I know you say it is OK on 9.2 and frells on 9.3, but hardware would be the first place I'd look: the Reiserfs problems I have had all related to dying hard drives that seem OK and then start giving errors.
Pete.
--
If Bill Gates had gotten LAID at High School do YOU think there would be a Microsoft ? Of course NOT ! You gotta spend a lot of time at your school Locker stuffing underware up your ass to think , I am going to take on the worlds Computer Industry
-------:heard on Cyber Radio.:-------
On Wednesday 03 August 2005 11:02 pm, Carl E. Hartung - SuSE Mail List Account wrote:
[...] I start noticing little 'ticks' in the file system like this one I just found in /usr/bin:
drwxr-xr-x 3 root root 77872 2005-08-03 20:58 ./
drwxr-xr-x 12 root root 344 2005-08-03 17:08 ../
-rwxr-xr-x 1 root root 24916 2005-03-19 15:28 [*
-rwxr-xr-x 1 root root 8818 2005-03-22 06:12 3Ddiag*
<snip>
Carl,
Like others who have replied, I have 9.3 installed on 2 systems (home and work) with no problems and I routinely run YOU as necessary. I don't think that Reiserfs is your problem, but I have seen similar problems in a system with a bad memory module. What I would do on your system is first run a memory diagnostic (from the CD or DVD). Others have suggested your hard drive, but I am not sure of that, since 9.2 should then have had some issues also. After you run the diags, then possibly reinstall 9.3 using ext3 (or another file system).
These are the same entries in my /usr/bin:
drwxr-xr-x 2 root root 62968 2005-07-28 16:14 .
drwxr-xr-x 12 root root 344 2005-05-31 14:01 ..
-rwxr-xr-x 1 root root 24916 2005-03-19 15:28 [
-rwxr-xr-x 1 root root 8818 2005-03-22 06:12 3Ddiag
--
Jerry Feldman <gaf@blu.org>
Boston Linux and Unix user group
http://www.blu.org
PGP key id:C5061EA9
PGP Key fingerprint:053C 73EC 3AC1 5C44 3E14 9245 FB00 3ED5 C506 1EA9
On Wed, 2005-08-03 at 22:02 -0500, Carl E. Hartung - SuSE Mail List Account wrote:
[...] I start noticing little 'ticks' in the file system like this one I just found in /usr/bin:
drwxr-xr-x 3 root root 77872 2005-08-03 20:58 ./
drwxr-xr-x 12 root root 344 2005-08-03 17:08 ../
-rwxr-xr-x 1 root root 24916 2005-03-19 15:28 [*
-rwxr-xr-x 1 root root 8818 2005-03-22 06:12 3Ddiag*
<snip>
'/usr/bin/[' is not a tick, it is a synonym for the 'test' program (or shell builtin). -K
[Kelly Burkhart]
On Wed, 2005-08-03 at 22:02 -0500, Carl E. Hartung - SuSE Mail List Account wrote:
[...] I start noticing little 'ticks' in the file system like this one I just found in /usr/bin:
-rwxr-xr-x 1 root root 24916 2005-03-19 15:28 [*
'/usr/bin/[' is not a tick, it is a synonym for the 'test' program (or shell builtin).
This one is OK, of course. But your message implies that you found many, this one among them. I'm a bit curious about the others.
--
François Pinard http://pinard.progiciels-bpi.ca
On Thursday 04 August 2005 9:29 am, Kelly Burkhart wrote:
On Wed, 2005-08-03 at 22:02 -0500, Carl E. Hartung - SuSE Mail List Account wrote:
[...] I start noticing little 'ticks' in the file system like this one I just found in /usr/bin:
-rwxr-xr-x 1 root root 24916 2005-03-19 15:28 [*
'/usr/bin/[' is not a tick, it is a synonym for the 'test' program (or shell builtin).
Historically, /usr/bin/test and /usr/bin/[ were the same code hard linked together (or [ was symlinked to test). These are standard Unix/Linux commands, not shell builtins, although the shells also have them built in. However, I have found that on SuSE 9.3, SLES9 and RHEL 4, /usr/bin/[ and /usr/bin/test are now separately compiled.
--
Jerry Feldman <gaf@blu.org>
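For anyone who wants to check which of the cases described above applies on their own box, here is a minimal sketch in Python. The function name describe_relationship is made up for illustration; it only relies on the standard os module, and it assumes both paths exist. It reports whether two paths are joined by a symlink, are hard links to the same inode, or are separate files.

```python
import os


def describe_relationship(path_a: str, path_b: str) -> str:
    """Classify two existing paths as 'symlink', 'hard link' or 'separate files'.

    Hypothetical helper for illustration only; not part of any distribution.
    """
    # Check for a symlink first: os.path.samefile() follows symlinks, so a
    # symlinked pair would otherwise look identical to a hard-linked pair.
    if os.path.islink(path_a) or os.path.islink(path_b):
        if os.path.samefile(path_a, path_b):
            return "symlink"

    # Hard links share the same device and inode number.
    st_a = os.stat(path_a)
    st_b = os.stat(path_b)
    if (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino):
        return "hard link"

    return "separate files"


if __name__ == "__main__":
    bracket, test = "/usr/bin/[", "/usr/bin/test"
    if os.path.exists(bracket) and os.path.exists(test):
        print(describe_relationship(bracket, test))
```

A result of "separate files" would match the separately compiled binaries reported above for SuSE 9.3; it says nothing about file system health either way.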
Carl,
I also had ReiserFS corruption problems with 9.3 that I had not been having with 9.2 (both the x86_64 versions). It caused all sorts of strange problems for me while I was installing 9.3 from ftp. I cannot say what the cause or the fix for this problem was, but I have not seen these same problems since I updated the kernel with YOU. I really should boot with a Rescue CD and fsck everything to make sure there are no new corruptions, but I have not noticed any "mysterious problems" since the kernel update. If you have not done that yet, I think it would be worth trying.
"Carl E. Hartung - SuSE Mail List Account" <suselinux@cehartung.com> wrote:
[...] after installing 9.3 (this is my 3rd time) within an hour of running YOU I start noticing little 'ticks' in the file system [...]
Hi All,
Sorry for the delay in closing this thread out and many thanks to everyone for their kind and helpful responses. I /did/ finally resolve this and am happily running 9.3 now. However, I don't understand the exact sequence of events or the precise cause of my difficulties. There were several contributing factors:
- crappy IDE cables not able to support above UDMA33 although the host controller and drives are UDMA100. I don't know if the installer is sophisticated enough to recognize and adjust for potential problems in this area.
- the host controller is reported during boot as "not 100% native mode" and is also causing barrier-based syncs to fail, which I understand is not good when you're running file systems like ext3 and Reiserfs. Also, it seems like the module used for this controller has problems handling adjacent ext3 and Reiserfs partitions on the same physical drive (see my closing notes)
- I wasn't having any file system problems, so I didn't 'wipe' the 9.2 installation off the drive and verify the partitions before installing 9.3. As it turned out, the adjacent ext3 partition was corrupted /first/, leading to read/write/seek errors and retries, which I experienced as "frozen" windows. In fact, every program attached to the entire /drive/ was waiting for it to regain its senses. That, I think, is how I ultimately corrupted an otherwise healthy Reiserfs file system: by repeatedly force-closing properly waiting programs.
- I decided to change my partitioning scheme *after* the first failed install and (shiver) used my paid-for Partition Magic on that other OS to implement it instead of booting to rescue mode and using Linux tools. (I know... lazy....) Specifically, I used PM to delete '/' (Reiserfs) and '/home' (ext3) and divide the space differently. Then, during installation, I selected to format '/' as Reiserfs but only *mounted* the PM-created ext3 partition as '/home'. Maybe PM didn't prepare the ext3 partition correctly in the first place?
- finally, towards the end of the next-to-last installation attempt, as SuSE was very busy building and writing configuration files, the power company shut off power to fix equipment damaged the previous night by a lightning strike (sigh). In my then stubbled-chin, weary-eyed and very cranky state, I threw caution to the wind and said "screw this!... delete everything on the %#@@!$#%#@ drive and give me Reiserfs all the way, baby!"
Guess what? No more barrier-based sync failures being reported against the ext3 partition (it doesn't exist anymore), no more "frozen" windows with permanently lit HD indicators and 9.3 seems to be good to go.
So, there you have it. If you think you can figure out from this mess exactly why the installation went as badly as it did, I'm open to theories. Meanwhile, it's on to other issues. And thanks again for all of your replies.
Regards,
- Carl
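On the cable point in the first bullet: the usable Ultra DMA mode is bounded by the weakest of controller, drive and cable, and modes above UDMA33 (mode 2) need an 80-conductor cable. The Python sketch below illustrates that arithmetic only. The function name is made up, the rates are the nominal ATA figures, and it assumes the cables in question were the 40-conductor type, which the post does not actually state.

```python
# Nominal peak transfer rates in MB/s for Ultra DMA modes 0 through 6.
UDMA_RATE_MBPS = {0: 16.7, 1: 25.0, 2: 33.3, 3: 44.4, 4: 66.7, 5: 100.0, 6: 133.0}

# A 40-conductor cable is only specified up to mode 2 (UDMA33).
MAX_MODE_40_CONDUCTOR = 2


def effective_udma_mode(controller_mode: int, drive_mode: int,
                        cable_conductors: int) -> int:
    """Return the highest UDMA mode all three components can sustain.

    Hypothetical helper for illustration; real drivers negotiate this
    themselves and may also fall back after transfer errors.
    """
    if cable_conductors not in (40, 80):
        raise ValueError("cable_conductors must be 40 or 80")
    mode = min(controller_mode, drive_mode)
    if cable_conductors == 40:
        mode = min(mode, MAX_MODE_40_CONDUCTOR)
    return mode


if __name__ == "__main__":
    # UDMA100 controller and drive (mode 5) on a 40-conductor cable.
    mode = effective_udma_mode(5, 5, 40)
    print(mode, UDMA_RATE_MBPS[mode])
```

If a driver runs the link faster than the cable can carry, the usual symptom is transfer errors and retries rather than anything specific to one file system.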
No explanations possible at this early hour of the morning, but as I stated, I knew it had nothing to do with reiserfs; it's a solid performer.
Regards
Sid.
--
Sid Boyce ... Hamradio License G3VBV, Keen licensed Private Pilot
Retired IBM/Amdahl Mainframes and Sun/Fujitsu Servers Tech Support Specialist
Microsoft Windows Free Zone - Linux used for all Computing Tasks
participants (9)
- Carl E. Hartung - SuSE Mail List Account
- Carl Hartung
- François Pinard
- Jerry Feldman
- Kelly Burkhart
- Peter Nikolic
- Richard Atcheson
- Sid Boyce
- Wendell Sexson