On 3/13/2009 at 15:50, Jeff Mahoney
Please file a bug, but note that our reiserfs person is on vacation for
a while, so it will be a bit before he can get to stuff like this.
In the meantime, please try the Kernel-of-the-day for SLE11; it has
some reiserfs fixes in it that missed the last 11.1 update kernel.
This isn't a reiserfs bug. This is reiserfs correctly handling a journal
write failure. The log looks like the disk went out to lunch and then
was reset, dropping existing requests on the floor and returning I/O
errors. This is typically bad hardware, but Dominique followed up saying
that Red Hat is tracking a bug in the marv driver.
Indeed, it looked like a hardware failure... but having two disks (ok: on the same
controller, replaced one Samsung 250GB with another) being broken
sounded a bit awkward.
Just a short recap:
The system was running fine for a long time on OSS 10.2 (two disks: one data, one system).
Installed 11.1 on the system disk (dropped 10.2)... formatted the disk with ext3. The system
frequently switched the FS to r/o... two days of uptime was the maximum.
Re-Installed 11.1, used reiserfs instead of EXT (ext linked all the stuff to lost+found..
not that I would have liked that.. so another try, another
Same behaviour: the root FS goes R/O once in a while (2 days still seemed to be the
max). /var is on another partition so I can get some usable
logs (as opposed to the previous install).
Swapped hard disks.. used the previous data disk, re-installed 11.1 on it... with
reiserfs... no change at all.. so either the controller is broken
or the OS.
After a lot of reading, I found similar issues reported, but only from people running
advanced RAID systems (the marv mentioned earlier). Not the case
here. I don't run RAID (only LVM.. but I have only two disks in the system).
On Monday, 9.3.2009 I installed the Kernel:HEAD on this box (2.6.29-rc7). The machine now
has an uptime of 3 days 20 hours. Looks like a new best
while running on openSUSE 11.1 for this machine.
I also gave it some load, like rebuilding some packages (it's an OBS instance after
all)... still, the FS seems to do just fine.
The most 'scary' messages in dmesg so far would be:
JBD: barrier-based sync failed on dm-3 - disabling
otherwise I don't see anything special in the dmesg output (after almost 4 days of
So whatever it was in the openSUSE stock kernel seems to no longer be a problem in the
2.6.29 kernel. It is not very helpful to know that the shipped kernel
is possibly guilty of some crashes without being able to tell exactly why it is,