Mailinglist Archive: opensuse (3378 mails)

Serious problems with recent SuSE kernels (under 7.3) and thoughts on reiserfs in combination with quota
  • From: Eric Maryniak <e.maryniak@xxxxxxxxx>
  • Date: Thu, 18 Apr 2002 15:02:47 +0200
  • Message-id: <20020418150247.A25491@xxxxxxxxxx>
Hello fellow suse geeko's,

Although I'm by and large quite pleased with SuSE after
jumping from 6.3 to 7.3 two months ago, a warning about their
kernels is called for, imho.
If, like me, you manage a heavily loaded server (www, mysql
and, forced by budget, internal simulations too) and are
therefore only semi-daring, i.e. you don't want to use the
latest bleeding-edge kernel but do follow SuSE's new kernels
in
ftp://ftp.gwdg.de/linux/suse/7.3_update/kernel

to feel 'safe' that your system is optimally performing and
somewhat hacker-proof (kernel-wise), then a word of warning on
kernel ftp://ftp.gwdg.de/linux/suse/7.3_update/kernel/2.4.16-20020325
is in order. Indeed SuSE now mentions "DONTUSE.README" and yes
I know it is not in .../certfied_by_suse.
However, with the certified kernel 2.4.10 my SMP (#2) server
did not start at all, and I had to fall back to 2.2.19 until I
upgraded (successfully) to 2.4.16-20011220. During the initial
install with 2.4.10 I kept getting "yast_inst_setup @ line 137" errors...

The problem with 2.4.16-20020325 was quite serious though: after
the initial install with (note: modutils and reiserfs are included
in recent kernels but not in 2.4.16-20011220):

cp -pv /boot/vmlinuz /boot/vmlinuz.old
cp -pv /boot/initrd /boot/initrd.old

# NOTE: BAD KERNEL!
rpm -Uvh --nodeps --force \
k_smp-2.4.16-37.i386.rpm \
kernel-source-2.4.16.SuSE-31.i386.rpm

mk_initrd
lilo    # with labels for both vmlinuz.old and vmlinuz in lilo.conf

the system ran for a couple of days, then crashed: an 'Aiee'
kernel dump and /var/log/ files on reiserfs that had lots of
^@ null bytes :-((
OK, I thought, these things happen (quantum singularity? solar flares?
perhaps I should install those Heisenberg compensators after all).
But then it happened again and moreover: I could not reboot the
machine: it kept hanging at rcquota.
I managed to boot with the Rescue CD (#1) and by mounting
the reiserfs volumes manually and installing a new kernel.
(See below for my probably non-optimal howto/tip).

Right now, I'm running 2.4.16-20020416 on both my desktop
and the server. Successfully so far. But there is
another issue when rebooting a machine with reiserfs AND quota:
after a crash, reiserfs is not much help, because although
the log replay is fast, the quota check takes veeeery long. Imho, quota
should be better integrated into reiserfs, but maybe this is not
technically possible (well, I think it is, but it's probably
very difficult). As it stands, the benefits of reiserfs (with respect
to fast reboots) are lost to the long quota check on disks with lots
of files.
Also, reiserfs managed to really screw up some /var/log/ files,
namely messages and warn: messages had some segments with ^@^@
(NULL bytes) AND parts of my Apache /var/log/httpd/access_log!!
Of course reiserfs cannot be blamed for non-closure in case of
a kernel panic, but corrupting it with parts of other files is
more serious imo. I'm also concerned about MySQL's files (but
rely on the fact (?) that MySQL does consistency checking of
its files when loading).
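
To see which log files picked up ^@ bytes after a crash, counting the NUL bytes per file is a quick first check. This is only a sketch: the function name count_nul_bytes is made up for illustration, and it says nothing about whether a file also contains fragments of other files.

```shell
# Print "<count> <file>" for each readable file given as argument,
# where <count> is the number of NUL (^@) bytes in that file.
count_nul_bytes() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        # tr -cd deletes every byte except NUL; wc -c counts what is left
        n=$(tr -cd '\000' < "$f" | wc -c | tr -d ' ')
        echo "$n $f"
    done
}

# Example with the files named in this mail:
# count_nul_bytes /var/log/messages /var/log/warn /var/log/httpd/access_log
```

A count of 0 only means there are no NUL bytes; it does not prove the file is otherwise intact.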

My points/suggestions:

o SuSE should put out clearer READMEs/Announcements on
the what and why of kernels in x.y_update/kernel ftp dirs.
A short summary of fixed bugs and features and intended users
would also be nice.
o reiserfs should have quota support incorporated better so
that a quotacheck after a crash is not necessary, i.e. implied
by reiserfs's transaction log replay. I guess this means
writing out quota (and perhaps other) file meta-info in the
transaction logs; perhaps make this a settable feature
when mkreiserfs'ing the filesystem. I don't mind giving up
some performance for a fast-rebooting server with an ok fs.
o reiserfs should not corrupt a file's contents with those from
other files after a crash (having trailing ^@ nulls is ok).

Some tips/howto's I found out when upgrading from 6.3 to 7.3:

o Recovering an unbootable system and installing a new kernel

1. Boot with the Rescue kernel (CD #1), make your BIOS CD-ROM
bootable first if necessary.

2. Mount your system's main filesystem (root) on /mnt, like:

mount -t reiserfs /dev/sda3 /mnt

3. Make /mnt your root point in one console window:

chroot /mnt

In this chroot-ed env, install a new kernel with the 'cp', 'rpm'
and 'mk_initrd' commands mentioned above, BUT DO NOT RUN lilo YET!

4. In another window, _not_ chroot-ed, mv the Rescue kernel's /boot
and create a new boot mount point:

mv /boot /boot-rescue
mkdir /boot

Mount your system's boot filesystem (usually ext2) on /boot:

mount -t ext2 /dev/sda1 /boot

and move the just created files from the chroot-ed /boot to this
'real' boot point:

mv -v /mnt/boot/* /boot/.

Now run lilo, using your system's lilo.conf after editing it,
e.g. to add a label for the previous kernel (which in the
case described in this mail is a malfunctioning kernel, so vmlinuz.old
should point to the last kernel that worked ok):

lilo -C /mnt/etc/lilo.conf

Lilo will have updated /boot (in the non-chroot-ed env).
Reboot now.
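
Before running lilo in step 4, it may be worth checking that every kernel and initrd named in lilo.conf really exists on the freshly mounted /boot. The helper below is a sketch only: check_lilo_files is a made-up name, and it assumes the plain "image=" and "initrd=" lines of a typical lilo.conf, with paths that contain no spaces.

```shell
# Check that every image= and initrd= path in a lilo.conf exists.
# $1 = path to lilo.conf
# $2 = optional prefix put in front of each path (empty by default)
# Prints "missing: <path>" for each absent file; returns 1 if any is missing.
check_lilo_files() {
    conf=$1
    prefix=${2:-}
    missing=0
    for p in $(sed -n \
        -e 's/^[[:space:]]*image[[:space:]]*=[[:space:]]*//p' \
        -e 's/^[[:space:]]*initrd[[:space:]]*=[[:space:]]*//p' "$conf"); do
        if [ ! -f "$prefix$p" ]; then
            echo "missing: $prefix$p"
            missing=1
        fi
    done
    return $missing
}

# Example for the non-chroot-ed window of step 4:
# check_lilo_files /mnt/etc/lilo.conf && lilo -t -v -C /mnt/etc/lilo.conf
```

The -t option in the example is lilo's test mode, which does not write a new boot sector or map file. Run lilo once more without -t when the output looks sane.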

o Fixing up a reiserfs to support >2 GB files
The trick is to convert it from 7.3's 3.5 format to 3.6.

1. Re-mount it with:

mount -t reiserfs -o conv /dev/sda3 /mnt

If it is your root filesystem, boot with the Rescue kernel
(see previous tip).

2. Or, for a new filesystem, create it as 3.6 from the start
(careful: mkreiserfs erases everything on the device):

mkreiserfs -h r5 -v 3.6 /dev/sdb1
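
A quick way to check whether a mounted filesystem accepts files over 2 GB is to create a sparse file that ends just past that size. A minimal sketch: the function name can_hold_file is made up here, and it assumes that your dd itself can handle large files.

```shell
# Try to create a sparse file whose end lies just past the given size.
# $1 = directory on the filesystem to test
# $2 = offset in MB at which to write 1 MB (default 2048, i.e. 2 GB)
# Returns 0 if the file could be written, 1 otherwise.
can_hold_file() {
    dir=$1
    mb=${2:-2048}
    f="$dir/.bigfile-test.$$"
    # seek= skips ahead without writing, so only about 1 MB of disk is used
    if dd if=/dev/zero of="$f" bs=1048576 seek="$mb" count=1 2>/dev/null; then
        rm -f "$f"
        return 0
    fi
    rm -f "$f"
    return 1
}

# Example:
# can_hold_file /disk2 && echo "files over 2 GB are ok on /disk2"
```

The test file is removed again in both cases, so nothing is left behind on the filesystem.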

o Adding quota support to reiserfs (SuSE 7.3)
Important: before doing anything, be sure to install (rpm -Uvh)
the necessary fixes first!! Such as: glibc.rpm, glibc-*.rpm,
quota.rpm (!!!!), shadow.rpm etc. Use YOU or YaST or plain rpm
on downloaded .rpm's from SuSE's Security page. Then:

. Verify START_QUOTA="yes" in /etc/rc.config and that you have booted with
a quota enabled kernel:

grep -i quota /var/log/boot.msg

<5>VFS: Diskquotas version dquot_6.5.0 initialized

. Edit /etc/fstab:

/dev/sda3 / reiserfs defaults,usrquota,grpquota 1 1
/dev/sdb1 /disk2 reiserfs defaults,usrquota,grpquota 1 2

If /etc/fstab is edited, a reboot is necessary, because the filesystem(s)
must be mounted with the quota options.
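
To double-check the edit, the reiserfs entries that still lack one of the two quota options can be listed with awk. This is a sketch under the assumption of a standard six-column fstab; fstab_missing_quota is a made-up name.

```shell
# Print the mount point of every reiserfs entry in an fstab that lacks
# usrquota or grpquota. $1 = fstab path (default /etc/fstab).
fstab_missing_quota() {
    awk '$1 !~ /^#/ && $3 == "reiserfs" {
        n = split($4, opt, ",")
        u = 0; g = 0
        for (i = 1; i <= n; i++) {
            if (opt[i] == "usrquota") u = 1
            if (opt[i] == "grpquota") g = 1
        }
        if (!u || !g) print $2
    }' "${1:-/etc/fstab}"
}

# Example:
# fstab_missing_quota            # no output means all reiserfs entries are set
```

After the reboot, "mount | grep quota" should show the same options on the mounted filesystems.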

. Initialize quota support (first time only, single user mode necessary).
Be sure that quota version 3.02 (rpm -q quota: quota-3.02-38) or higher
is installed:

init 1
quotacheck -avug

Reboot. If START_QUOTA="yes" in /etc/rc.config is set, then quota should
be activated and you should _not_ have to set quota on with:

quotaon -avug (should not be necessary after a reboot)

Verify quota settings:

repquota -avug

That's it -- maybe these tips are of help to people (use at your own risk,
though ;-) and if I'm in error somewhere, please send corrections to this
list.

Bye-bye,

Eric
--
Eric Maryniak <e.maryniak@xxxxxxxxx>
WWW homepage: http://pobox.com/~e.maryniak/
Mobile phone: +31 6 52047532, or (06) 520 475 32 in NL.

How does a Heisenberg compensator work?
Oh, very well, thank you.
