Re: [opensuse-kernel] 2.6.24.1-35.1 udev crashes in initrd

3 Mar 2008

      On Monday, 25 February 2008, Hans-Peter Jansen wrote:
...
Since the users also complain about 1-10 second hangs in a terminal-based
order management system, I thought it would be a good idea to try to
move the kernel to 2.6.24.1 with all the fancy (IO) scheduling and
engaging BKL, etc. (I just rebuilt
Kernel:/HEAD/openSUSE_Factory/kernel-default-2.6.24.1-35.1 with rpmbuild
on that system.)
The first tries consistently resulted in an Oops during initrd, similar to:
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000040
printing eip: c011f96c *pde = 00000000
Oops: 0000 [#1] SMP
last sysfs file: /block/sde/sde1/dev
Modules linked in: sata_sil24 libata 3w_9xxx sd_mod scsi_mod
Pid: 538, comm: udev Not tainted (2.6.24.1-35.1-default #1)
EIP: 0060:[<c011f96c>] EFLAGS: 00010046 CPU: 0
EIP is at pick_next_task_fair+0x15/0x23
EAX: 00000000 EBX: f75d11f0 ECX: c202cdd0 EDX: 00000000
ESI: 00000000 EDI: 00000001 EBP: f7481f08 ESP: f7481f08
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process udev (pid: 538, ti=f7480000 task=f75d11f0 task.ti=f7480000)
Stack: f7481f2c c02eaeea c202a1cc 00000000 c202cd80 f75d1358 f7481f44 00000001
       00000001 f7481f9c c02eb95e 00000001 00000000 00000000 c013af1e f7ffc944
       00000000 00000000 73e8b1d5 00000005 c013aeba c202a1cc 00000001 c02eb953
Call Trace:
 [<c02eaeea>] __sched_text_start+0x11a/0x379
 [<c02eb95e>] do_nanosleep+0x3c/0x67
 =======================
Code: 39 c8 73 0c 5e 89 f8 5b 5e 5f 5d e9 9b f6 ff ff 5b 5b 5e 5f 5d c3 55 83 c0 34 31 d2 83 78 08 00 89 e5 74 11 e8 15 fe ff ff 89 c2 <8b> 40 40 85 c0 75 f2 83 ea 30 5d 89 d0 c3 55 89 e5 53 89 d3 83
EIP: [<c011f96c>] pick_next_task_fair+0x15/0x23 SS:ESP 0068:f7481f08
---[ end trace 18a67066b954c85e ]---
As it stands, it crashes consistently in pick_next_task_fair, even with an
initrd within constraints.
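As a cross-check, the faulting instruction can be read off the Code: line,
where the byte in angle brackets sits at EIP. The following is only a
minimal sketch in Python: the helper names (faulting_bytes,
decode_mov_disp8) are made up for illustration, and the decoder handles
just the one x86 encoding seen here (opcode 8b with an 8-bit displacement),
not the general instruction set.

```python
# "Code:" bytes copied from the Oops; <..> marks the byte at EIP.
CODE = (
    "39 c8 73 0c 5e 89 f8 5b 5e 5f 5d e9 9b f6 ff ff 5b 5b 5e 5f 5d c3 "
    "55 83 c0 34 31 d2 83 78 08 00 89 e5 74 11 e8 15 "
    "fe ff ff 89 c2 <8b> 40 40 85 c0 75 f2 83 ea 30 5d 89 d0 c3 55 89 e5 53 89 "
    "d3 83"
)
EAX = 0x00000000  # register value taken from the Oops

# 32-bit register names in x86 ModRM encoding order.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_bytes(code):
    """Return the code bytes starting at the <..>-marked position."""
    tokens = code.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:]]


def decode_mov_disp8(insn):
    """Decode only `8b /r` with mod=01, i.e. mov disp8(%base),%dst."""
    opcode, modrm, disp = insn[0], insn[1], insn[2]
    # rm == 4 would mean a SIB byte follows, which this sketch does not handle.
    if opcode != 0x8B or modrm >> 6 != 0b01 or modrm & 7 == 4:
        raise ValueError("not a simple mov disp8(%reg),%reg")
    dst = REGS32[(modrm >> 3) & 7]
    base = REGS32[modrm & 7]
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return dst, base, disp


dst, base, disp = decode_mov_disp8(faulting_bytes(CODE))
fault_addr = (EAX + disp) & 0xFFFFFFFF
print(f"mov {disp:#x}(%{base}),%{dst} -> fault at {fault_addr:08x}")
# prints: mov 0x40(%eax),%eax -> fault at 00000040
```

With EAX being 0, the load from offset 0x40 matches the reported fault
address 00000040, so the code apparently follows a NULL pointer returned
just before (the preceding e8 .. 89 c2 is a call followed by mov %eax,%edx).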
Today I noticed Greg's 2.6.24.3 announcement, with two hrtimer-related
fixes, which could fit the picture.
I can confirm that 2.6.24.3 overcomes the reported initrd problem (even
using the native mkinitrd/udev setup from 9.3). With a few kernel config
touches (NO_HZ disabled, switched to HZ_1000) and a few NFS-related package
rebuilds from Factory, it's finally running in production now.
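
For anyone who wants to check a kernel .config against those touches, here
is a small sketch in Python. It assumes the usual option names CONFIG_NO_HZ,
CONFIG_HZ_1000 and CONFIG_HZ; the SAMPLE text is a made-up fragment for
illustration, not the actual config of the production box, and parse_kconfig
is a hypothetical helper.

```python
def parse_kconfig(text):
    """Parse .config-style text into {option: value}; unset options map to 'n'."""
    opts = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # "# CONFIG_FOO is not set" -> CONFIG_FOO disabled
            opts[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            opts[key] = value
    return opts


# Illustrative fragment only (an assumption, not copied from a real system).
SAMPLE = """\
# CONFIG_NO_HZ is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
"""

# The settings described above: tickless mode off, 1000 Hz timer tick.
WANTED = {"CONFIG_NO_HZ": "n", "CONFIG_HZ_1000": "y", "CONFIG_HZ": "1000"}

opts = parse_kconfig(SAMPLE)
# Options absent from the file are treated as disabled.
mismatches = {k: opts.get(k, "n") for k, v in WANTED.items()
              if opts.get(k, "n") != v}
print(mismatches or "config matches")
```

To check a real system, replace SAMPLE with the contents of the config file
of the kernel in question.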

If it survives today, I will feel much better. Hopefully the latency
problems have diminished, but I reported some not-so-funny-looking numbers
from a different setup to LKML, gathered with latencytop (which hopefully
didn't induce some Heisenberg uncertainty problem).

Should I report such problems here, too, given that 2.6.24.3 isn't easily
available for SUSE setups ATM?

OTOH, I still noticed some problems with that kernel on openSUSE 10.2
(probably related to network interface renaming?), which resulted in:
 - a hard freeze immediately after initializing lo in runlevel 3 (forces a
   manual reset)
 - a failing rename, which left the major device with some obscure ethxx(?)
   name

Pete