Marcus Meissner wrote
So far we have not heard about it.
It might be related to root-over-NFS, because I can reliably crash the NFS server by rebooting our root-over-NFS clients. Rebooting other clients that just mount things like /home does not crash the server. Maybe it's some special kind of locking or something similar that occurs when things like /var are on NFS...
"2.6.16-10suse-bio-smp", you even recompiled your kernel, right?
Yes, that's because we use the same kernel for the server and the diskless clients and compile the network drivers into the kernel, so that the diskless clients can do root-over-NFS (I guess it should work with an initrd, too, but we haven't tried that so far).
Does this happen when you do "rcnfsserver stop"?
No, it happens just after booting a client, though not every time: I repeatedly rebooted three clients in parallel, and after about 2-3 reboots the server crashes. This now also happens when server and clients are both running the .54 kernel, so it's not just an upgrade issue from .53 to .54. I couldn't get further logs because the crashes are now so hard that nothing is written to the log anymore. I'm trying to set up syslog via ttyS0 to capture them. In case it helps in the meantime, here is the ksymoops output for the first crash, with the fglrx-nvidia-tainted kernel. Maybe someone can get some information from this.

cu,
Frank

ksymoops 2.4.11 on x86_64 2.6.16-10suse-bio-smp. Options used
-V (default)
-k /proc/kallsyms (default)
-l /proc/modules (default)
-o /lib/modules/2.6.16-10suse-bio-smp/ (default)
-m /boot/System.map-2.6.16-10suse-bio-smp (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Warning (read_ksyms): no kernel symbols in ksyms, is /proc/kallsyms a valid ksyms file?
No modules in ksyms, skipping objects
No ksyms, skipping lsmod
Dec 4 14:26:35 backus kernel: CPU 0
Dec 4 14:26:35 backus kernel: Pid: 6403, comm: lockd Tainted: G U 2.6.16-10suse-bio-smp #1
Dec 4 14:26:35 backus kernel: RIP: 0010:[<ffffffff8019414e>] <ffffffff8019414e>{__locks_delete_block+43}
Using defaults from ksymoops -t elf64-x86-64 -a i386:x86-64
Dec 4 14:26:35 backus kernel: RSP: 0018:ffff810116fdbe00 EFLAGS: 00010286
Dec 4 14:26:35 backus kernel: RAX: ffff81013cfab508 RBX: ffff81013cfab500 RCX: ffffffff80421780
Dec 4 14:26:35 backus kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff81013cfab500
Dec 4 14:26:35 backus kernel: RBP: ffff81011bfe1810 R08: ffff810116fbb054 R09: 0000000000000005
Dec 4 14:26:35 backus kernel: R10: 0000000000000000 R11: ffff81011a893800 R12: ffff810149d5f860
Dec 4 14:26:35 backus kernel: R13: 0000000000000001 R14: ffff8101168a54f0 R15: 0000000000000000
Dec 4 14:26:35 backus kernel: FS: 00002b65f94dc6d0(0000) GS:ffffffff804a9000(0000) knlGS:00000000eb99fba0
Dec 4 14:26:35 backus kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 4 14:26:35 backus kernel: CR2: 0000000000000000 CR3: 000000021b5e0000 CR4: 00000000000006e0
Dec 4 14:26:35 backus kernel: Stack: ffffffff801947c4 ffff810149d5f860 ffff81011bfe1810 ffff8101199a08d0
Dec 4 14:26:35 backus kernel: ffffffff8019492b ffff81011bfe1810 ffffffff80194d46 ffff81011bfe1310
Dec 4 14:26:35 backus kernel: ffff8101199a0828 ffff8100d24093c0
Dec 4 14:26:35 backus kernel: Call Trace: <ffffffff801947c4>{locks_wake_up_blocks+27}
Dec 4 14:26:35 backus kernel: <ffffffff8019492b>{locks_delete_lock+126} <ffffffff80194d46>{__posix_lock_file+398}
Dec 4 14:26:35 backus kernel: <ffffffff80233f4f>{nlmsvc_unlock+125} <ffffffff802382c5>{nlm4svc_proc_unlock+111}
Dec 4 14:26:35 backus kernel: <ffffffff8037298c>{svc_process+838} <ffffffff8023365a>{lockd+0}
Dec 4 14:26:35 backus kernel: <ffffffff802337f9>{lockd+415} <ffffffff8010bb8a>{child_rip+8}
Dec 4 14:26:35 backus kernel: <ffffffff8023365a>{lockd+0} <ffffffff8023365a>{lockd+0}
Dec 4 14:26:35 backus kernel: <ffffffff8010bb82>{child_rip+0}
Dec 4 14:26:35 backus kernel: Code: 48 89 0a 48 89 40 08 48 89 47 08 48 c7 07 00 00 00 00 c3 48
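Since ksymoops could not read any symbols here, it may help to pull the entries straight out of the raw log. The entries in the kernel log have the form <address>{symbol+offset}. Judging from the RIP line (+43) and the ksymoops RIP line (+2b), the raw log seems to print the offset in decimal while ksymoops prints it in hex. The following is only a minimal sketch under that assumption; parse_entries and the sample string are illustrative names, not part of ksymoops or any kernel tool.

```python
import re

# Matches entries such as <ffffffff801947c4>{locks_wake_up_blocks+27}.
# Assumption: the offset after '+' is decimal in the raw kernel log.
ENTRY = re.compile(r"<([0-9a-f]{16})>\{(\w+)\+(\d+)\}")

def parse_entries(text):
    """Return a list of (address, symbol, offset) tuples found in text."""
    return [(int(addr, 16), sym, int(off)) for addr, sym, off in ENTRY.findall(text)]

# A few entries copied from the log above, joined into one string.
sample = (
    "RIP: 0010:[<ffffffff8019414e>] <ffffffff8019414e>{__locks_delete_block+43} "
    "Call Trace: <ffffffff801947c4>{locks_wake_up_blocks+27} "
    "<ffffffff8019492b>{locks_delete_lock+126}"
)

for addr, sym, off in parse_entries(sample):
    # Print the offset in hex, the way ksymoops renders it.
    print(f"{addr:016x} {sym}+{off:#x}")
```

For the RIP entry this prints __locks_delete_block+0x2b, which matches the offset ksymoops reports.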
RIP; ffffffff8019414e <__locks_delete_block+2b/3e> <=====
RAX; ffff81013cfab508
RBX; ffff81013cfab500
RCX; ffffffff80421780
RDI; ffff81013cfab500
RBP; ffff81011bfe1810
R08; ffff810116fbb054
R11; ffff81011a893800
R12; ffff810149d5f860
R14; ffff8101168a54f0
Trace; ffffffff801947c4