[Bug 399987] New: lockd confused on nfs client when installing onto nfsroot
https://bugzilla.novell.com/show_bug.cgi?id=399987

Summary: lockd confused on nfs client when installing onto nfsroot
Product: openSUSE 11.0
Version: Final
Platform: PowerPC
OS/Version: Linux
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: olh@novell.com
QAContact: qa@suse.de
Found By: ---

I tried to install 11.0 snapshots onto nfsroot several times in the last weeks. While this works ok on blades and other clients, it does not work so well on an ibook G4 if the server is my G5 workstation.

blackberry has this in /etc/exports:

/tftpboot 10.10.0.0/255.255.0.0(rw,async,insecure,anonuid=65534,anongid=65534,no_root_squash)

The inst-sys uses these mount options:

inst-sys:~ # cat /proc/mounts
rootfs / rootfs rw 0 0
tmpfs / tmpfs rw,size=0k,nr_inodes=0 0 0
tmpfs / tmpfs rw,size=0k,nr_inodes=0 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
/dev/loop0 /mounts/mp_0000 squashfs ro 0 0
/dev/loop1 /mounts/mp_0001 squashfs ro 0 0
/dev/loop2 /mounts/mp_0002 squashfs ro 0 0
/dev/loop3 /mounts/mp_0003 squashfs ro 0 0
/dev/loop1 /usr/bin/gdb squashfs ro 0 0
devpts /dev/pts devpts rw,mode=600 0 0
10.10.1.184:/nfsroot/coconut /mnt nfs rw,vers=2,rsize=8192,wsize=8192,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=10.10.1.184 0 0
tmpfs /mnt/dev tmpfs rw,size=0k,nr_inodes=0 0 0
proc /mnt/proc proc rw 0 0
sysfs /mnt/sys sysfs rw 0 0
inst-sys:~ #

But after some time, when the install images are extracted, I get this in dmesg on the client:

lockd: server 10.10.1.184 not responding, still trying

I can still ssh from the ibook to the G5. I can still cat files from the nfs mount. But yast hangs:

--- Exception: c01 at 0xf8f9c50
LR = 0x1004d134
y2base D 0fc26744 0 2709 2084
Call Trace:
[e6343b70] [00000001] 0x1 (unreliable)
[e6343c30] [c0009278] __switch_to+0x78/0x90
[e6343c50] [c033a588] schedule+0x3c8/0x404
[e6343ca0] [ea204810] rpc_wait_bit_killable+0x44/0x4c [sunrpc]
[e6343cb0] [c033ab18] __wait_on_bit+0x68/0xc0
[e6343cd0] [c033ac34] out_of_line_wait_on_bit+0xc4/0xd8
[e6343d20] [ea205030] __rpc_execute+0x14c/0x2b4 [sunrpc]
[e6343d60] [ea1fe834] rpc_run_task+0x64/0x7c [sunrpc]
[e6343d70] [ea1fe9c0] rpc_call_sync+0x5c/0x8c [sunrpc]
[e6343db0] [ea1c18ac] nlmclnt_call+0xcc/0x288 [lockd]
[e6343e20] [ea1c2124] nlmclnt_proc+0x2d4/0x5f0 [lockd]
[e6343e60] [ea298548] nfs_proc_lock+0x24/0x34 [nfs]
[e6343e70] [ea290c50] do_setlk+0x68/0xcc [nfs]
[e6343e90] [c00b1dfc] fcntl_setlk64+0x1ac/0x324
[e6343f20] [c00adb44] sys_fcntl64+0x80/0xbc
[e6343f40] [c001351c] ret_from_syscall+0x0/0x40

The server was openSuSE 10.3 or 11.0. The client was booted with these options from the 11.0 DVD:

quiet sysrq=1 netsetup=1 hostip=10.10.2.36/16 gateway=10.10.0.8 nameserver=10.10.2.88 start_shell

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
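The /proc/mounts output above shows the nfsroot mounted as NFS version 2 without "nolock", so POSIX locks taken by y2base go through lockd, which matches the nlmclnt_* frames in the call trace. As a rough sketch of that check (not part of the original report), the following Python parses /proc/mounts-style text and lists NFS mounts that do not carry "nolock". The function name nfs_mounts_with_locking and the shortened SAMPLE are made up for illustration, and the sketch assumes the kernel prints "nolock" among the options when it was given at mount time.

```python
def nfs_mounts_with_locking(mounts_text):
    """Return (source, mountpoint, options) for NFS mounts lacking 'nolock'.

    Hypothetical helper for illustration; expects /proc/mounts-style text.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each /proc/mounts entry has six whitespace-separated fields;
        # anything else (such as a shell prompt) is skipped.
        if len(fields) != 6:
            continue
        source, mountpoint, fstype, options = fields[:4]
        # Only NFSv2/v3 mounts (fstype "nfs") rely on lockd.
        if fstype != "nfs":
            continue
        opts = options.split(",")
        if "nolock" not in opts:
            result.append((source, mountpoint, opts))
    return result


# Shortened copy of the mount table from the report.
SAMPLE = """\
devpts /dev/pts devpts rw,mode=600 0 0
10.10.1.184:/nfsroot/coconut /mnt nfs rw,vers=2,rsize=8192,wsize=8192,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=10.10.1.184 0 0
tmpfs /mnt/dev tmpfs rw,size=0k,nr_inodes=0 0 0
inst-sys:~ #
"""

if __name__ == "__main__":
    for source, mountpoint, opts in nfs_mounts_with_locking(SAMPLE):
        print(source, "on", mountpoint, "uses lockd for file locks")
```

On a live system the same function could be fed the contents of /proc/mounts instead of SAMPLE.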
User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c8
LTC BugProxy
What kernel is on the NFS server?

It is a pretty old kernel:

Linux nfsserver 2.6.5-7.283-pseries64 #1 SMP Wed Nov 29 16:55:53 UTC 2006 ppc64 ppc64 ppc64 GNU/Linux

NFS is AFAIC part of the kernel.

Are there any messages displayed on console 10?

I'm connected via Serial Over Lan (SOL) to the Blade so I don't have a console 10. But I booted with option ssh=1 and thus I could have a look at some log files.

Ok, then could you provide the output of dmesg?

I changed the NFS root server: kernel version 2.6.22.1-41.fc7 on a Fedora 7 JS20 system. The problem still happens. Please find the dmesg output attached. The last line is:

lockd: server 10.64.4.131 not responding, still trying

But I'm still able to cd to the mounted nfs directory and create files there. This message always occurs, even if the installation onto NFS root was successful.

ps aux tells me that y2base is in state D ("uninterruptible sleep", usually I/O):

root 2846 7.6 3.6 112140 73244 ttyS0 Dl+ 14:23 1:58 y2base installation ("initial") qt --noborder --auto-fonts

The WCHAN for this process is "rpc_wait_bit_kill". So it looks like NFS is the root cause for this problem. I'll try to figure out what causes this situation.

I just discovered that applying "nolock" as an additional mount option seems to work around this problem.

Information was provided

*** This bug has been marked as a duplicate of bug 399987 ***
https://bugzilla.novell.com/show_bug.cgi?id=399987
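Both reports describe the same pattern: plain reads and writes on the NFS mount keep working, while a process that asks for a file lock sleeps in the RPC layer. A minimal sketch of how one might probe for this, assuming Python is available on the client: it takes and releases a POSIX lock in a child process and gives up after a timeout. The name lock_responds and the 10 second default are illustrative choices, not something from the bug. Note that the probe opens the file in append mode, so it creates the file if it does not exist.

```python
import subprocess
import sys

# Child script: take and release an exclusive POSIX lock on the given file.
# On an NFSv2/v3 mount without "nolock" this request is handled via lockd.
_PROBE = """
import fcntl, sys
with open(sys.argv[1], "a") as f:
    fcntl.lockf(f, fcntl.LOCK_EX)
    fcntl.lockf(f, fcntl.LOCK_UN)
"""


def lock_responds(path, timeout=10.0):
    """Return True if a lock on `path` was granted within `timeout` seconds."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", _PROBE, path],
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child with SIGKILL on timeout. The trace
        # in this bug shows rpc_wait_bit_killable, so the kill is expected to
        # work here; a truly unkillable sleep would block this call as well.
        return False
    return done.returncode == 0


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "/mnt/.lockprobe"
    print("lock granted" if lock_responds(target) else "lock request hung or failed")
```

A probe like this cannot say why lockd does not answer; it only separates "locking hangs" from "NFS as a whole hangs", which is the distinction both reporters drew by hand.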