[Bug 399987] New: lockd confused on nfs client when installing onto nfsroot
https://bugzilla.novell.com/show_bug.cgi?id=399987

           Summary: lockd confused on nfs client when installing onto nfsroot
           Product: openSUSE 11.0
           Version: Final
          Platform: PowerPC
        OS/Version: Linux
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
        AssignedTo: kernel-maintainers@forge.provo.novell.com
        ReportedBy: olh@novell.com
         QAContact: qa@suse.de
          Found By: ---

I tried to install 11.0 snapshots onto nfsroot several times in the last weeks.
While this works ok on blades and other clients, it does not work so well on an
ibook G4 if the server is my G5 workstation.

blackberry has this in /etc/exports:

/tftpboot 10.10.0.0/255.255.0.0(rw,async,insecure,anonuid=65534,anongid=65534,no_root_squash)

The inst-sys uses these mount options:

inst-sys:~ # cat /proc/mounts
rootfs / rootfs rw 0 0
tmpfs / tmpfs rw,size=0k,nr_inodes=0 0 0
tmpfs / tmpfs rw,size=0k,nr_inodes=0 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
/dev/loop0 /mounts/mp_0000 squashfs ro 0 0
/dev/loop1 /mounts/mp_0001 squashfs ro 0 0
/dev/loop2 /mounts/mp_0002 squashfs ro 0 0
/dev/loop3 /mounts/mp_0003 squashfs ro 0 0
/dev/loop1 /usr/bin/gdb squashfs ro 0 0
devpts /dev/pts devpts rw,mode=600 0 0
10.10.1.184:/nfsroot/coconut /mnt nfs rw,vers=2,rsize=8192,wsize=8192,hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=10.10.1.184 0 0
tmpfs /mnt/dev tmpfs rw,size=0k,nr_inodes=0 0 0
proc /mnt/proc proc rw 0 0
sysfs /mnt/sys sysfs rw 0 0
inst-sys:~ #

but after some time, when the install images are extracted, I get this in dmesg
on the client:

lockd: server 10.10.1.184 not responding, still trying

I can still ssh from the ibook to the G5.
I can still cat files from the nfs mount.
But yast hangs:

--- Exception: c01 at 0xf8f9c50
    LR = 0x1004d134
y2base        D 0fc26744     0  2709   2084
Call Trace:
[e6343b70] [00000001] 0x1 (unreliable)
[e6343c30] [c0009278] __switch_to+0x78/0x90
[e6343c50] [c033a588] schedule+0x3c8/0x404
[e6343ca0] [ea204810] rpc_wait_bit_killable+0x44/0x4c [sunrpc]
[e6343cb0] [c033ab18] __wait_on_bit+0x68/0xc0
[e6343cd0] [c033ac34] out_of_line_wait_on_bit+0xc4/0xd8
[e6343d20] [ea205030] __rpc_execute+0x14c/0x2b4 [sunrpc]
[e6343d60] [ea1fe834] rpc_run_task+0x64/0x7c [sunrpc]
[e6343d70] [ea1fe9c0] rpc_call_sync+0x5c/0x8c [sunrpc]
[e6343db0] [ea1c18ac] nlmclnt_call+0xcc/0x288 [lockd]
[e6343e20] [ea1c2124] nlmclnt_proc+0x2d4/0x5f0 [lockd]
[e6343e60] [ea298548] nfs_proc_lock+0x24/0x34 [nfs]
[e6343e70] [ea290c50] do_setlk+0x68/0xcc [nfs]
[e6343e90] [c00b1dfc] fcntl_setlk64+0x1ac/0x324
[e6343f20] [c00adb44] sys_fcntl64+0x80/0xbc
[e6343f40] [c001351c] ret_from_syscall+0x0/0x40

The server was openSuSE 10.3 or 11.0.
The client was booted with these options from the 11.0 DVD:

quiet sysrq=1 netsetup=1 hostip=10.10.2.36/16 gateway=10.10.0.8 nameserver=10.10.2.88 start_shell

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
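The call trace above ends in fcntl_setlk64, so the hang is in a POSIX lock request that the NFS client forwards to lockd. A minimal sketch for reproducing just that step outside of YaST follows. It is not part of the original report: the helper name probe_lock and the 10 second timeout are made up for illustration, and it assumes a Python interpreter is available on the client and that the stuck child can still be killed (the trace shows rpc_wait_bit_killable, which suggests so).

```python
import os
import subprocess
import sys
import tempfile

# Child script: take and release an exclusive POSIX lock on the given file.
# fcntl.lockf() issues the fcntl() lock request that, on an NFS mount
# without "nolock", is handed to lockd.
_CHILD = (
    "import fcntl, sys\n"
    "with open(sys.argv[1], 'a+') as handle:\n"
    "    fcntl.lockf(handle, fcntl.LOCK_EX)\n"
    "    fcntl.lockf(handle, fcntl.LOCK_UN)\n"
)


def probe_lock(directory, timeout=10.0):
    """Return 'ok', 'timeout' or 'error' for a lock attempt inside `directory`.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(prefix="lockprobe-", dir=directory)
    os.close(fd)
    try:
        # Run the lock attempt in a child process so that a stuck lock
        # request does not hang the caller; subprocess.run() kills the
        # child when the timeout expires.
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD, path],
            timeout=timeout,
            capture_output=True,
        )
        return "ok" if proc.returncode == 0 else "error"
    except subprocess.TimeoutExpired:
        return "timeout"
    finally:
        os.unlink(path)
```

Calling probe_lock("/mnt") on the affected client would be expected to return "timeout" while lockd reports the server as not responding, and "ok" on a mount where locking works.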
https://bugzilla.novell.com/show_bug.cgi?id=399987

Lars Marowsky-Bree <lmb@novell.com> changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
         AssignedTo|kernel-maintainers@forge.provo.novell.com |sjayaraman@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=399987

User sjayaraman@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c1

Suresh Jayaraman <sjayaraman@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #1 from Suresh Jayaraman <sjayaraman@novell.com> 2008-07-08 04:51:01 MDT ---
A few obvious questions after a quick look:

* Do you have a firewall enabled between the server and the clients? If yes,
  could you disable it and try again?

* Is rpc.statd running? statd is a userspace process in openSUSE 10.3 and also
  in openSUSE 11.0. Please ensure that rpc.statd is running on the server.
https://bugzilla.novell.com/show_bug.cgi?id=399987

Suresh Jayaraman <sjayaraman@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |olh@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=399987

User sjayaraman@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c2

--- Comment #2 from Suresh Jayaraman <sjayaraman@novell.com> 2008-07-08 05:07:32 MDT ---
Also, check whether rpc.statd is running on the NFS client as well. statd should
have been started during mount. There is a related bug as well: Bug #384481.
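One way to check the point raised in comments #1 and #2 is to look at what is registered with the portmapper on each host. The sketch below parses text in the layout printed by `rpcinfo -p` and reports which lock-related services are absent. It is an illustration only: the function names are invented here, and it assumes rpc.statd registers under the service name "status" and lockd under "nlockmgr", which is the usual naming on Linux.

```python
def registered_services(rpcinfo_output):
    """Collect service names from text in `rpcinfo -p` layout."""
    services = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data rows look like: program vers proto port service.
        # The header row is skipped because its first field is not numeric.
        if len(fields) >= 5 and fields[0].isdigit():
            services.add(fields[4])
    return services


def missing_lock_services(rpcinfo_output, required=("status", "nlockmgr")):
    """Return the required service names that are not registered."""
    found = registered_services(rpcinfo_output)
    return [name for name in required if name not in found]


# Hypothetical sample, not taken from the bug report.
SAMPLE = """\
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100021    1   udp  32768  nlockmgr
"""

print(missing_lock_services(SAMPLE))
```

For the sample text the result is a list containing only "status", meaning statd is not registered. Real input could come from running `rpcinfo -p` against the server and the client.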
https://bugzilla.novell.com/show_bug.cgi?id=399987

User olh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c3

--- Comment #3 from Olaf Hering <olh@novell.com> 2008-07-08 05:43:49 MDT ---
There is no firewall.
/usr/sbin/start-statd is not in installation-images; I have added it now.
https://bugzilla.novell.com/show_bug.cgi?id=399987

User olh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c4

Olaf Hering <olh@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|olh@novell.com              |

--- Comment #4 from Olaf Hering <olh@novell.com> 2008-07-08 06:38:40 MDT ---
running rpc.statd on client does not help
running rpc.statd on client and server does not help
https://bugzilla.novell.com/show_bug.cgi?id=399987

Olaf Hering <olh@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Found By|---                         |Development
https://bugzilla.novell.com/show_bug.cgi?id=399987

User sjayaraman@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c5

Suresh Jayaraman <sjayaraman@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |olh@novell.com

--- Comment #5 from Suresh Jayaraman <sjayaraman@novell.com> 2008-07-30 06:17:02 MDT ---
The call trace shows no obvious issues. Could you please attach a packet
capture?
https://bugzilla.novell.com/show_bug.cgi?id=399987

User olh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c6

Olaf Hering <olh@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|olh@novell.com              |

--- Comment #6 from Olaf Hering <olh@novell.com> 2008-08-04 07:35:42 MDT ---
dumps can be found here:
http://boettger.suse.de/inst/olh/bug399987/

command used on the nfsserver, blackberry.suse.de:
tcpdump -vvvne -tttt -w /Test/From-coconut-nfsdump.bin \
    -i eth0 ether host 00:0A:95:AA:0C:FA

command used on the client, coconut.suse.de, booted from 11.0 DVD:
tcpdump -vvvne -tttt -w /DUMP/coconut-nfsdump.bin -i eth0
https://bugzilla.novell.com/show_bug.cgi?id=399987

User sjayaraman@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c7

Suresh Jayaraman <sjayaraman@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugproxy@us.ibm.com

--- Comment #7 from Suresh Jayaraman <sjayaraman@novell.com> 2008-12-02 01:41:53 MST ---
*** Bug 406727 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=406727
https://bugzilla.novell.com/show_bug.cgi?id=399987

User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c8

LTC BugProxy <bugproxy@us.ibm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |http://

--- Comment #8 from LTC BugProxy <bugproxy@us.ibm.com> 2008-12-02 01:45:13 MST ---
=Comment: #0=================================================
Jochen Roth <jroth@de.ibm.com> - 2008-07-07 05:46 EDT

I tried to install openSUSE 11 onto an NFS root filesystem on a QS21 Blade.
Sometimes the installation finishes, but sometimes it hangs.

I had success when trying to install with the default packages three times in a
row. I removed the Novell AppArmor and added some development packages to the
installation. With this setup it hung during unpacking of one of the BASE
packages.

I installed using a DVD and typing "install vnc=1 ssh=1" at the command line
prompt.

The last message I got on the SOL console was:

*** Starting YaST2 ***
error: cannot open Packages index using db3 - No such file or directory (2)

This error message also appeared when the installation was successful.

I connected to the ssh server and tailed /var/log/YaST2/y2log (see attachment).

The reason for the installer to fail might be related to rpm on nfs:
Berkeley DB and NFS don't really go together well at all:
http://www.oracle.com/technology/documentation/berkeley-db/db/ref/env/remote...

=Comment: #2=================================================
Jochen Roth <jroth@de.ibm.com> - 2008-07-07 06:31 EDT

y2log

what kernel is on the nfs server?

(In reply to comment #6)
> what kernel is on the nfs server?

It is a pretty old kernel:
Linux nfsserver 2.6.5-7.283-pseries64 #1 SMP Wed Nov 29 16:55:53 UTC 2006 ppc64 ppc64 ppc64 GNU/Linux

Nfs is AFAIC part of the kernel.

Are there any messages displayed on console 10?

I'm connected via Serial Over Lan (SOL) to the Blade so I don't have a console
10. But I booted with option ssh=1 and thus I could have a look at some log
files.

Ok, then could you provide the output of dmesg?

I changed the NFS root server: kernel version 2.6.22.1-41.fc7 on a Fedora 7
JS20 system. The problem still happens.

Please find the dmesg output attached. The last line is:

lockd: server 10.64.4.131 not responding, still trying

But I'm still able to cd to the mounted nfs directory and create files there.
This message always occurs, even if the installation onto NFS root was
successful.

ps aux tells me that y2base is in state D ("uninterruptible sleep", usually I/O):

root 2846 7.6 3.6 112140 73244 ttyS0 Dl+ 14:23 1:58 y2base installation ("initial") qt --noborder --auto-fonts

The WCHAN for this process is "rpc_wait_bit_kill".

So it looks like NFS is the root cause for this problem. I'll try to figure out
what causes this situation.

I just discovered that applying "nolock" as an additional mount option seems to
work around this problem.

Information was provided

*** This bug has been marked as a duplicate of bug 399987 ***
https://bugzilla.novell.com/show_bug.cgi?id=399987
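Since "nolock" is reported above as a workaround, it can help to see which NFS mounts on a client still route locks through lockd. The following sketch scans text in /proc/mounts layout and lists NFS mount points whose options do not include "nolock". It is illustrative only: the function name is invented, and it assumes the kernel shows the option literally as "nolock" in /proc/mounts, which may differ between kernel versions.

```python
def nfs_mounts_using_lockd(mounts_text):
    """Return mount points of NFS mounts that lack the "nolock" option."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = fields[3].split(",")
        if "nolock" not in options:
            result.append(fields[1])
    return result


# The NFS line quoted in the original report, plus one local filesystem.
SAMPLE_MOUNTS = (
    "proc /proc proc rw 0 0\n"
    "10.10.1.184:/nfsroot/coconut /mnt nfs rw,vers=2,rsize=8192,wsize=8192,"
    "hard,proto=tcp,timeo=600,retrans=2,sec=sys,addr=10.10.1.184 0 0\n"
)

print(nfs_mounts_using_lockd(SAMPLE_MOUNTS))
```

For the mount line from the original report this lists /mnt, which matches the observation that the installer's lock requests went to lockd. On a live system the input would be the contents of /proc/mounts.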
https://bugzilla.novell.com/show_bug.cgi?id=399987

User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c9

--- Comment #9 from LTC BugProxy <bugproxy@us.ibm.com> 2008-12-02 01:45:16 MST ---
Created an attachment (id=257160)
 --> (https://bugzilla.novell.com/attachment.cgi?id=257160)
dmesg output
https://bugzilla.novell.com/show_bug.cgi?id=399987

User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c10

--- Comment #10 from LTC BugProxy <bugproxy@us.ibm.com> 2008-12-02 01:45:17 MST ---
Created an attachment (id=257161)
 --> (https://bugzilla.novell.com/attachment.cgi?id=257161)
y2log
https://bugzilla.novell.com/show_bug.cgi?id=399987

User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c11

--- Comment #11 from LTC BugProxy <bugproxy@us.ibm.com> 2009-02-11 09:01:22 MST ---
------- Comment From hannsj_uhl@de.ibm.com 2009-02-11 10:42 EDT-------
Hello Novell,
fyi ... we are not pursuing this bugzilla for openSUSE 11.1 because this
capability is working with SLES11, and therefore we are closing this bugzilla
now.
Thanks for your support.
https://bugzilla.novell.com/show_bug.cgi?id=399987

User sjayaraman@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=399987#c12

Suresh Jayaraman <sjayaraman@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |olh@novell.com

--- Comment #12 from Suresh Jayaraman <sjayaraman@novell.com> 2009-02-27 05:01:46 MST ---
Sorry, I have been busy with other, higher-priority bugs lately.

Is this still reproducible on 11.1?