[Bug 849387] New: NFS file systems unmounted 15 minutes after reboot
https://bugzilla.novell.com/show_bug.cgi?id=849387

https://bugzilla.novell.com/show_bug.cgi?id=849387#c0

Summary: NFS file systems unmounted 15 minutes after reboot
Classification: openSUSE
Product: openSUSE 12.3
Version: Final
Platform: x86-64
OS/Version: openSUSE 12.3
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Network
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: R.Vickers@cs.rhul.ac.uk
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0

We have a server which is both an NFS server and NFS client. After booting all the NFS file systems in fstab successfully mount, but 15 minutes later an sm-notify failure causes them to be unmounted again. This has happened after at least 2 reboots.

So there are 2 mysteries here:
(1) Why does sm-notify fail?
(2) Why does this failure cause systemd to tear down the NFS client services?

The key messages in the log are

2013-11-07T08:31:59.117550+00:00 teaching sm-notify[1953]: Version 1.2.7 starting
2013-11-07T08:31:59.284828+00:00 teaching sm-notify[1953]: Backgrounding to notify hosts...
2013-11-07T08:31:59.296556+00:00 teaching nfs[1899]: Starting NFS client services: sm-notify idmapd..done
2013-11-07T08:32:04.435916+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec572] RPC status 1
2013-11-07T08:32:04.528118+00:00 teaching sm-notify[2022]: Version 1.2.7 starting
2013-11-07T08:32:04.528564+00:00 teaching sm-notify[2022]: Already notifying clients; Exiting!
2013-11-07T08:32:04.529097+00:00 teaching nfsserver[1900]: Starting kernel based NFS server: idmapd mountd statd nfsd sm-notify..done
2013-11-07T08:32:11.616881+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec573] RPC status 1
2013-11-07T08:32:15.621965+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec574] RPC status 1
2013-11-07T08:32:23.632100+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec575] RPC status 1
2013-11-07T08:32:39.644970+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec576] RPC status 1
2013-11-07T08:33:11.658055+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec577] RPC status 1
2013-11-07T08:34:15.724413+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec578] RPC status 1
2013-11-07T08:36:15.826645+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec579] RPC status 1
2013-11-07T08:38:15.919664+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec57a] RPC status 1
2013-11-07T08:40:15.960524+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec57b] RPC status 1
2013-11-07T08:42:16.062979+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec57c] RPC status 1
2013-11-07T08:44:16.165623+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec57d] RPC status 1
2013-11-07T08:46:16.253668+00:00 teaching sm-notify[1960]: nsm_parse_reply: [0x527ec57e] RPC status 1
2013-11-07T08:48:16.353764+00:00 teaching sm-notify[1960]: Unable to notify mailhost.cs.rhul.ac.uk, giving up
2013-11-07T08:48:16.465687+00:00 teaching nfs[7573]: Shutting down NFS client services:umount.nfs4: /rmt/csfiles/pgrads: device is busy
2013-11-07T08:48:16.497619+00:00 teaching nfs[7573]: umount.nfs4: /rmt/csfiles/staff: device is busy
2013-11-07T08:48:16.567285+00:00 teaching nfs[7573]: umount: /var/lib/nfs/rpc_pipefs: target is busy.
2013-11-07T08:48:16.568078+00:00 teaching nfs[7573]: (In some cases useful info about processes that use
2013-11-07T08:48:16.568571+00:00 teaching nfs[7573]: the device is found by lsof(8) or fuser(1))
2013-11-07T08:48:16.569186+00:00 teaching nfs[7573]: ..failed
2013-11-07T08:48:16.569418+00:00 teaching systemd[1]: nfs.service: control process exited, code=exited status=1
2013-11-07T08:48:16.581167+00:00 teaching systemd[1]: Unit nfs.service entered failed state

There is only one host which sm-notify is failing to contact. It is an NFS client and the corresponding file in /var/lib/nfs is

-rw------- 1 statd nogroup 94 Jul 31 13:20 /var/lib/nfs/sm.bak/mailhost.cs.rhul.ac.uk

and its contents are

0100007f 000186b5 00000003 00000010 cbb2ce7f5dc6110000c869900388ffff 134.219.205.131 teaching

If I run sm-notify by hand I get:

teaching# sm-notify -df -m 1
sm-notify: Version 1.2.7 starting
sm-notify: Added host mailhost.cs.rhul.ac.uk to notify list
sm-notify: Effective UID, GID: 103, 65534
sm-notify: Sending PMAP_GETPORT for 100024, 1, udp
sm-notify: Added host mailhost.cs.rhul.ac.uk to notify list
sm-notify: Host mailhost.cs.rhul.ac.uk due in 2 seconds
sm-notify: Received packet...
sm-notify: nsm_parse_reply: [0x527306d7] RPC status 1
sm-notify: Host mailhost.cs.rhul.ac.uk due in 2 seconds
sm-notify: Sending PMAP_GETPORT for 100024, 1, udp
sm-notify: Added host mailhost.cs.rhul.ac.uk to notify list
sm-notify: Host mailhost.cs.rhul.ac.uk due in 4 seconds
sm-notify: Received packet...
sm-notify: nsm_parse_reply: [0x527306d8] RPC status 1
sm-notify: Host mailhost.cs.rhul.ac.uk due in 4 seconds
sm-notify: Sending PMAP_GETPORT for 100024, 1, udp
sm-notify: Added host mailhost.cs.rhul.ac.uk to notify list
sm-notify: Host mailhost.cs.rhul.ac.uk due in 8 seconds
sm-notify: Received packet...
sm-notify: nsm_parse_reply: [0x527306d9] RPC status 1
sm-notify: Host mailhost.cs.rhul.ac.uk due in 8 seconds
sm-notify: Sending PMAP_GETPORT for 100024, 1, udp
sm-notify: Added host mailhost.cs.rhul.ac.uk to notify list
sm-notify: Host mailhost.cs.rhul.ac.uk due in 16 seconds
sm-notify: Received packet...
sm-notify: nsm_parse_reply: [0x527306da] RPC status 1
sm-notify: Host mailhost.cs.rhul.ac.uk due in 16 seconds
sm-notify: Sending PMAP_GETPORT for 100024, 1, udp
sm-notify: Added host mailhost.cs.rhul.ac.uk to notify list
sm-notify: Host mailhost.cs.rhul.ac.uk due in 32 seconds
sm-notify: Received packet...
sm-notify: nsm_parse_reply: [0x527306db] RPC status 1
sm-notify: Host mailhost.cs.rhul.ac.uk due in 32 seconds
sm-notify: Unable to notify mailhost.cs.rhul.ac.uk, giving up

Reproducible: Always

Steps to Reproduce:
It is reproducible on my server, but I don't know how to reproduce it on other machines. Both server and client are running openSUSE 12.3.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
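The "15 minutes" in the summary can be checked against the log timestamps. The following is a minimal sketch, not part of the original report: it takes the timestamps of the sm-notify[1960] messages quoted above (thirteen "RPC status 1" replies plus the final "giving up" line) and prints the gap between consecutive messages and the total time since sm-notify started. The names STAMPS, START and gaps_in_seconds are made up for this illustration.

```python
from datetime import datetime

# Timestamp of "sm-notify[1953]: Version 1.2.7 starting" in the report.
START = "2013-11-07T08:31:59.117550+00:00"

# Timestamps of the sm-notify[1960] messages, copied from the report:
# thirteen "RPC status 1" replies, then "Unable to notify ..., giving up".
STAMPS = [
    "2013-11-07T08:32:04.435916+00:00",
    "2013-11-07T08:32:11.616881+00:00",
    "2013-11-07T08:32:15.621965+00:00",
    "2013-11-07T08:32:23.632100+00:00",
    "2013-11-07T08:32:39.644970+00:00",
    "2013-11-07T08:33:11.658055+00:00",
    "2013-11-07T08:34:15.724413+00:00",
    "2013-11-07T08:36:15.826645+00:00",
    "2013-11-07T08:38:15.919664+00:00",
    "2013-11-07T08:40:15.960524+00:00",
    "2013-11-07T08:42:16.062979+00:00",
    "2013-11-07T08:44:16.165623+00:00",
    "2013-11-07T08:46:16.253668+00:00",
    "2013-11-07T08:48:16.353764+00:00",
]


def gaps_in_seconds(stamps):
    """Return the gap between consecutive timestamps, rounded to whole seconds."""
    times = [datetime.fromisoformat(s) for s in stamps]
    return [round((b - a).total_seconds()) for a, b in zip(times, times[1:])]


gaps = gaps_in_seconds(STAMPS)
elapsed = (
    datetime.fromisoformat(STAMPS[-1]) - datetime.fromisoformat(START)
).total_seconds()

print("gaps between messages (s):", gaps)
print(f"start to 'giving up': {elapsed / 60:.1f} minutes")
```

After an initial 7-second gap, the gaps double from 4 to 64 seconds and then stay at about 120 seconds until sm-notify gives up roughly 16 minutes after it started. That pattern is read off this log only, not taken from the sm-notify source. It fits the documented default of the -m option, under which sm-notify keeps retrying for 15 minutes, if the time limit is only checked when a retry comes due (an assumption). The shutdown of the NFS client services follows within about a tenth of a second of the "giving up" message.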
https://bugzilla.novell.com/show_bug.cgi?id=849387#c
zhang jiajun
https://bugzilla.novell.com/show_bug.cgi?id=849387#c1
Christian Boltz
> So there are 2 mysteries here:
There's a 3rd mystery - why did Jia Jun Zhang assign this bug to me? I don't know much about NFS, and also don't see anything that could be related to AppArmor (which is the only possible reason why this bug could have been assigned to me).

I'll reassign to Neil, the maintainer of nfs-utils.
https://bugzilla.novell.com/show_bug.cgi?id=849387#c2
Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=849387#c3
Frederic Crozat
https://bugzilla.novell.com/show_bug.cgi?id=849387#c4
Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=849387#c5
--- Comment #5 from Bob Vickers
https://bugzilla.novell.com/show_bug.cgi?id=849387#c6
--- Comment #6 from Bob Vickers
https://bugzilla.novell.com/show_bug.cgi?id=849387#c7
Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=849387#c8
--- Comment #8 from Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=849387#c9
Neil Brown