[Bug 825373] New: NFS3 & 4 server kernel hang on i/o load from client
https://bugzilla.novell.com/show_bug.cgi?id=825373

https://bugzilla.novell.com/show_bug.cgi?id=825373#c0

           Summary: NFS3 & 4 server kernel hang on i/o load from client
    Classification: openSUSE
           Product: openSUSE 12.3
           Version: Final
          Platform: x86-64
        OS/Version: openSUSE 12.3
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Kernel
        AssignedTo: kernel-maintainers@forge.provo.novell.com
        ReportedBy: otrebor@hispeed.ch
         QAContact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:20.0) Gecko/20100101 Firefox/20.0

I am trying to set up two Xen DomUs, one as an nfs server and the other one as an nfs client. The client can be any flavor of OS12.{1,2,3}, and the VMs have been in successful use against an external nas supporting nfs3 only. No changes have been made to the clients besides mounting from the new nfs server instead.

The nfs client domU reports the following in the logs:

Jun 17 22:37:02 dome121 kernel: [ 997.976208] nfs: server vnas.example.com not responding, still trying
Jun 17 22:44:24 dome121 kernel: [ 1440.196162] INFO: task as:3369 blocked for more than 480 seconds.
Jun 17 22:44:24 dome121 kernel: [ 1440.196166] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 17 22:44:24 dome121 kernel: [ 1440.196169] as D 0000000000000000 0 3369 3366 0x00000000
Jun 17 22:44:24 dome121 kernel: [ 1440.196174] ffff88003d0bbcf8 0000000000000286 ffff88003d0bbe78 ffff88003c9344d0
Jun 17 22:44:24 dome121 kernel: [ 1440.196179] ffff88003d0bbfd8 ffff88003d0a0380 ffff88003d0bbfd8 ffff88003d0a0380
Jun 17 22:44:24 dome121 kernel: [ 1440.196183] ffff880002170340 ffff88003d0a0380 0000000000000000 0000000000000000
Jun 17 22:44:24 dome121 kernel: [ 1440.196187] Call Trace:
Jun 17 22:44:24 dome121 kernel: [ 1440.196204] [<ffffffff80508c9a>] io_schedule+0x8a/0xd0
Jun 17 22:44:24 dome121 kernel: [ 1440.196211] [<ffffffff800d3189>] sleep_on_page+0x9/0x10
Jun 17 22:44:24 dome121 kernel: [ 1440.196217] [<ffffffff805094af>] __wait_on_bit+0x4f/0x80
Jun 17 22:44:24 dome121 kernel: [ 1440.196222] [<ffffffff800d32bf>] wait_on_page_bit+0x6f/0x80
Jun 17 22:44:24 dome121 kernel: [ 1440.196227] [<ffffffff800d33c6>] filemap_fdatawait_range+0xf6/0x180
Jun 17 22:44:24 dome121 kernel: [ 1440.196233] [<ffffffff800d4ce0>] filemap_write_and_wait_range+0x70/0x80
Jun 17 22:44:24 dome121 kernel: [ 1440.196249] [<ffffffffa0171c0d>] nfs_file_fsync+0x5d/0x130 [nfs]
Jun 17 22:44:24 dome121 kernel: [ 1440.196266] [<ffffffff8011d9b8>] filp_close+0x38/0x90
Jun 17 22:44:24 dome121 kernel: [ 1440.196271] [<ffffffff8011dcf9>] sys_close+0xc9/0x160
Jun 17 22:44:24 dome121 kernel: [ 1440.196277] [<ffffffff80512b53>] system_call_fastpath+0x16/0x1b
Jun 17 22:44:24 dome121 kernel: [ 1440.196288] [<00007f2ec7ebe110>] 0x7f2ec7ebe10f

The server runs on a Dom0 server (supermicro server board X86_64) running latest OS12.3 fully patched.

Dom0 config:

00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/Ivy Bridge DRAM Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
00:16.1 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #2 (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b5)
00:1c.5 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 6 (rev b5)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
00:1f.0 ISA bridge: Intel Corporation C204 Chipset Family LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 05)
01:00.0 RAID bus controller: Adaptec Series 6 - 6G SAS/PCIe 2 (rev 01)
02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
04:03.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)

The nfs server domU does not report anything in its logs, neither in /var/log/messages nor in dmesg.

In previous OS12.3 patch levels the nfs server domU would just disappear from the xen top view. No logs were produced.

I've had this kind of setup running successfully with previous stock releases of Xen and OS11.x and OS12.x, except OS12.3.

Reproducible: Always

Steps to Reproduce:
1. Create a DomU nfs3 & 4 server and export a /home directory
2. Mount the /home directory in a client DomU and ls around the mounted directories -> works as expected
3. Create some load on the mounted nfs filesystem on the client DomU, e.g. a compile job -> crashes or hangs both or at least one DomU

Actual Results:
NFS server domU just disappears (crashes) or hangs. Must be rebooted.
NFS client domU hangs. Network interface unusable. Must be rebooted.

Expected Results:
NFS server and client should be able to cope with a compile job on a shared /home directory. This has worked with the very same client DomUs with the /home dir on a nas device supporting nfs3 only. No other changes to the client DomUs were made.

No logs or kernel dumps were produced besides the client log entries included above.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
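For anyone trying to reproduce step 3 without a full compile job, the following is a minimal sketch of a comparable write load. It is not the reporter's workload, only an approximation: it writes, syncs and closes many small files, which exercises the write-back-on-close path seen in the client trace (nfs_file_fsync called from filp_close). The function name, the file count and the file size are assumptions chosen for illustration.

```python
import os


def generate_write_load(target_dir, files=200, size=64 * 1024):
    """Write, fsync, close and delete `files` files of `size` bytes each.

    Returns the total number of bytes written. `target_dir` is meant to be
    a directory on the NFS mount under test (hypothetical; pick your own).
    """
    os.makedirs(target_dir, exist_ok=True)
    payload = os.urandom(size)
    total = 0
    for i in range(files):
        path = os.path.join(target_dir, f"load_{i:05d}.tmp")
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            # Force the data out to the server before the file is closed.
            os.fsync(fh.fileno())
        total += size
        # Remove the file again so repeated runs do not fill the export.
        os.remove(path)
    return total
```

Calling generate_write_load() with a directory below the mounted /home on the client DomU, and raising `files` or `size`, should produce sustained write traffic towards the server. Whether that is enough to trigger the hang described above is untested.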
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c1
--- Comment #1 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c2
--- Comment #2 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c3
--- Comment #3 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c4
--- Comment #4 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c5
--- Comment #5 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c6
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c7
Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c8
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c9
Otrebor Igorig
From 192.168.200.121 icmp_seq=1 Destination Host Unreachable
From 192.168.200.121 icmp_seq=2 Destination Host Unreachable
From 192.168.200.121 icmp_seq=3 Destination Host Unreachable
^C
--- 192.168.200.1 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4016ms
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c10
--- Comment #10 from Otrebor Igorig
From 192.168.200.121 icmp_seq=1 Destination Host Unreachable
From 192.168.200.121 icmp_seq=2 Destination Host Unreachable
From 192.168.200.121 icmp_seq=3 Destination Host Unreachable
^C
--- 192.168.200.1 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4016ms
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c11
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c12
--- Comment #12 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c
Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c13
--- Comment #13 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c14
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c15
Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c16
--- Comment #16 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c17
--- Comment #17 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c18
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c19
--- Comment #19 from Jiri Bohac
Nothing stands out from the logs.
Comment #12 shows a kernel oops somewhere in ext4 on the NFS server, so it seems this has nothing to do with netback/netfront. I suppose the /var/log/messages from Comment #15, Comment #16 and Comment #17 were not taken after a crash, but purely to show how the hosts normally boot.
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c20
--- Comment #20 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c21
Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c22
--- Comment #22 from Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c23
Otrebor Igorig
https://bugzilla.novell.com/show_bug.cgi?id=825373
https://bugzilla.novell.com/show_bug.cgi?id=825373#c24
Jan Beulich