[Bug 395781] New: SMP PVM domU kernel oops
https://bugzilla.novell.com/show_bug.cgi?id=395781

User reitenbach@rapideye.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=395781#c1

           Summary: SMP PVM domU kernel oops
           Product: openSUSE 10.3
           Version: Final
          Platform: i686
        OS/Version: openSUSE 10.3
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Xen
        AssignedTo: cgriffin@novell.com
        ReportedBy: reitenbach@rapideye.de
         QAContact: qa@suse.de
          Found By: Customer

I already reported the problem on the XenSource bugzilla, but they recommended
opening another report here. Here is the original bug report:
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1265

Below is a cut-and-paste of the contents of that bug report:

I have a SLES10 SP1 dom0 with Xen installed from SP2, so Xen 3.2.0 is installed
there. The domU is an openSUSE 10.3 i586 host. The dom0 has two dual-core CPUs,
and the domU has 4 processors configured.

The domU suddenly stopped working, but it still answered pings. The system
itself was idle: an ssh session running top showed a load of 0.X. I logged in
to the dom0, and xm list showed the virtual domain in blocked I/O state.

The dom0:

uname -a
Linux srv3 2.6.16.57-0.9-xen #1 SMP Mon Jan 21 19:55:27 UTC 2008 x86_64 x86_64 x86_64 GNU/Linux

srv3:~ # xm info
host                   : srv3
release                : 2.6.16.57-0.9-xen
version                : #1 SMP Mon Jan 21 19:55:27 UTC 2008
machine                : x86_64
nr_cpus                : 4
nr_nodes               : 1
cores_per_socket       : 2
threads_per_core       : 1
cpu_mhz                : 2205
hw_caps                : 178bf3ff:e3d3fbff:00000000:00000010:00000001:00000000:00000002
total_memory           : 5975
free_memory            : 1024
max_free_memory        : 1174
max_para_memory        : 1170
max_hvm_memory         : 1159
node_to_cpu            : node0:0-3
xen_major              : 3
xen_minor              : 2
xen_extra              : .0_16718_02-0.5
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : 16718
cc_compiler            : gcc version 4.1.2 20070115 (prerelease) (SUSE Linux)
cc_compile_by          : abuild
cc_compile_domain      : suse.de
cc_compile_date        : Tue Jan 22 01:13:56 UTC 2008
xend_config_format     : 4

srv3:~ # rpm -qa | grep xen
xen-libs-32bit-3.0.4_13138-0.60
xen-libs-3.2.0_16718_02-0.5
xen-tools-ioemu-3.2.0_16718_02-0.5
xen-doc-html-3.0.4_13138-0.60
xen-3.2.0_16718_02-0.5
kernel-xen-2.6.16.57-0.9
xen-kmp-smp-3.2.0_16718_02_2.6.16.57_0.9-0.5
xen-tools-3.2.0_16718_02-0.5
xen-doc-pdf-3.2.0_16718_02-0.5

The domU:

rpm -qa | grep -i xen
xen-3.1.0_15042-51.3
kernel-xenpae-2.6.22.17-0.1
xen-tools-3.1.0_15042-51.3
xen-libs-3.1.0_15042-51.3

tserver:~ # uname -a
Linux tserver 2.6.22.17-0.1-xenpae #1 SMP 2008/02/10 20:01:04 UTC i686 athlon i386 GNU/Linux

The dom0 configuration file:

name="TS"
uuid="1363fd4f-a276-31a2-bea5-be965e094ff1"
memory=4600
vcpus=4
on_poweroff="destroy"
on_reboot="restart"
on_crash="destroy"
localtime=0
builder="linux"
bootloader="/usr/lib/xen/boot/domUloader.py"
bootargs="--entry=xvda2:/boot/vmlinuz-xenpae,/boot/initrd-xenpae"
extra="xencons=tty "
disk=[ 'phy:/dev/cciss/c0d0p2,xvda,w', ]
vif=[ 'mac=00:16:a4:a7:b4:14,bridge=bridge10',
      'mac=00:16:3e:25:9e:a0,bridge=bridge3',
      'mac=00:16:3e:25:9e:a1,bridge=bridge4',
      'mac=00:16:3e:25:9e:a2,bridge=bridge5',
      'mac=00:16:3e:25:9e:a3,bridge=bridge6',
      'mac=00:16:3e:25:9e:a4,bridge=bridge7',
      'mac=00:16:3e:25:9e:a5,bridge=bridge9',
      'mac=00:16:3e:25:9e:a6,bridge=bridge12', ]
nographic=1

------- Comment #1 From sebastia@l00-bugdead-prods.de 2008-05-29 00:40 [reply] -------

It happened again, with a slightly different backtrace, but the beginning is
the same:

Oops: 0002 [#1] SMP
last sysfs file: /devices/system/cpu/cpu3/online
Modules linked in: appletalk ax25 ipx p8023 nfs lockd nfs_acl sunrpc iptable_filter ip_tables ip6_tables x_tables binfmt_misc 8250 serial_core loop dm_mod ext3 jbd mbcache xenblk xennet
CPU:    0
EIP:    0061:[<c014a281>]    Tainted: G      N VLI
EFLAGS: 00010046   (2.6.22.17-0.1-xenpae #1)
EIP is at free_pages_bulk+0x11b/0x1a9
eax: c0349604   ebx: 00000000   ecx: 00100100   edx: c0349600
esi: c0349604   edi: 000001af   ebp: c03495ec   esp: de5cbd8c
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0069
Process nxagent (pid: 16299, ti=de5ca000 task=e9321570 task.ti=de5ca000)
Stack: 00000001 c034860c 0000000d c0348580 00000000 00000001 c14c8580 c0348600
       c0348580 00000000 c014a5f2 00000000 007c7740 00000000 de5cbdf0 de5cbdec
       de5cbe74 c014a65a c14c8580 c160f901 00000000 c014cc01 00000001 00000001
Call Trace:
 [<c014a5f2>] free_hot_cold_page+0x13e/0x18e
 [<c014a65a>] __pagevec_free+0x18/0x22
 [<c014cc01>] release_pages+0x1a2/0x1aa
 [<c01489b8>] find_get_pages+0x28/0x77
 [<c014d0db>] __pagevec_release+0x15/0x1d
 [<c014d765>] truncate_inode_pages_range+0x250/0x25d
 [<c014d789>] truncate_inode_pages+0x17/0x1a
 [<c01627e3>] shmem_delete_inode+0x33/0xc3
 [<c01627b0>] shmem_delete_inode+0x0/0xc3
 [<c01777b6>] generic_delete_inode+0xa1/0x107
 [<c0176f54>] iput+0x60/0x62
 [<c0175236>] d_kill+0x2a/0x43
 [<c0175df2>] dput+0xe1/0xe8
 [<c0167a7b>] __fput+0x138/0x159
 [<c0157b95>] remove_vma+0x2a/0x3b
 [<c0158467>] do_munmap+0x19b/0x1b4
 [<c01acda2>] sys_shmdt+0x83/0x10a
 [<c0108d68>] sys_ipc+0x181/0x1bb
 [<c0167722>] sys_read+0x41/0x67
 [<c01048ee>] syscall_call+0x7/0xb
 =======================
Code: 00 74 62 0f 0b eb fe 83 fe 09 76 a3 89 75 0c 6b de 0c 0f ba 6d 00 13 8b 4c 24 0c 8d 45 18 8d 94 19 80 10 00 00 8b 4a 04 8d 72 04 <89> 41 04 89 4d 18 89 70 04 89 42 04 8b 44 24 0c ff 84 18 8c 10
EIP: [<c014a281>] free_pages_bulk+0x11b/0x1a9 SS:ESP 0069:de5cbd8c

------- Comment #2 From sebastia@l00-bugdead-prods.de 2008-05-30 00:38 [reply] -------

As the server has 4 CPU cores, we thought that giving the domU 3 cores and
leaving the last one to the dom0 could fix the problem, but that was not the
case. It happened again, now with only 3 cores configured for the domU:

Oops: 0011 [#1] SMP
last sysfs file: /devices/system/cpu/cpu2/online
Modules linked in: appletalk ax25 ipx p8023 nfs lockd nfs_acl sunrpc iptable_filter ip_tables ip6_tables x_tables binfmt_misc 8250 serial_core loop dm_mod ext3 jbd mbcache xenblk xennet
CPU:    1
EIP:    0061:[<c14feca0>]    Tainted: G      N VLI
EFLAGS: 00010206   (2.6.22.17-0.1-xenpae #1)
EIP is at 0xc14feca0
eax: c133c8a0   ebx: c133c8a0   ecx: c100a21c   edx: 00000000
esi: cb405ee8   edi: 00040000   ebp: 00000000   esp: cb405ea0
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0069
Process kstartupconfig (pid: 14712, ti=cb404000 task=d686b030 task.ti=cb404000)
Stack: c014a4cf bfd23fff 001154cf 00000003 cb405ee8 cb405ed8 c100a22c c014a65a
       c133c8a0 c0348580 c0348580 c014cc01 00000004 00000004 00000004 00000000
       c130f880 c12eeac0 c1384480 c133c8a0 c12eea20 c130f820 c12eea60 c1392600
Call Trace:
 [<c014a4cf>] free_hot_cold_page+0x1b/0x18e
 [<c014a65a>] __pagevec_free+0x18/0x22
 [<c014cc01>] release_pages+0x1a2/0x1aa
 [<c015d6a6>] free_pages_and_swap_cache+0x6b/0x7f
 [<c0157c5f>] exit_mmap+0xb9/0xe5
 [<c011b05f>] mmput+0x21/0x78
 [<c011fc95>] do_exit+0x1f3/0x755
 [<c0125a85>] recalc_sigpending+0xb/0x1d
 [<c0125b38>] sigprocmask+0xa1/0xdc
 [<c0120282>] sys_exit_group+0x0/0xd
 [<c01048ee>] syscall_call+0x7/0xb
 =======================
Code: c1 01 00 58 ec 4f c1 18 26 4e c1 00 00 00 00 01 00 00 00 ff ff ff ff 94 7f 3b c0 00 00 00 00 80 d4 4b c1 00 01 10 00 00 02 20 00
<00> 00 00 00 01 00 00 00 ff ff ff ff b4 c8 33 c1 00 00 00 00 a0
EIP: [<c14feca0>] 0xc14feca0 SS:ESP 0069:cb405ea0
Fixing recursive fault but reboot is needed!
------------[ cut here ]------------
kernel BUG at lib/radix-tree.c:447!
invalid opcode: 0000 [#2] SMP
last sysfs file: /devices/system/cpu/cpu2/online
Modules linked in: appletalk ax25 ipx p8023 nfs lockd nfs_acl sunrpc iptable_filter ip_tables ip6_tables x_tables binfmt_misc 8250 serial_core loop dm_mod ext3 jbd mbcache xenblk xennet
CPU:    0
EIP:    0061:[<c01c5be9>]    Tainted: G      N VLI
EFLAGS: 00010096   (2.6.22.17-0.1-xenpae #1)
EIP is at radix_tree_tag_set+0x1c/0x94
eax: e577b510   ebx: 00000002   ecx: 00000001   edx: c1730540
esi: e577b50c   edi: 00000000   ebp: e577b510   esp: c425bd6c
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0069
Process pdflush (pid: 80, ti=c425a000 task=c0b74ab0 task.ti=c425a000)
Stack: c0348580 00000001 c1730540 c168a940 e577b50c 00000000 00000000 c014bed9
       e577b40c c98a1ac0 c168a940 c168a940 ee269f63 c425bea4 e577b334 c168a940
       c168a940 c168a940 c425bea4 ee26acc8 c425be48 c425bf74 e577b464 00000000
Call Trace:
 [<c014bed9>] test_set_page_writeback+0x60/0xab
 [<ee269f63>] nfs_page_async_flush+0x8a/0x101 [nfs]
 [<ee26acc8>] nfs_writepage_locked+0x97/0x170 [nfs]
 [<c014691d>] find_get_pages_tag+0x33/0x8f
 [<ee26adaa>] nfs_writepage+0x9/0x17 [nfs]
 [<c014b3b8>] __writepage+0x8/0x21
 [<c014b718>] write_cache_pages+0x15d/0x275
 [<c014b3b0>] __writepage+0x0/0x21
 [<ee26b566>] nfs_flush_one+0x9e/0xe5 [nfs]
 [<ee26a202>] nfs_writepages+0x0/0x63 [nfs]
 [<c014b84f>] generic_writepages+0x1f/0x26
 [<ee26a24b>] nfs_writepages+0x49/0x63 [nfs]
 [<ee26b4c8>] nfs_flush_one+0x0/0xe5 [nfs]
 [<ee26a202>] nfs_writepages+0x0/0x63 [nfs]
 [<c014b876>] do_writepages+0x20/0x30
 [<c017f25f>] __writeback_single_inode+0x1a1/0x320
 [<c017f6bc>] sync_sb_inodes+0x16a/0x225
 [<c017fab7>] writeback_inodes+0x6a/0xb3
 [<c014be17>] wb_kupdate+0x7c/0xde
 [<c014c21f>] pdflush+0x149/0x1fb
 [<c014bd9b>] wb_kupdate+0x0/0xde
 [<c014c0d6>] pdflush+0x0/0x1fb
 [<c012df3a>] kthread+0x38/0x5e
 [<c012df02>] kthread+0x0/0x5e
 [<c0104c07>] kernel_thread_helper+0x7/0x10
 =======================
Code: ff ff ff eb 02 31 c0 83 c4 0c 5b 5e 5f 5d c3 55 89 c5 57 56 53 83 ec 0c 89 54 24 08 89 4c 24 04 8b 18 3b 14 9d 28 89 37 c0 76 04 <0f> 0b eb fe 8b 78 08 6b c3 06 8d 70 fa 8b 44 24 04 8d 04 c5 10
EIP: [<c01c5be9>] radix_tree_tag_set+0x1c/0x94 SS:ESP 0069:c425bd6c

------- Comment #3 From Ian Pratt 2008-05-30 02:03 [reply] -------

You're probably best off reporting this on the Novell bugzilla as I'm not aware
of this being reported against vanilla kernels.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
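
Comment #1 notes that the backtraces differ but begin the same way. The following is a minimal Python sketch of that comparison, not part of the original report. The helper names `frames` and `common_prefix` are hypothetical, and only the first five frames of the call traces from comments #1 and #2 are copied in.

```python
import re

# Matches one call-trace frame such as
# "[<c014a5f2>] free_hot_cold_page+0x13e/0x18e" and captures the symbol name.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+([A-Za-z_]\w*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(trace):
    """Return the symbol names of a call trace, in the order they appear."""
    return FRAME_RE.findall(trace)


def common_prefix(a, b):
    """Return the leading frames that two traces have in common."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared


# First five frames from comment #1 (truncated for illustration).
TRACE_COMMENT_1 = """
[<c014a5f2>] free_hot_cold_page+0x13e/0x18e
[<c014a65a>] __pagevec_free+0x18/0x22
[<c014cc01>] release_pages+0x1a2/0x1aa
[<c01489b8>] find_get_pages+0x28/0x77
[<c014d0db>] __pagevec_release+0x15/0x1d
"""

# First five frames from comment #2 (truncated for illustration).
TRACE_COMMENT_2 = """
[<c014a4cf>] free_hot_cold_page+0x1b/0x18e
[<c014a65a>] __pagevec_free+0x18/0x22
[<c014cc01>] release_pages+0x1a2/0x1aa
[<c015d6a6>] free_pages_and_swap_cache+0x6b/0x7f
[<c0157c5f>] exit_mmap+0xb9/0xe5
"""

shared = common_prefix(frames(TRACE_COMMENT_1), frames(TRACE_COMMENT_2))
print(shared)
```

Both oopses enter the page allocator's free path through the same three frames before diverging, which is all this comparison shows; it does not identify a root cause.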