[Bug 558663] New: dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
http://bugzilla.novell.com/show_bug.cgi?id=558663
http://bugzilla.novell.com/show_bug.cgi?id=558663#c0

           Summary: dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
    Classification: openSUSE
           Product: openSUSE 11.2
           Version: Final
          Platform: x86-64
        OS/Version: openSUSE 11.2
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Xen
        AssignedTo: jdouglas@novell.com
        ReportedBy: udo1@udo.hu
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; hu; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)

If you limit the dom0 CPUs with dom0-cpus:
- [xenwatch_cb] runs at 100% CPU and an entry is written to /var/log/messages every 65 sec:
  BUG: soft lockup - CPU#X stuck for 61s!
- xm commands do not work
- xend is dead

*****************************
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4532 root      15  -5     0    0    0 R  100  0.0  11:14.84 xenwatch_cb

# ps aux |grep xen
root        39  0.0  0.0      0     0 ?  S<  13:03   0:00 [xenwatch]
root        40  0.0  0.0      0     0 ?  S<  13:03   0:00 [xenbus]
root      3791  0.0  0.0  11300  1560 ?  S   13:04   0:00 /bin/bash /etc/init.d/xend start
root      4209  0.0  0.1 107504 13864 ?  S   13:04   0:00 /usr/bin/python2.6 /usr/sbin/xend start
root      4446  0.0  0.0   8488  1000 ?  S   13:04   0:00 xenstored --pid-file /var/run/xenstore.pid
root      4448  0.0  0.0      0     0 ?  Z   13:04   0:00 [xenconsoled] <defunct>
root      4450  0.0  0.0      0     0 ?  Zs  13:04   0:00 [xend] <defunct>
root      4451  0.0  0.1 107500 11500 ?  S   13:04   0:00 /usr/bin/python2.6 /usr/sbin/xend start
root      4453  0.0  0.0  22724   560 ?  Sl  13:04   0:00 xenconsoled
root      4455  0.0  0.2 148304 16652 ?  Sl  13:04   0:00 /usr/bin/python2.6 /usr/sbin/xend start
root      4532  100  0.0      0     0 ?  R<  13:04  40:35 [xenwatch_cb]
root      4533  0.0  0.0      0     0 ?  D<  13:04   0:00 [xenwatch_cb]
root      4534  0.0  0.0      0     0 ?  D<  13:04   0:00 [xenwatch_cb]
root      4535  0.0  0.0      0     0 ?  D<  13:04   0:00 [xenwatch_cb]
root      4536  0.0  0.0      0     0 ?  D<  13:04   0:00 [xenwatch_cb]

From /var/log/messages, every 65 sec:

Nov 23 13:55:14 dom0-u2 kernel: [ 3112.781517] BUG: soft lockup - CPU#4 stuck for 61s! [xenwatch_cb:4532]
Nov 23 13:55:14 dom0-u2 kernel: [ 3112.781517] Modules linked in: sha1_generic hmac cryptomgr aead pcompress crypto_blkcipher crypto_hash crypto_algapi drbd netbk blkbk blkback_pagemap blktap xenbus_be binfmt_misc xt_tcpudp ip6t_REJECT nf_conntrack_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_physdev xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 bridge stp llc dummy fuse loop dm_mod mptctl iTCO_wdt iTCO_vendor_support i5k_amb sg i5000_edac ppdev 8250_pnp pcspkr sr_mod edac_core parport_pc shpchp e1000e dcdbas 8250 pci_hotplug tg3 parport serio_raw serial_core button usbhid hid uhci_hcd ehci_hcd xenblk cdrom xennet edd fan ide_pci_generic piix ide_core ata_generic ata_piix mptsas mptscsih mptbase scsi_transport_sas thermal processor thermal_sys hwmon
Nov 23 13:55:14 dom0-u2 kernel: [ 3112.781517] CPU 4:
Nov 23 13:55:14 dom0-u2 kernel: [ 3112.781517] Modules linked in: sha1_generic hmac cryptomgr aead pcompress crypto_blkcipher crypto_hash crypto_algapi drbd netbk blkbk blkback_pagemap blktap xenbus_be binfmt_misc xt_tcpudp ip6t_REJECT nf_conntrack_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_physdev xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 bridge stp llc dummy fuse loop dm_mod mptctl iTCO_wdt iTCO_vendor_support i5k_amb sg i5000_edac ppdev 8250_pnp pcspkr sr_mod edac_core parport_pc shpchp e1000e dcdbas 8250 pci_hotplug tg3 parport serio_raw serial_core button usbhid hid uhci_hcd ehci_hcd xenblk cdrom xennet edd fan ide_pci_generic piix ide_core ata_generic ata_piix mptsas mptscsih mptbase scsi_transport_sas thermal processor thermal_sys hwmon
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] RIP: e030:[<ffffffff8005f07f>] [<ffffffff8005f07f>] lock_timer_base+0x7f/0x90
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] RSP: e02b:ffff8801e8d0bc10 EFLAGS: 00000246
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff80778370
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] RDX: 0000000000000007 RSI: ffff8801e8d0bc50 RDI: ffffc90000075280
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] RBP: ffff8801e8d0bc40 R08: ffffffff807813b0 R09: 0000000000000000
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] R10: ffff8801e8d0bcf0 R11: 00000000e15cfb6d R12: ffffc90000075280
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] R13: ffff8801e8d0bc50 R14: 0000000000000000 R15: ffffffff80778600
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] FS: 00007f53d0abf6f0(0000) GS:ffffc90000040000(0000) knlGS:0000000000000000
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] CR2: 00007f53d0691260 CR3: 0000000000003000 CR4: 0000000000002660
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] Call Trace:
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff8005f0bc>] try_to_del_timer_sync+0x2c/0x90
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff8005f14a>] del_timer_sync+0x2a/0x50
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff8046758f>] mce_cpu_callback+0x122/0x1aa
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff80471de7>] notifier_call_chain+0x57/0xb0
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff80075a1c>] __raw_notifier_call_chain+0x1c/0x40
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff8045b90f>] _cpu_down+0xaf/0x310
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff8045bbf7>] cpu_down+0x87/0xb0
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff8046a42c>] vcpu_hotplug+0xce/0x102
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff8046a4ab>] handle_vcpu_hotplug_event+0x4b/0x61
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff80306c4c>] xenwatch_handle_callback+0x2c/0x80
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff8006fb96>] kthread+0xb6/0xc0
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] [<ffffffff8000d38a>] child_rip+0xa/0x20

Reproducible: Always

Steps to Reproduce:
1. Set dom0-cpus = X, where X > 0 and X < [number of CPUs in your system], in /etc/xen/xend-config.sxp
2. Reboot, or just run rcxend restart

Actual Results:
- [xenwatch_cb] runs at 100% CPU and an entry is written to /var/log/messages every 65 sec:
  BUG: soft lockup - CPU#X stuck for 61s!
- xm commands do not work
- xend is dead

Expected Results:
No error

Dell server, 2 x quad-core = 8 CPUs, 8 GB RAM installed.

--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
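To check the reported cadence on an affected host, the soft lockup lines can be pulled out of the log and the gaps between them computed. The following is only a minimal sketch: it assumes the syslog line format quoted in this report, and the names find_soft_lockups and intervals are hypothetical helpers, not part of any Xen or kernel tooling.

```python
import re

# Pattern for the kernel message as quoted in this report, e.g.
# "Nov 23 13:55:14 dom0-u2 kernel: [ 3112.781517] BUG: soft lockup - CPU#4 stuck for 61s! [xenwatch_cb:4532]"
# Assumption: other syslog setups may format the prefix differently.
LOCKUP_RE = re.compile(
    r"kernel:\s+\[\s*(?P<uptime>\d+\.\d+)\]\s+"
    r"BUG: soft lockup - CPU#(?P<cpu>\d+)\s+"
    r"stuck for (?P<secs>\d+)s!\s+"
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per soft lockup message found in the given log lines."""
    events = []
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            events.append({
                "uptime": float(m.group("uptime")),  # seconds since boot
                "cpu": int(m.group("cpu")),
                "stuck_for": int(m.group("secs")),
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
            })
    return events


def intervals(events):
    """Seconds between consecutive lockup messages, based on kernel uptime."""
    ups = [e["uptime"] for e in events]
    return [b - a for a, b in zip(ups, ups[1:])]


if __name__ == "__main__":
    # The one lockup line quoted verbatim in the report.
    sample = [
        "Nov 23 13:55:14 dom0-u2 kernel: [ 3112.781517] BUG: soft lockup - "
        "CPU#4 stuck for 61s! [xenwatch_cb:4532]",
    ]
    for event in find_soft_lockups(sample):
        print(event)
```

On a real system the input would be the lines of /var/log/messages (for example via open("/var/log/messages")); if the behaviour matches this report, the values returned by intervals should cluster around 65 seconds and comm should be xenwatch_cb.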
http://bugzilla.novell.com/show_bug.cgi?id=558663#c   Charles Arnold
http://bugzilla.novell.com/show_bug.cgi?id=558663#c1  Jan Beulich
http://bugzilla.novell.com/show_bug.cgi?id=558663#c2  --- Comment #2 from Jan Beulich
http://bugzilla.novell.com/show_bug.cgi?id=558663#c3  --- Comment #3 from Udo Attila Fischer
http://bugzilla.novell.com/show_bug.cgi?id=558663#c4  Jan Beulich
http://bugzilla.novell.com/show_bug.cgi?id=558663#c5  Jan Beulich
http://bugzilla.novell.com/show_bug.cgi?id=558663#c6  Jan Beulich