[Bug 358531] New: Random DomU locks
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c1

Summary: Random DomU locks
Product: openSUSE 10.3
Version: Final
Platform: x86-64
OS/Version: openSUSE 10.3
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Xen
AssignedTo: cgriffin@novell.com
ReportedBy: ralf@bj-ig.de
QAContact: qa@suse.de
Found By: Customer

I get the following type of kernel oops randomly - about once a week.

The host system is a Core2 Duo, 4GB Ram, Opensuse 10.3, x86_64. The affected DomU is Opensuse 10.3, xenpae. Block devices are given as "phy:" devices to the DomU. All of them live on software RAID1/LVM2 controlled by Dom0. The used Filesystem is XFS.

The affected DomU is a mail server (postfix/amavis/antivir/clamav/cyrus). Only this single DomU seems to have this problem. There are several other DomU's on this host which do not show such symptoms.

smtp login:
Bad page state in process 'find'
page:c115b9c0 flags:0x0008000c mapping:00000000 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Bad page state in process 'sa-learn'
page:c115b9c0 flags:0x00008068 mapping:00000000 mapcount:0 count:1
Trying to fix it up, but a reboot is needed
Backtrace:
Bad page state in process 'sa-learn'
page:c11cd340 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1
Trying to fix it up, but a reboot is needed
Backtrace:
Bad page state in process 'sa-learn'
page:c11cd360 flags:0x00000028 mapping:c09b1698 mapcount:0 count:1
Trying to fix it up, but a reboot is needed
Backtrace:
Bad page state in process 'sa-learn'
page:c13128e0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1
Trying to fix it up, but a reboot is needed
Backtrace:
Bad page state in process 'sa-learn'
page:c1312900 flags:0x00000060 mapping:c03a1da5 mapcount:1 count:1
Trying to fix it up, but a reboot is needed
Backtrace:
Bad page state in process 'sa-learn'
page:c1309060 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1
Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1309080 flags:0x00008068 mapping:c79a1d45 mapcount:1 count:2 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c126e120 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c126e140 flags:0x0001006c mapping:c72f7304 mapcount:2 count:3 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c113e080 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c113e0a0 flags:0x00000080 mapping:00000000 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1276740 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1276760 flags:0x0000802c mapping:c03a1da5 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c126c040 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c126c060 flags:0x00000028 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c10b39e0 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c102f5c0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c102f5e0 flags:0x00008068 mapping:c03a1da5 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' 
page:c124b520 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c124b540 flags:0x00000060 mapping:c79a1b4d mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c114a060 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c114a080 flags:0x00008068 mapping:c03a1da5 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c103c880 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c103c8a0 flags:0x00000060 mapping:c79a1fc1 mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c13180e0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'syslog-ng' page:c1318100 flags:0x00000060 mapping:c03a1da5 mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c10da9e0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c10daa00 flags:0x0001006c mapping:c6e33d84 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c122d6c0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c122d6e0 flags:0x0001006c mapping:caa6ba44 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1160040 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, 
but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1160060 flags:0x00000060 mapping:c79a1fc1 mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1052ea0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1052ec0 flags:0x0000002c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c105dea0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c105dec0 flags:0x00000080 mapping:00000000 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c10bca40 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c10bca60 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c10b7a80 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c10b7aa0 flags:0x00000060 mapping:c03a1da5 mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1019aa0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1019ac0 flags:0x00000060 mapping:ce519d2d mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c12527a0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1015e00 
flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1015e20 flags:0x0001006c mapping:d8dba784 mapcount:1 count:2 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1232000 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1232020 flags:0x00000060 mapping:c03a1da5 mapcount:3 count:3 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c11501c0 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c11501e0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c12b6b60 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1045c40 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1045c60 flags:0x00000060 mapping:ca5a3c61 mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c11ca940 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c11ca960 flags:0x00000000 mapping:00000000 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1292660 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1292680 flags:0x00000080 mapping:00000000 mapcount:0 count:1 Trying to fix it up, but a reboot is 
needed Backtrace: Bad page state in process 'sa-learn' page:c127cc00 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c127cc20 flags:0x0001006c mapping:caa6ba44 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1250d80 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1250da0 flags:0x00010068 mapping:caa6b8c4 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c131a1c0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c131a1e0 flags:0x00000080 mapping:00000000 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c12bbbe0 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c106eb60 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c106eb80 flags:0x00000080 mapping:00000000 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1108020 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1108040 flags:0x00000060 mapping:c408f7d5 mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1227b00 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1227b20 flags:0x00000060 
mapping:c03a15f5 mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c10e4400 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c10e4420 flags:0x00000080 mapping:00000000 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1093480 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c10934a0 flags:0x00010068 mapping:caa6b8c4 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1231360 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c1231380 flags:0x00000060 mapping:c6f7a469 mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c12a4ca0 flags:0x0000006c mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c12a4cc0 flags:0x0001006c mapping:cfa4f8c4 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c11f63e0 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c11f6400 flags:0x00000060 mapping:ce5198a1 mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c125f300 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1 Trying to fix it up, but a reboot is needed Backtrace: Bad page state in process 'sa-learn' page:c125f320 flags:0x00000060 mapping:c03a1da5 mapcount:1 count:1 Trying to fix it up, but a reboot is needed Backtrace: 
Bad page state in process 'sa-learn'
page:c1305780 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1
Trying to fix it up, but a reboot is needed
Backtrace:
Bad page state in process 'sa-learn'
page:c13057a0 flags:0x00000080 mapping:00000000 mapcount:0 count:1
Trying to fix it up, but a reboot is needed
Backtrace:
------------[ cut here ]------------
kernel BUG at mm/page_alloc.c:367!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /devices/xen/vif-1/net/eth1/type
Modules linked in: iptable_filter ip_tables ip6_tables x_tables 8250 serial_core apparmor nls_utf8 loop dm_mod xfs xenblk xennet
CPU: 0
EIP: 0061:[<c014a259>] Tainted: G B VLI
EFLAGS: 00010002 (2.6.22.16-0.1-xenpae #1)
EIP is at free_pages_bulk+0xf3/0x1a9
eax: c12c3a20 ebx: c12c3a20 ecx: 00000000 edx: 00000000
esi: 00000000 edi: 000001d0 ebp: c12c3a00 esp: cdec9d54
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0069
Process imapd (pid: 26762, ti=cdec8000 task=ca59aab0 task.ti=cdec8000)
Stack: 00000001 c034860c 0000001c c0348580 00000000 00000001 c127f400 c0348600
       c0348580 00000000 c014a5f2 00000000 001e8000 74d25067 c1335084 00000001
       801e8000 c01528f8 74d25067 80000000 c0b810c0 c79b1fcc c03a8084 c1335094
Call Trace:
 [<c014a5f2>] free_hot_cold_page+0x13e/0x18e
 [<c01528f8>] unmap_vmas+0x66d/0x8da
 [<c0157c1a>] exit_mmap+0x74/0xe5
 [<c011b05f>] mmput+0x21/0x78
 [<c011fc95>] do_exit+0x1f3/0x755
 [<c0125d88>] __dequeue_signal+0xd7/0x11c
 [<c0120282>] sys_exit_group+0x0/0xd
 [<c01279d0>] get_signal_to_deliver+0x3d0/0x410
 [<c0103f1b>] do_notify_resume+0x84/0x641
 [<c01c80f7>] copy_to_user+0x25/0x3a
 [<c01284f3>] atomic_notifier_call_chain+0x17/0x1a
 [<c0113630>] do_page_fault+0xb55/0xb5d
 [<c01699a1>] sys_stat64+0x1e/0x23
 [<c0112adb>] do_page_fault+0x0/0xb5d
 [<c01049cd>] work_notifysig+0x13/0x1a
 =======================
Code: 8b 03 a9 00 00 08 00 74 2b 8b 53 0c 39 f2 89 54 24 10 75 20 81 e1 00 40 02 00 89 d0 81 f9 00 40 02 00 0f 45 c3 83 78 04 00 74 62 <0f> 0b eb fe 83 fe 09 76 a3 89 75 0c 6b de 0c 0f ba 6d 00 13 8b
EIP: [<c014a259>] free_pages_bulk+0xf3/0x1a9 SS:ESP 0069:cdec9d54
Fixing recursive fault but reboot is needed!
BUG: unable to handle kernel paging request at virtual address 14000060
 printing eip:
c0188f86
028f8000 -> *pde = 00000000:7d01d027
0bca8000 -> *pme = 00000000:00000000
Oops: 0000 [#2]
SMP
last sysfs file: /devices/xen/vif-1/net/eth1/type
Modules linked in: iptable_filter ip_tables ip6_tables x_tables 8250 serial_core apparmor nls_utf8 loop dm_mod xfs xenblk xennet
CPU: 0
EIP: 0061:[<c0188f86>] Tainted: G B VLI
EFLAGS: 00210286 (2.6.22.16-0.1-xenpae #1)
EIP is at do_mpage_readpage+0x4f/0x60d
eax: 14000000 ebx: da12efc7 ecx: cec95d8c edx: c127cc20
esi: c127cc20 edi: 00000000 ebp: 000002a4 esp: cec95c74
ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0069
Process antivir (pid: 2951, ti=cec94000 task=c9148ab0 task.ti=cec94000)
Stack: 6b1ed067 00000001 00000000 c5009818 c89155c0 d70bbb24 d70bbb00 c0261f45
       da12efc7 cec95d8c cec95d4c cec95d84 00000001 c127cc20 00000000 14000000
       0000001f caa6ba44 c014c410 00000001 caa6ba48 c1b6fc00 caa6ba54 00000445
Call Trace:
 [<c0261f45>] sock_def_readable+0x39/0x63
 [<da12efc7>] xfs_get_blocks+0x0/0x2d [xfs]
 [<c014c410>] __do_page_cache_readahead+0x93/0x231
 [<c0260955>] sk_reset_timer+0xc/0x16
 [<c0294bcc>] tcp_rcv_established+0x76d/0x7e5
 [<da10e4f3>] xfs_iext_bno_to_ext+0xdb/0x195 [xfs]
 [<c018958f>] mpage_readpage+0x4b/0x5e
 [<da12efc7>] xfs_get_blocks+0x0/0x2d [xfs]
 [<c01213cb>] current_fs_time+0x41/0x46
 [<c0147317>] do_generic_mapping_read+0x232/0x43a
 [<c0148efd>] generic_file_aio_read+0x145/0x16c
 [<c0146819>] file_read_actor+0x0/0xd1
 [<da1357f1>] xfs_read+0x2c5/0x327 [xfs]
 [<da132446>] xfs_file_aio_read+0x66/0x70 [xfs]
 [<c0166a93>] do_sync_read+0x0/0x10a
 [<c0166b5a>] do_sync_read+0xc7/0x10a
 [<c012e001>] autoremove_wake_function+0x0/0x33
 [<c02bccf5>] __sched_text_start+0x6f5/0x7d5
 [<c0142bb4>] handle_IRQ_event+0x38/0x70
 [<c0166a93>] do_sync_read+0x0/0x10a
 [<c016734a>] vfs_read+0xa6/0x128
 [<c0167722>] sys_read+0x41/0x67
 [<c01048ee>] syscall_call+0x7/0xb
 =======================
Code: 8b 84 24 c8 00 00 00 8b 8c 24 d0 00 00 00 89 54 24 28 8b 54 24 34 89 5c 24 20 89 44 24 2c 89 4c 24 24 8b 42 10 8b 00 89 44 24 3c <8b> 48 60 89 4c 24 40 8b 02 f6 c4 08 0f 85 67 05 00 00 8b 72 14
EIP: [<c0188f86>] do_mpage_readpage+0x4f/0x60d SS:ESP 0069:cec95c74

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=358531
Charles Arnold
https://bugzilla.novell.com/show_bug.cgi?id=358531
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c2
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c3
--- Comment #3 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c4
--- Comment #4 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c5
Ralf Müller
The log you provided is missing a lot of information - if the information managed to get to /var/log/messages, then we'll need the respective fragment of that file; otherwise you'll need to increase the log level to at least 4 in the affected DomU.
This has been the output of "xm dmesg"
Further it'll be necessary to see any Xen messages, which you'll need to collect over serial or obtain from 'xm dmesg'; please make sure you use loglvl=all guest_loglvl=all on the Xen command line.
Where exactly do I have to enter "loglvl ...": /boot/grub/menu.lst /etc/xen/vm/smtp ... Do I get flooded with log messages when I do this and is it possible to limit this setting to the affected domU?
Please also provide basic configuration information about Dom0 (hwinfo, list of loaded modules will probably suffice for a first step).
done.
Finally it'd be helpful to know common as well as distinguishing factors between the affected and non-affected DomU-s, plus what exactly the DomU was doing immediately before the first (few) of those 'Bad page state ...' messages appeared (if that is known or can be determined).
I notice these locks when our smtp/imap server stops responding. So actually I don't know what the DomU was doing in the time before the lock, but: this DomU is dedicated to serving smtp and imap. It does virus scanning and spam detection. It is doing nothing else. I have the feeling these lockups happen on heavy random disk access - but I can't prove that. The "bad page state" messages mostly happen for "sa-learn" processes - this would be the Bayes spam learning of SpamAssassin.
(Please attach larger pieces of information rather than providing them inline.)
Sorry for the inconvenience. More info on request. Ralf
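The impression that most "Bad page state" messages come from "sa-learn" can be checked by tallying the process names in a saved console log. The following is only a minimal sketch: the function name is made up for illustration, and the sample text is a short excerpt of the console output from the initial report.

```python
import re
from collections import Counter

# Matches the process name in lines such as:
#   Bad page state in process 'sa-learn'
BAD_PAGE_RE = re.compile(r"Bad page state in process '([^']+)'")


def count_bad_page_processes(log_text):
    """Return a Counter mapping process name -> number of 'Bad page state' reports."""
    return Counter(BAD_PAGE_RE.findall(log_text))


# Short excerpt of the console output from the initial report.
sample = """\
Bad page state in process 'find'
page:c115b9c0 flags:0x0008000c mapping:00000000 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Bad page state in process 'sa-learn'
page:c115b9c0 flags:0x00008068 mapping:00000000 mapcount:0 count:1
Trying to fix it up, but a reboot is needed
Bad page state in process 'sa-learn'
page:c11cd340 flags:0x00000068 mapping:c09b1698 mapcount:0 count:1
Trying to fix it up, but a reboot is needed
"""

if __name__ == "__main__":
    # Print the processes, most frequently reported first.
    for process, hits in count_bad_page_processes(sample).most_common():
        print(f"{process}: {hits}")
```

To use it on a real capture, read the saved console log into a string and pass it to count_bad_page_processes(); the pattern does not depend on line breaks, so it also works on a log whose lines have been run together.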
https://bugzilla.novell.com/show_bug.cgi?id=358531
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c6
Jan Beulich
This has been the output of "xm dmesg".
Okay, then the loglvl would need to be increased. But this seems odd to me - I don't think DomU output goes to the Xen log (as a guest could then maliciously flush any important messages in there).
Where exactly do I have to enter "loglvl ...": /boot/grub/menu.lst
If the DomU output really ends up in the hypervisor log, then it'll be necessary to collect the output through serial (the log buffer itself is not big enough to retain much data).
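As a rough illustration of where the options go: in a GRUB legacy /boot/grub/menu.lst, the Xen hypervisor is loaded by the "kernel" line of the Xen boot entry, while the Dom0 kernel and initrd are "module" lines, so loglvl=all guest_loglvl=all belong on the "kernel" line. The sketch below appends them to such a line. The sample boot entry, its file names and the helper name are assumptions for illustration, not taken from the reporter's system.

```python
import posixpath

# Hypervisor options requested in the thread.
XEN_LOG_OPTIONS = ("loglvl=all", "guest_loglvl=all")


def add_xen_log_options(menu_lst_text):
    """Append the Xen log options to every 'kernel' line that loads the hypervisor.

    Assumption: the hypervisor image's file name starts with 'xen'
    (for example xen.gz). Lines that already carry an option are not
    given a second copy of it.
    """
    out = []
    for line in menu_lst_text.splitlines():
        words = line.split()
        is_xen_line = (
            len(words) >= 2
            and words[0] == "kernel"
            and posixpath.basename(words[1]).startswith("xen")
        )
        if is_xen_line:
            missing = [opt for opt in XEN_LOG_OPTIONS if opt not in words[2:]]
            if missing:
                line = line.rstrip() + " " + " ".join(missing)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical boot entry; paths and the root device are placeholders.
sample_menu = """\
title Xen
    root (hd0,0)
    kernel /boot/xen.gz
    module /boot/vmlinuz-xen root=/dev/sda1
    module /boot/initrd-xen
"""

if __name__ == "__main__":
    print(add_xen_log_options(sample_menu), end="")
```

The new options take effect only after the host is rebooted into the modified entry, and editing a copy of menu.lst first is the safer route.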
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c7
--- Comment #7 from Ralf Müller
This has been the output of "xm dmesg".
Okay, then the loglvl would need to be increased. But this seems odd to me - I don't think DomU output goes to the Xen log (as a guest could then maliciously flush any important messages in there).
Ahem, sorry - it was not xm dmesg output; this was xm console output. I got another lock today - the console/dmesg log will be attached.
Where exactly do I have to enter "loglvl ...": /boot/grub/menu.lst
If the DomU output really ends up in the hypervisor log, then it'll be necessary to collect the output through serial (the log buffer itself is not big enough to retain much data).
Mh ... serial console will be difficult. The only serial port is occupied by a modem which usually is in use ... will see what I can do ...
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c8
--- Comment #8 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c9
--- Comment #9 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c10
--- Comment #10 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c11
--- Comment #11 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c12
--- Comment #12 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c13
--- Comment #13 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c14
--- Comment #14 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c15
--- Comment #15 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=358531
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c16
--- Comment #16 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c17
--- Comment #17 from Ralf Müller
This information (#14) is again truncated at the beginning (#8 and #13 are likewise too little information). We really need to see the beginning of the problems, i.e. we need the full DomU messages from boot (include /var/log/boot.msg) till the crash.
I will attach a full /var/log/messages from boot and the /var/log/boot.msg of the DomU after the next lock. As for using a different filesystem: unless this is absolutely necessary, I would rather avoid such modifications.
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c18
Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c19
--- Comment #19 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c20
--- Comment #20 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c21
Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c22
Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c23
--- Comment #23 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c24
--- Comment #24 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c25
--- Comment #25 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c26
--- Comment #26 from Ralf Müller
https://bugzilla.novell.com/show_bug.cgi?id=358531
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c27
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c28
--- Comment #28 from Ralf Müller
Hmm, the 'Bad page state ...' messages seem to no longer occur, which is odd given the huge amount of them you saw earlier.
The last two crashes look almost identical, so posting more of these won't be needed (if one with another pattern happens, it should of course be added here).
OK - I will do so. Actually I already did :)
I noticed you run the VM with just 400 MB - did (or can) you try with more memory, and if so, did (does) it make a difference?
After the last lock (not reported because it showed the same pattern) I tried to move the VM into another memory region by creating a dummy VM that I started (and whose memory I trashed) before the "smtp" one. I don't know how memory management in Xen is actually done, but I assumed VMs get a fixed part of physical memory assigned at startup or at least at first use. I tested the memory when I set up this system without finding problems - but who knows. If this attempt doesn't help, I will add memory to the VM and see what happens. Ralf
https://bugzilla.novell.com/show_bug.cgi?id=358531
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c29
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=358531
User ralf@bj-ig.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c30
--- Comment #30 from Ralf Müller
I think I identified a problem which would happen on PAE only and only when there is some memory pressure (which appears to be the case here).
That sounds good. So you are saying that, as an interim solution, I should really add some memory - I will. Thanks
https://bugzilla.novell.com/show_bug.cgi?id=358531
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c31
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=358531
User cthiel@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=358531#c32
Christoph Thiel