On 12/14/2010 10:30 AM, Jeff Mahoney wrote:
On 12/14/2010 09:51 AM, Jeff Mahoney wrote:
On 12/14/2010 07:11 AM, Kay Sievers wrote:
On Tue, 2010-12-14 at 11:59 +0100, Peter Czanik wrote:
On 12/14/2010 11:42 AM, Kay Sievers wrote:
Without systemd the same system easily survives stress testing (cd /usr/src/linux && make -j 100 :-) ) without locking up.
Yeah, isolated computational load is a very different situation from booting up -- where we have heavy parallel kernel module load, device initialization, and service startup going on.
If possible, try removing: quiet and adding: systemd.log_level=debug systemd.log_target=kmsg on the kernel command line, and see whether that reveals something on the console.
It might slow down the bootup enough that it works, or it might show where it hangs.
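A minimal sketch of that command-line change, in Python, in case it helps to script it. The function name debug_cmdline and the sample command line are made up for illustration; only the quiet option and the two systemd.* options come from the suggestion above.

```python
def debug_cmdline(cmdline):
    """Return cmdline without 'quiet' and with systemd debug logging sent to kmsg."""
    extra = ["systemd.log_level=debug", "systemd.log_target=kmsg"]
    # Drop the bare "quiet" option so kernel messages reach the console.
    args = [a for a in cmdline.split() if a != "quiet"]
    for opt in extra:
        key = opt.split("=", 1)[0]
        # Remove any earlier setting of the same key, then append ours.
        args = [a for a in args if a.split("=", 1)[0] != key]
        args.append(opt)
    return " ".join(args)


if __name__ == "__main__":
    # Hypothetical existing command line, not taken from the affected machine.
    print(debug_cmdline("root=/dev/sda2 quiet splash=silent"))
```

The result still has to be entered by hand at the boot loader prompt or in its configuration; the sketch only builds the string.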
Hehe. Once I use the above settings, systemd seems to boot perfectly, at least for the last two boots. Without them, the machine hangs, with or without a kernel panic message on screen, either during boot or right after the login: prompt is printed.
Yeah, a few people have seen this. It's likely a bug in the kernel in combination with some specific hardware.
It seems the massively parallel work uncovers some races here that we didn't trigger with the old bootup logic.
Is there any output on the console when the kernel panic happens? Can you take a picture with a camera of the screen? Would be good to find out which kernel module makes the machine crash.
It might also be worth trying to add: systemd.unit=multi-user.target which will be "runlevel 3", and checking whether that already crashes.
I've seen a panic a bunch of times during startup and suspected it was related to the per-tty auto task groups patch. Mike removed that patch and replaced it with the per-session task groups patch which also fixes several bugs in the original patch. All of my crashes were related to cgroups or scheduling and they seem to have been fixed with the latest -rc5-based factory kernel.
Seems I spoke too soon. Here are the same Oopses I was seeing before.
... and it looks like Mike may be off the hook, but something is definitely wrong. I tested without that patch and ran into more crashes:

[ 113.620493] BUG: unable to handle kernel NULL pointer dereference at 00000000000001f8
[ 113.621026] IP: [<ffffffff8125d258>] rb_erase+0x168/0x300
[ 113.621026] PGD 37aea067 PUD 37e03067 PMD 0
[ 113.621026] Oops: 0000 [#1] PREEMPT SMP
[ 113.621026] last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/class
[ 113.621026] CPU 0
[ 113.621026] Modules linked in: edd ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit af_packet ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables mperf 8139too ppdev sr_mod sg cdrom parport_pc 8139cp i2c_piix4 floppy button parport pcspkr autofs4 ext4 jbd2 crc16 dm_snapshot dm_mod fan processor pata_acpi thermal thermal_sys
[ 113.621026]
[ 113.621026] Pid: 3256, comm: systemd-cgroups Not tainted 2.6.37-rc5-desktop #14 /Bochs
[ 113.621026] RIP: 0010:[<ffffffff8125d258>] [<ffffffff8125d258>] rb_erase+0x168/0x300
[ 113.621026] RSP: 0018:ffff880037c2f950 EFLAGS: 00010002
[ 113.621026] RAX: 00000000000001f8 RBX: ffff88003789ba10 RCX: 0000000000000001
[ 113.621026] RDX: ffff880037fcba10 RSI: ffff88003c2ee9e8 RDI: ffff88003c46dc10
[ 113.621026] RBP: ffff88003c2ee9e8 R08: ffff88003c46dc10 R09: 0000000000000000
[ 113.621026] R10: 00007fff8e1c3da0 R11: 0000000000000001 R12: ffff88003c2ee9c0
[ 113.621026] R13: ffff88003c46d610 R14: 0000000000000000 R15: ffff88003fc12640
[ 113.621026] FS: 00007fec0656c7a0(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 113.621026] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 113.621026] CR2: 00000000000001f8 CR3: 0000000037eda000 CR4: 00000000000006f0
[ 113.621026] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 113.621026] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 113.621026] Process systemd-cgroups (pid: 3256, threadinfo ffff880037c2e000, task ffff880037f8e3c0)
[ 113.621026] Stack:
[ 113.621026] ffff88003c46d600 ffff880037c2f998 ffffffff8104d8b3 ffff88003fc12640
[ 113.621026] ffff88003fc12640 ffff880037f8e7a8 0000000000000000 ffffffff8160b120
[ 113.621026] ffff880037c2fb88 ffff880037c2fa68 ffffffff815111af 0000000081304ef7
[ 113.621026] Call Trace:
[ 113.621026] [<ffffffff8104d8b3>] pick_next_task_fair+0x143/0x180
[ 113.621026] [<ffffffff815111af>] thread_return+0x430/0x6a1
[ 113.621026] [<ffffffff8151277d>] schedule_hrtimeout_range_clock+0x14d/0x170
[ 113.621026] [<ffffffff8115e164>] poll_schedule_timeout+0x44/0x60
[ 113.621026] [<ffffffff8115f5ab>] do_sys_poll+0x34b/0x460
[ 113.621026] [<ffffffff8115f791>] sys_poll+0x71/0x110
[ 113.621026] [<ffffffff81002f4b>] system_call_fastpath+0x16/0x1b
[ 113.621026] [<00007fec05840358>] 0x7fec05840358
[ 113.621026] Code: 84 6e 01 00 00 48 89 50 08 e9 f2 fe ff ff 0f 1f 44 00 00 48 8b 7b 08 48 8b 07 a8 01 0f 84 c9 00 00 00 48 8b 47 10 48 85 c0 74 09 <f6> 00 01 0f 84 5a 01 00 00 48 8b 47 08 48 85 c0 0f 84 75 ff ff
[ 113.621026] RIP [<ffffffff8125d258>] rb_erase+0x168/0x300
[ 113.621026] RSP <ffff880037c2f950>
[ 113.621026] CR2: 00000000000001f8
[ 113.621026] ---[ end trace 1f06837870217be7 ]---
[ 113.621026] note: systemd-cgroups[3256] exited with preempt_count 2

This one looks like memory corruption:

[ 2.711048] BUG: unable to handle kernel paging request at 00000005000504aa
[ 2.712021] IP: [<ffffffff8104cccf>] put_prev_task_fair+0x9f/0x110
[ 2.712021] PGD 378c4067 PUD 0
[ 2.712021] Oops: 0000 [#1] PREEMPT SMP
[ 2.712021] last sysfs file: /sys/module/ipv6/parameters/disable
[ 2.712021] CPU 0
[ 2.712021] Modules linked in: autofs4 ext4 jbd2 crc16 dm_snapshot dm_mod fan processor pata_acpi thermal thermal_sys
[ 2.712021]
[ 2.712021] Pid: 277, comm: gzip Not tainted 2.6.37-rc5-desktop #14 /Bochs
[ 2.712021] RIP: 0010:[<ffffffff8104cccf>] [<ffffffff8104cccf>] put_prev_task_fair+0x9f/0x110
[ 2.712021] RSP: 0000:ffff880037ff9cc8 EFLAGS: 00010006
[ 2.712021] RAX: 000000050005046a RBX: ffff880037c22600 RCX: 00000000001c29ea
[ 2.712021] RDX: ffff880037f66018 RSI: ffff88003c2fd9e8 RDI: 0000000001d26001
[ 2.712021] RBP: ffff880037ff9cd8 R08: ffff880037f66010 R09: 0000000000000000
[ 2.712021] R10: ffff880037fdd5c0 R11: 0000000000000001 R12: ffff88003c2fd9c0
[ 2.712021] R13: 0000000000000000 R14: 00007fa608346000 R15: 0000000000000000
[ 2.712021] FS: 00007fa608aa7700(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 2.712021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.712021] CR2: 00000005000504aa CR3: 0000000037ffd000 CR4: 00000000000006f0
[ 2.712021] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2.712021] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2.712021] Process gzip (pid: 277, threadinfo ffff880037ff8000, task ffff880037c346c0)
[ 2.712021] Stack:
[ 2.712021] ffff88003fc12640 ffff880037c34ab0 ffff880037ff9da8 ffffffff81510b55
[ 2.712021] ffff880037c346c0 ffff880037e01a98 ffff880037e01ae8 ffff880037c346c0
[ 2.712021] ffff880037ff9fd8 ffff880037ff9fd8 ffff880037ff9fd8 ffff880037c34ab0
[ 2.712021] Call Trace:
[ 2.712021] [<ffffffff81510b55>] schedule+0x1b5/0x3df
[ 2.712021] [<ffffffff815115f9>] preempt_schedule_irq+0x39/0x60
[ 2.712021] [<ffffffff81514286>] retint_kernel+0x26/0x30
[ 2.712021] Code: 31 c0 48 89 f2 48 8b 80 50 08 00 00 48 89 43 68 49 8b 7c 24 20 48 29 f9 eb 09 66 90 48 8d 50 10 49 89 c0 48 8b 02 48 85 c0 74 19 <48> 8b 50 40 48 29 fa 48 39 d1 7c e5 48 8d 50 08 45 31 c9 eb e0
[ 2.712021] RIP [<ffffffff8104cccf>] put_prev_task_fair+0x9f/0x110
[ 2.712021] RSP <ffff880037ff9cc8>
[ 2.712021] CR2: 00000005000504aa
[ 2.712021] ---[ end trace 0651b7dd8d3daab4 ]---
[ 2.712021] note: gzip[277] exited with preempt_count 268435458

--
Jeff Mahoney
SUSE Labs
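For comparing several oopses like the two in this message, it can help to pull out just the faulting function and offset from each "IP:" line. The following is a rough Python sketch under that assumption; the regular expression, the name faulting_symbols, and the choice to match only the "IP: [<addr>] symbol+0xoff/0xsize" form are illustrative and not part of any kernel tooling.

```python
import re

# Matches the summary line of an oops, for example:
#   IP: [<ffffffff8125d258>] rb_erase+0x168/0x300
# The \b keeps "RIP: 0010:[<...>]" register lines from matching.
FAULT_RE = re.compile(
    r"\bIP: \[<([0-9a-f]+)>\] ([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def faulting_symbols(log_text):
    """Return (symbol, offset, function_size) for each 'IP:' line in log_text."""
    results = []
    for match in FAULT_RE.finditer(log_text):
        _addr, symbol, offset, size = match.groups()
        results.append((symbol, int(offset, 16), int(size, 16)))
    return results


if __name__ == "__main__":
    # The two "IP:" lines quoted from the oopses in this message.
    sample = (
        "[ 113.621026] IP: [<ffffffff8125d258>] rb_erase+0x168/0x300\n"
        "[ 2.712021] IP: [<ffffffff8104cccf>] put_prev_task_fair+0x9f/0x110\n"
    )
    for symbol, offset, size in faulting_symbols(sample):
        print(f"{symbol} at +{offset:#x} of {size:#x} bytes")
```

Both symbols here (rb_erase and put_prev_task_fair) sit in the scheduler path shown in the call traces, which is consistent with the cgroups/scheduling suspicion earlier in the thread, though the sketch itself only extracts names and offsets.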