[opensuse-factory] systemd
Hello,

What is the current status of systemd? Yesterday I wanted to give it a try because of the syslog-ng integration, but once I installed systemd and systemd-sysvinit and booted with "init=/bin/systemd", I could not log in to my machine any more. I get a "cannot make/remove an entry for the specified session" message on the console. Ssh fails too:

debug1: Next authentication method: keyboard-interactive
Password:
debug1: Authentication succeeded (keyboard-interactive).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: Sending environment.
debug1: Sending env LANG = en_US.utf8
Last login: Tue Dec 14 08:38:27 2010
Have a lot of fun...
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
debug1: channel 0: free: client-session, nchannels 1
Connection to 192.168.2.103 closed.
Transferred: sent 2392, received 1960 bytes, in 0.3 seconds
Bytes per second: sent 9094.5, received 7452.0
debug1: Exit status 254
czanik@czp-bigone:~>

Is it a bug, or did I do something wrong?

Bye,
CzP
--
To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-factory+help@opensuse.org
On Tue, 2010-12-14 at 08:44 +0100, Peter Czanik wrote:
What is the current status of systemd?
It should just work. :)
Yesterday I wanted to give it a try because of the syslog-ng integration, but once I installed systemd and systemd-sysvinit and booted with "init=/bin/systemd", I could not log in to my machine any more. I get a "cannot make/remove an entry for the specified session" message on the console.
Is it a bug, or did I do something wrong?
It's probably the ntp init script bug, mangling the /proc mount: https://bugzilla.novell.com/show_bug.cgi?id=656509
Kay
On 12/14/2010 09:20 AM, Kay Sievers wrote:
Is it a bug, or did I do something wrong?
It's probably the ntp init script bug, mangling the /proc mount: https://bugzilla.novell.com/show_bug.cgi?id=656509
Thanks, removing ntp solved the problem.
Bye,
CzP
On 12/14/2010 09:42 AM, Peter Czanik wrote:
Thanks, removing ntp solved the problem.
Well, one out of 5 boots seems to work. Otherwise it just hangs, or sometimes there is a kernel panic on screen. Once the panic mentioned syslog-ng, another time systemd. I'll check how it goes if I change back to rsyslog...
Bye,
CzP
Hello, On 12/14/2010 09:52 AM, Peter Czanik wrote:
Well, one out of 5 boots seems to work. Otherwise it just hangs, or sometimes there is a kernel panic on screen. Once the panic mentioned syslog-ng, another time systemd. I'll check how it goes if I change back to rsyslog...
No, it's not syslog-ng specific; the machine hangs with rsyslog too.
Bye,
CzP
On Tue, 2010-12-14 at 10:08 +0100, Peter Czanik wrote:
Well, one out of 5 boots seems to work. Otherwise it just hangs, or sometimes there is a kernel panic on screen. Once the panic mentioned syslog-ng, another time systemd. I'll check how it goes if I change back to rsyslog...
No, it's not syslog-ng specific; the machine hangs with rsyslog too.
If possible, check whether removing:
  quiet
and adding:
  systemd.log_level=debug systemd.log_target=kmsg
on the kernel command line reveals something on the console.
Kay
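As a rough illustration of the command-line change suggested above, the following Python sketch rewrites a kernel command line string: it drops `quiet` and appends the two systemd logging options. The function name `add_systemd_debug` and the sample command line (`root=/dev/sda2 splash=silent`) are made up for this example; only `quiet`, `init=/bin/systemd`, and the two `systemd.*` options come from the thread. It edits a string only and does not touch any bootloader configuration.

```python
def add_systemd_debug(cmdline: str) -> str:
    """Return cmdline without 'quiet' and with systemd debug logging enabled."""
    wanted = ["systemd.log_level=debug", "systemd.log_target=kmsg"]
    keys = {option.split("=", 1)[0] for option in wanted}

    kept = []
    for arg in cmdline.split():
        if arg == "quiet":
            continue  # drop 'quiet' so kernel messages reach the console
        if arg.split("=", 1)[0] in keys:
            continue  # drop older values of the options we set below
        kept.append(arg)
    return " ".join(kept + wanted)


# Hypothetical command line; only init=/bin/systemd and quiet are from the thread.
example = "root=/dev/sda2 splash=silent quiet init=/bin/systemd"
print(add_systemd_debug(example))
```

The resulting string would then be entered at the bootloader prompt or in its configuration for the next boot; how that is done depends on the bootloader in use.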
On 12/14/2010 10:18 AM, Kay Sievers wrote:
If possible, check whether removing:
  quiet
and adding:
  systemd.log_level=debug systemd.log_target=kmsg
on the kernel command line reveals something on the console.
I just had a few more kernel panics mentioning different applications, so I suspect it is something kernel-related instead.
Bye,
CzP
On Tue, 2010-12-14 at 10:22 +0100, Peter Czanik wrote:
I just had a few more kernel panics mentioning different applications, so I suspect it is something kernel-related instead.
Yeah, could be. It is not unlikely that the massively parallel startup makes such problems more apparent. A few others have reported similar issues.
Kay
Hello, On 12/14/2010 10:27 AM, Kay Sievers wrote:
I just had a few more kernel panics mentioning different applications, so I suspect it is something kernel-related instead.
Yeah, could be. It is not unlikely that the massively parallel startup makes such problems more apparent. A few others have reported similar issues.
Without systemd the same system easily survives stress testing (cd /usr/src/linux && make -j 100 :-) ) without locking up.
Bye,
CzP
On Tue, 2010-12-14 at 11:36 +0100, Peter Czanik wrote:
Without systemd the same system easily survives stress testing (cd /usr/src/linux && make -j 100 :-) ) without locking up.
Yeah, an isolated computational load is a very different situation from booting up, where we have heavy parallel kernel module loading, device initialization, and service startup going on.

If possible, check whether removing:
  quiet
and adding:
  systemd.log_level=debug systemd.log_target=kmsg
on the kernel command line reveals something on the console.

It might slow down the bootup enough that it works, or it might show where it hangs.
Kay
Hello, On 12/14/2010 11:42 AM, Kay Sievers wrote:
Yeah, an isolated computational load is a very different situation from booting up, where we have heavy parallel kernel module loading, device initialization, and service startup going on.
If possible, check whether removing:
  quiet
and adding:
  systemd.log_level=debug systemd.log_target=kmsg
on the kernel command line reveals something on the console.
It might slow down the bootup enough that it works, or it might show where it hangs.
Hehe. With the above settings, systemd seems to boot perfectly, at least for the last two boots. Without them the machine hangs, with or without a kernel panic message on screen, either during boot or right after the login: prompt is printed.
Bye,
CzP
On Tue, 2010-12-14 at 11:59 +0100, Peter Czanik wrote:
Hehe. With the above settings, systemd seems to boot perfectly, at least for the last two boots. Without them the machine hangs, with or without a kernel panic message on screen, either during boot or right after the login: prompt is printed.
Yeah, a few people have seen this. It's likely a bug in the kernel in combination with some specific hardware.

It seems the massively parallel work uncovers some races here, which we didn't trigger with the old bootup logic.

Is there any output on the console when the kernel panic happens? Can you take a picture of the screen with a camera? It would be good to find out which kernel module makes the machine crash.

It might also be worth adding:
  systemd.unit=multi-user.target
which is the equivalent of "runlevel 3", and checking whether that already crashes.
Kay
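When juggling several boot options like this, it can help to confirm which ones a given command line actually carries. The sketch below is only an illustration: `parse_cmdline` and `describe_boot` are hypothetical helper names, the sample string is invented apart from `init=/bin/systemd` and `systemd.unit=multi-user.target`, and the simple whitespace split ignores quoted values that contain spaces.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {key: value}; bare flags map to None."""
    options = {}
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        options[key] = value if sep else None
    return options


def describe_boot(cmdline: str) -> str:
    """Summarise the requested systemd target and whether 'quiet' is set."""
    options = parse_cmdline(cmdline)
    target = options.get("systemd.unit") or "(default target)"
    console = "quiet" if "quiet" in options else "verbose"
    return f"target={target}, console={console}"


# Hypothetical command line for illustration.
sample = "root=/dev/sda2 init=/bin/systemd systemd.unit=multi-user.target"
print(describe_boot(sample))
```

On a running Linux system the same functions could be fed the contents of /proc/cmdline instead of the sample string.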
On 12/14/2010 07:11 AM, Kay Sievers wrote:
Yeah, a few people have seen this. It's likely a bug in the kernel in combination with some specific hardware.
It seems the massively parallel work uncovers some races here, which we didn't trigger with the old bootup logic.
Is there any output on the console when the kernel panic happens? Can you take a picture of the screen with a camera? It would be good to find out which kernel module makes the machine crash.
It might also be worth adding: systemd.unit=multi-user.target which is the equivalent of "runlevel 3", and checking whether that already crashes.
I've seen a panic a bunch of times during startup and suspected it was related to the per-tty auto task groups patch. Mike removed that patch and replaced it with the per-session task groups patch, which also fixes several bugs in the original patch. All of my crashes were related to cgroups or scheduling, and they seem to have been fixed with the latest -rc5-based factory kernel.

-Jeff

--
Jeff Mahoney
SUSE Labs
On 12/14/2010 09:51 AM, Jeff Mahoney wrote:
I've seen a panic a bunch of times during startup and suspected it was related to the per-tty auto task groups patch. Mike removed that patch and replaced it with the per-session task groups patch, which also fixes several bugs in the original patch. All of my crashes were related to cgroups or scheduling, and they seem to have been fixed with the latest -rc5-based factory kernel.
Seems I spoke too soon. Here are the same Oopses I was seeing before.

The following occurred during boot.

[ 2.212682] BUG: unable to handle kernel paging request at 000000040005047a
[ 2.213066] IP: [<ffffffff8125df63>] rb_next+0x23/0x50
[ 2.213066] PGD 0
[ 2.213066] Oops: 0000 [#1] PREEMPT SMP
[ 2.213066] last sysfs file: /sys/module/ipv6/parameters/disable
[ 2.213066] CPU 0
[ 2.213066] Modules linked in: autofs4 ext4 jbd2 crc16 dm_snapshot dm_mod fan processor pata_acpi thermal thermal_sys
[ 2.213066]
[ 2.213066] Pid: 282, comm: systemd-cgroups Not tainted 2.6.37-rc5-desktop #13 /Bochs
[ 2.213066] RIP: 0010:[<ffffffff8125df63>] [<ffffffff8125df63>] rb_next+0x23/0x50
[ 2.213066] RSP: 0018:ffff88003bc5fda0 EFLAGS: 00010006
[ 2.213066] RAX: 000000040005046a RBX: ffff880037f2ba00 RCX: 0000007000000000
[ 2.213066] RDX: 0000007000000000 RSI: ffff880037f2ba00 RDI: ffff880037f2ba10
[ 2.213066] RBP: ffff88003bc5fdd8 R08: 0000000000000001 R09: 0000000000000000
[ 2.213066] R10: ffff880037949158 R11: 0000000000000001 R12: ffff88003bcb0900
[ 2.213066] R13: ffff880037f2ba10 R14: 0000000000000000 R15: ffff88003fc12640
[ 2.213066] FS: 00007f489101d7a0(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 2.213066] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2.213066] CR2: 000000040005047a CR3: 0000000001a03000 CR4: 00000000000006f0
[ 2.213066] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2.213066] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2.213066] Process systemd-cgroups (pid: 282, threadinfo ffff88003bc5e000, task ffff8800379fc6c0)
[ 2.213066] Stack:
[ 2.213066] ffffffff8104dc78 ffff88003fc12640 ffff88003fc12640 ffff8800379fcaa8
[ 2.213066] 0000000000000000 ffffffff8160b120 ffff8800379fc9b8 ffff88003bc5fea8
[ 2.213066] ffffffff81511c4f ffff880037e98bc0 0000000000000092 0000000300121a65
[ 2.213066] Call Trace:
[ 2.213066] [<ffffffff8104dc78>] pick_next_task_fair+0x178/0x180
[ 2.213066] [<ffffffff81511c4f>] thread_return+0x430/0x6a1
[ 2.213066] [<ffffffff8105af0e>] do_exit+0x5fe/0x8d0
[ 2.213066] [<ffffffff8105b471>] do_group_exit+0x51/0xc0
[ 2.213066] [<ffffffff8105b4f2>] sys_exit_group+0x12/0x20
[ 2.213066] [<ffffffff81002f4b>] system_call_fastpath+0x16/0x1b
[ 2.213066] [<00007f48902c9b28>] 0x7f48902c9b28
[ 2.213066] Code: 85 d2 75 f4 f3 c3 f3 c3 48 8b 17 31 c0 48 89 d1 48 83 e1 fc 48 39 cf 74 37 48 8b 47 08 48 85 c0 75 09 eb 1a 0f 1f 40 00 48 89 d0 <48> 8b 50 10 48 85 d2 75 f4 f3 c3 66 90 48 8b 11 48 89 cf 48 89
[ 2.213066] RIP [<ffffffff8125df63>] rb_next+0x23/0x50
[ 2.213066] RSP <ffff88003bc5fda0>
[ 2.213066] CR2: 000000040005047a
[ 2.213066] ---[ end trace d28fc3a29d424b70 ]---

... and this one occurred during reboot:

[ 73.191301] general protection fault: 0000 [#1] PREEMPT SMP
[ 73.192023] last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/class
[ 73.192023] CPU 0
[ 73.192023] Modules linked in: edd ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit af_packet ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables mperf ppdev parport_pc parport sr_mod cdrom sg 8139too i2c_piix4 floppy pcspkr button 8139cp autofs4 ext4 jbd2 crc16 dm_snapshot dm_mod fan processor pata_acpi thermal thermal_sys
[ 73.192023]
[ 73.192023] Pid: 1, comm: systemd Not tainted 2.6.37-rc5-desktop #13 /Bochs
[ 73.192023] RIP: 0010:[<ffffffff8125dc65>] [<ffffffff8125dc65>] rb_erase+0xd5/0x300
[ 73.192023] RSP: 0018:ffff88003ce63be0 EFLAGS: 00010086
[ 73.192023] RAX: f000ff53f000ff53 RBX: ffff88003c49ba10 RCX: ffff88003c49ba10
[ 73.192023] RDX: ffff88003c49ba10 RSI: ffff88003bcaf928 RDI: ffff880000000000
[ 73.192023] RBP: ffff88003bcaf928 R08: 0000000000000001 R09: ffff880037824400
[ 73.192023] R10: ffff88003c0884e0 R11: 0000000000000001 R12: ffff88003bcaf900
[ 73.192023] R13: ffff88003c49b210 R14: 0000000000000000 R15: ffff88003fc12640
[ 73.192023] FS: 00007f74804757a0(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 73.192023] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 73.192023] CR2: 00000000007c32b0 CR3: 000000003cbe8000 CR4: 00000000000006f0
[ 73.192023] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 73.192023] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 73.192023] Process systemd (pid: 1, threadinfo ffff88003ce62000, task ffff88003ce60040)
[ 73.192023] Stack:
[ 73.192023] ffff88003c49b200 ffff88003ce63c28 ffffffff8104dc43 ffff88003fc12640
[ 73.192023] ffff88003fc12640 ffff88003ce60428 0000000000000000 ffffffff8160b120
[ 73.192023] ffff88003ce60040 ffff88003ce63cf8 ffffffff81511c4f ffff88003ce63d28
[ 73.192023] Call Trace:
[ 73.192023] [<ffffffff8104dc43>] pick_next_task_fair+0x143/0x180
[ 73.192023] [<ffffffff81511c4f>] thread_return+0x430/0x6a1
[ 73.192023] [<ffffffff8151248d>] schedule_timeout+0x28d/0x310
[ 73.192023] [<ffffffff81511300>] wait_for_common+0xc0/0x150
[ 73.192023] [<ffffffff810cb831>] synchronize_rcu+0x41/0x50
[ 73.192023] [<ffffffff810a40d6>] cgroup_diput+0x36/0xf0
[ 73.192023] [<ffffffff81161001>] d_kill+0x41/0x70
[ 73.192023] [<ffffffff81161640>] dput+0x60/0x150
[ 73.192023] [<ffffffff8115a9f0>] do_rmdir+0xa0/0x130
[ 73.192023] [<ffffffff81002f4b>] system_call_fastpath+0x16/0x1b
[ 73.192023] [<00007f747ed05b67>] 0x7f747ed05b67
[ 73.192023] Code: 83 f8 01 74 55 5b 5d c3 48 83 c8 01 48 89 ee 48 89 07 48 83 23 fe 48 89 df e8 68 fd ff ff 48 8b 7b 10 48 8b 47 10 48 85 c0 74 09 <f6> 00 01 0f 84 9b 01 00 00 48 8b 57 08 48 85 d2 74 0c 48 8b 0a
[ 73.192023] RIP [<ffffffff8125dc65>] rb_erase+0xd5/0x300
[ 73.192023] RSP <ffff88003ce63be0>
[ 73.192023] ---[ end trace d05e11dcc0577c32 ]---
[ 73.192023] note: systemd[1] exited with preempt_count 2

-Jeff

--
Jeff Mahoney
SUSE Labs
On 12/14/2010 10:30 AM, Jeff Mahoney wrote:
Seems I spoke too soon. Here are the same Oopses I was seeing before.
... and it looks like Mike may be off the hook, but something is definitely wrong. I tested without that patch and ran into more crashes:

[ 113.620493] BUG: unable to handle kernel NULL pointer dereference at 00000000000001f8
[ 113.621026] IP: [<ffffffff8125d258>] rb_erase+0x168/0x300
[ 113.621026] PGD 37aea067 PUD 37e03067 PMD 0
[ 113.621026] Oops: 0000 [#1] PREEMPT SMP
[ 113.621026] last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/class
[ 113.621026] CPU 0
[ 113.621026] Modules linked in: edd ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit af_packet ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables mperf 8139too ppdev sr_mod sg cdrom parport_pc 8139cp i2c_piix4 floppy button parport pcspkr autofs4 ext4 jbd2 crc16 dm_snapshot dm_mod fan processor pata_acpi thermal thermal_sys
[ 113.621026]
[ 113.621026] Pid: 3256, comm: systemd-cgroups Not tainted 2.6.37-rc5-desktop #14 /Bochs
[ 113.621026] RIP: 0010:[<ffffffff8125d258>] [<ffffffff8125d258>] rb_erase+0x168/0x300
[ 113.621026] RSP: 0018:ffff880037c2f950 EFLAGS: 00010002
[ 113.621026] RAX: 00000000000001f8 RBX: ffff88003789ba10 RCX: 0000000000000001
[ 113.621026] RDX: ffff880037fcba10 RSI: ffff88003c2ee9e8 RDI: ffff88003c46dc10
[ 113.621026] RBP: ffff88003c2ee9e8 R08: ffff88003c46dc10 R09: 0000000000000000
[ 113.621026] R10: 00007fff8e1c3da0 R11: 0000000000000001 R12: ffff88003c2ee9c0
[ 113.621026] R13: ffff88003c46d610 R14: 0000000000000000 R15: ffff88003fc12640
[ 113.621026] FS: 00007fec0656c7a0(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 113.621026] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 113.621026] CR2: 00000000000001f8 CR3: 0000000037eda000 CR4: 00000000000006f0
[ 113.621026] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 113.621026] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 113.621026] Process systemd-cgroups (pid: 3256, threadinfo ffff880037c2e000, task ffff880037f8e3c0)
[ 113.621026] Stack:
[ 113.621026] ffff88003c46d600 ffff880037c2f998 ffffffff8104d8b3 ffff88003fc12640
[ 113.621026] ffff88003fc12640 ffff880037f8e7a8 0000000000000000 ffffffff8160b120
[ 113.621026] ffff880037c2fb88 ffff880037c2fa68 ffffffff815111af 0000000081304ef7
[ 113.621026] Call Trace:
[ 113.621026] [<ffffffff8104d8b3>] pick_next_task_fair+0x143/0x180
[ 113.621026] [<ffffffff815111af>] thread_return+0x430/0x6a1
[ 113.621026] [<ffffffff8151277d>] schedule_hrtimeout_range_clock+0x14d/0x170
[ 113.621026] [<ffffffff8115e164>] poll_schedule_timeout+0x44/0x60
[ 113.621026] [<ffffffff8115f5ab>] do_sys_poll+0x34b/0x460
[ 113.621026] [<ffffffff8115f791>] sys_poll+0x71/0x110
[ 113.621026] [<ffffffff81002f4b>] system_call_fastpath+0x16/0x1b
[ 113.621026] [<00007fec05840358>] 0x7fec05840358
[ 113.621026] Code: 84 6e 01 00 00 48 89 50 08 e9 f2 fe ff ff 0f 1f 44 00 00 48 8b 7b 08 48 8b 07 a8 01 0f 84 c9 00 00 00 48 8b 47 10 48 85 c0 74 09 <f6> 00 01 0f 84 5a 01 00 00 48 8b 47 08 48 85 c0 0f 84 75 ff ff
[ 113.621026] RIP [<ffffffff8125d258>] rb_erase+0x168/0x300
[ 113.621026] RSP <ffff880037c2f950>
[ 113.621026] CR2: 00000000000001f8
[ 113.621026] ---[ end trace 1f06837870217be7 ]---
[ 113.621026] note: systemd-cgroups[3256] exited with preempt_count 2

This one looks like memory corruption:

[ 2.711048] BUG: unable to handle kernel paging request at 00000005000504aa
[ 2.712021] IP: [<ffffffff8104cccf>] put_prev_task_fair+0x9f/0x110
[ 2.712021] PGD 378c4067 PUD 0
[ 2.712021] Oops: 0000 [#1] PREEMPT SMP
[ 2.712021] last sysfs file: /sys/module/ipv6/parameters/disable
[ 2.712021] CPU 0
[ 2.712021] Modules linked in: autofs4 ext4 jbd2 crc16 dm_snapshot dm_mod fan processor pata_acpi thermal thermal_sys
[ 2.712021]
[ 2.712021] Pid: 277, comm: gzip Not tainted 2.6.37-rc5-desktop #14 /Bochs
[ 2.712021] RIP: 0010:[<ffffffff8104cccf>] [<ffffffff8104cccf>] put_prev_task_fair+0x9f/0x110
[ 2.712021] RSP: 0000:ffff880037ff9cc8 EFLAGS: 00010006
[ 2.712021] RAX: 000000050005046a RBX: ffff880037c22600 RCX: 00000000001c29ea
[ 2.712021] RDX: ffff880037f66018 RSI: ffff88003c2fd9e8 RDI: 0000000001d26001
[ 2.712021] RBP: ffff880037ff9cd8 R08: ffff880037f66010 R09: 0000000000000000
[ 2.712021] R10: ffff880037fdd5c0 R11: 0000000000000001 R12: ffff88003c2fd9c0
[ 2.712021] R13: 0000000000000000 R14: 00007fa608346000 R15: 0000000000000000
[ 2.712021] FS: 00007fa608aa7700(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 2.712021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.712021] CR2: 00000005000504aa CR3: 0000000037ffd000 CR4: 00000000000006f0
[ 2.712021] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2.712021] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2.712021] Process gzip (pid: 277, threadinfo ffff880037ff8000, task ffff880037c346c0)
[ 2.712021] Stack:
[ 2.712021] ffff88003fc12640 ffff880037c34ab0 ffff880037ff9da8 ffffffff81510b55
[ 2.712021] ffff880037c346c0 ffff880037e01a98 ffff880037e01ae8 ffff880037c346c0
[ 2.712021] ffff880037ff9fd8 ffff880037ff9fd8 ffff880037ff9fd8 ffff880037c34ab0
[ 2.712021] Call Trace:
[ 2.712021] [<ffffffff81510b55>] schedule+0x1b5/0x3df
[ 2.712021] [<ffffffff815115f9>] preempt_schedule_irq+0x39/0x60
[ 2.712021] [<ffffffff81514286>] retint_kernel+0x26/0x30
[ 2.712021] Code: 31 c0 48 89 f2 48 8b 80 50 08 00 00 48 89 43 68 49 8b 7c 24 20 48 29 f9 eb 09 66 90 48 8d 50 10 49 89 c0 48 8b 02 48 85 c0 74 19 <48> 8b 50 40 48 29 fa 48 39 d1 7c e5 48 8d 50 08 45 31 c9 eb e0
[ 2.712021] RIP [<ffffffff8104cccf>] put_prev_task_fair+0x9f/0x110
[ 2.712021] RSP <ffff880037ff9cc8>
[ 2.712021] CR2: 00000005000504aa
[ 2.712021] ---[ end trace 0651b7dd8d3daab4 ]---
[ 2.712021] note: gzip[277] exited with preempt_count 268435458

--
Jeff Mahoney
SUSE Labs
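Several of the traces in this thread pass through the same scheduler functions, which is easier to see when the symbol names are pulled out of the call traces and compared. The Python sketch below shows one way to do that. It is an illustration, not a diagnosis tool: the helper names `symbols` and `shared_symbols` are invented here, and the two sample strings are short excerpts copied from the first boot Oops and the first Oops of the last message.

```python
import re
from collections import Counter

# Matches frames such as "[<ffffffff8104dc78>] pick_next_task_fair+0x178/0x180"
# and captures the symbol name. Frames without a symbol (user-space addresses
# such as "[<00007f48902c9b28>] 0x7f48902c9b28") do not match and are skipped.
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def symbols(trace: str) -> list:
    """Return the symbol names found in one Oops excerpt, in order."""
    return FRAME.findall(trace)


def shared_symbols(traces: list) -> list:
    """Return the sorted symbol names that appear in every given trace."""
    counts = Counter()
    for trace in traces:
        counts.update(set(symbols(trace)))  # count each symbol once per trace
    return sorted(name for name, seen in counts.items() if seen == len(traces))


# Excerpts copied from two of the call traces in this thread.
boot_oops = (
    "[ 2.213066] [<ffffffff8104dc78>] pick_next_task_fair+0x178/0x180 "
    "[ 2.213066] [<ffffffff81511c4f>] thread_return+0x430/0x6a1 "
    "[ 2.213066] [<ffffffff8105af0e>] do_exit+0x5fe/0x8d0"
)
poll_oops = (
    "[ 113.621026] [<ffffffff8104d8b3>] pick_next_task_fair+0x143/0x180 "
    "[ 113.621026] [<ffffffff815111af>] thread_return+0x430/0x6a1 "
    "[ 113.621026] [<ffffffff8151277d>] schedule_hrtimeout_range_clock+0x14d/0x170"
)

print(shared_symbols([boot_oops, poll_oops]))
```

For these two excerpts the shared frames are `pick_next_task_fair` and `thread_return`. Note that the "IP:" and "RIP" lines of an Oops use the same bracketed format, so feeding a whole Oops to `symbols` would also pick up the faulting function.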
Kay Sievers wrote:
On Tue, 2010-12-14 at 11:59 +0100, Peter Czanik wrote:
Hehe. With the above settings, systemd seems to boot perfectly, at least for the last two boots. Without them the machine hangs, with or without a kernel panic message on screen, either during boot or right after the login: prompt is printed.
Yeah, a few people have seen this. It's likely a bug in the kernel in combination with some specific hardware.
Actually, I've seen such hangs with traditional init, and it looks like I could get rid of them by removing HAL from boot. It sounds very likely to me that some kernel problem exists, as previously mentioned in this thread. (Also, I get a few kernel errors from bttv/v4l and mutex/locking; I guess the BKL removal still has its rough edges.)
Robert Kaiser
On 12/14/2010 03:52 AM, Peter Czanik wrote:
Well, one out of 5 boots seems to work. Otherwise it just hangs, or sometimes there is a kernel panic on screen. Once the panic mentioned syslog-ng, another time systemd. I'll check how it goes if I change back to rsyslog... Bye, CzP
Sometimes I get a hang when i915 tries to load. Booting in failsafe mode and then back into normal mode seems to fix the problem temporarily.
participants (6)
- Dale Ritchey
- Jeff Mahoney
- Jeff Mahoney
- Kay Sievers
- Peter Czanik
- Robert Kaiser