Re: [opensuse-factory] systemd
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/24/2010 12:05 AM, Mike Galbraith wrote:
FYI, this isn't limited to openSUSE factory. Peterz has a repeatable testcase now (kvm image), and is tracing through it. Systemd is triggering a strange use after free cgroups problem.
Yep, but we knew that already. I was able to reproduce it with a vanilla kernel with the desktop config. CONFIG_PREEMPT seemed to have caused the difference.
In about 12 hours, I should have a copy of the thing to play with. Hopefully, Peter will have it all figured out before that, as cgroup.c is hard to read.
Even better. Thanks for looking into this.

- -Jeff
-Mike

https://lkml.org/lkml/2010/6/29/22
On Thu, 2010-12-23 at 13:33 +0100, Peter Zijlstra wrote:
systemd-1 0d..1. 2070793us : sched_destroy_group: se: f69e43c0, load: 1024
systemd-1 0d..1. 2070794us : sched_destroy_group: cfs_rq: f69e4720, nr: 1, load: 1024
systemd-1 0d..1. 2070794us : __print_runqueue: cfs_rq: f69e4720, nr: 1, load: 1024
systemd-1 0d..1. 2070795us : __print_runqueue: curr: (null)
systemd-1 0d..1. 2070796us : __print_runqueue: se: f6a8eb4c, comm: systemd-tmpfile/1243, load: 1024
systemd-1 0d..1. 2070796us : _raw_spin_unlock_irqrestore <-sched_destroy_group
So somehow it manages to destroy a group with a task attached.
It's even weirder:
systemd-1 0d..1. 1663489us : sched_destroy_group: se: f69e7360, load: 1024
systemd-1 0d..1. 1663489us : sched_destroy_group: cfs_rq: f69e72a0, nr: 1, load: 1024
systemd-1 0d..1. 1663491us : __print_runqueue: cfs_rq: f69e72a0, nr: 1, load: 1024, cgroup: /system/systemd-sysctl.service
systemd-1 0d..1. 1663491us : __print_runqueue: curr: (null)
systemd-1 0d..1. 1663493us : __print_runqueue: se: f69d95bc, comm: systemd-sysctl/1209, load: 1024, cgroup: /
systemd-1 0d..1. 1663496us : do_invalid_op <-error_code
The task enqueued on the cfs_rq doesn't match the cgroup. The thing is, I don't see a cpu_cgroup_attach/sched_move_task call in the log, nor does a BUG_ON() in account_entity_enqueue(), which validates the task's cgroup against the cfs_rq's cgroup, trigger.
So it looks like a task changes cgroup without passing through the cgroup_subsys::attach method, which afaict isn't supposed to happen.
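For anyone who wants to scan a longer capture for the same two symptoms, here is a rough Python sketch. It assumes the log lines keep exactly the "key: value, key: value" layout shown in the traces above, which appears to come from ad-hoc debug instrumentation rather than a stable kernel interface. The helper names (parse_fields, find_anomalies) are made up for illustration.

```python
import re

# Sample lines copied from the second trace quoted above.
TRACE = """\
systemd-1 0d..1. 1663489us : sched_destroy_group: se: f69e7360, load: 1024
systemd-1 0d..1. 1663489us : sched_destroy_group: cfs_rq: f69e72a0, nr: 1, load: 1024
systemd-1 0d..1. 1663491us : __print_runqueue: cfs_rq: f69e72a0, nr: 1, load: 1024, cgroup: /system/systemd-sysctl.service
systemd-1 0d..1. 1663491us : __print_runqueue: curr: (null)
systemd-1 0d..1. 1663493us : __print_runqueue: se: f69d95bc, comm: systemd-sysctl/1209, load: 1024, cgroup: /
systemd-1 0d..1. 1663496us : do_invalid_op <-error_code
"""

# "<task>-<pid> <flags> <timestamp>us : <function>: <fields>"
# (layout assumed from the quoted log, not from any documented format)
LINE_RE = re.compile(r"^\S+\s+\S+\s+(?P<ts>\d+)us : (?P<func>\w+): (?P<rest>.*)$")


def parse_fields(rest):
    """Split 'key: value, key: value' into a dict."""
    fields = {}
    for part in rest.split(", "):
        key, sep, value = part.partition(": ")
        if sep:
            fields[key] = value
    return fields


def find_anomalies(trace):
    """Return a list of tuples describing suspicious entries."""
    anomalies = []
    rq_cgroup = None  # cgroup of the most recently printed cfs_rq, if any
    for line in trace.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # function-tracer lines such as 'foo <-bar'
        func = m.group("func")
        fields = parse_fields(m.group("rest"))

        # Symptom 1: a group is destroyed while its cfs_rq still has entities.
        if func == "sched_destroy_group" and "cfs_rq" in fields:
            nr = int(fields.get("nr", "0"))
            if nr > 0:
                anomalies.append(("destroy_nonempty", fields["cfs_rq"], nr))

        # Symptom 2: a queued task reports a different cgroup than its cfs_rq.
        if func == "__print_runqueue":
            if "cfs_rq" in fields:
                rq_cgroup = fields.get("cgroup")
            elif "se" in fields and "comm" in fields:
                task_cgroup = fields.get("cgroup")
                if rq_cgroup is not None and task_cgroup is not None \
                        and task_cgroup != rq_cgroup:
                    anomalies.append(
                        ("cgroup_mismatch", fields["comm"], task_cgroup, rq_cgroup)
                    )
    return anomalies


if __name__ == "__main__":
    for entry in find_anomalies(TRACE):
        print(entry)
```

This only flags what the log already shows. It cannot say how the task ended up in the wrong cgroup, and the first trace above carries no cgroup field, so only the non-empty destroy would be reported for it.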
- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org/

iEYEARECAAYFAk0U6UMACgkQLPWxlyuTD7IBcQCfZFsaNG0N9HxKxPRwjbyydKxc
XqIAniqZ7HKSAF72pWeM8D0bmT2YtT3E
=LUzP
-----END PGP SIGNATURE-----
participants (4)
- Jeff Mahoney
- Mike Galbraith
- Peter Czanik
- Stefan Seyfried