New subject: [opensuse-factory] systemd

24 Dec 2010

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/24/2010 12:05 AM, Mike Galbraith wrote:
...
FYI, this isn't limited to openSUSE factory.  Peterz has a repeatable
testcase now (kvm image), and is tracing through it.  Systemd is
triggering a strange use-after-free problem in cgroups.
Yep, but we knew that already. I was able to reproduce it with a vanilla
kernel with the desktop config. CONFIG_PREEMPT seemed to have caused the
difference.
...
In about 12 hours, I should have a copy of the thing to play with.
Hopefully, Peter will have it all figured out before that, as cgroup.c
is hard to read.
Even better.

Thanks for looking into this.

- -Jeff
...
-Mike
https://lkml.org/lkml/2010/6/29/22
On Thu, 2010-12-23 at 13:33 +0100, Peter Zijlstra wrote:
...
systemd-1       0d..1. 2070793us : sched_destroy_group: se: f69e43c0, load: 1024
 systemd-1       0d..1. 2070794us : sched_destroy_group: cfs_rq: f69e4720, nr: 1, load: 1024
 systemd-1       0d..1. 2070794us : __print_runqueue:  cfs_rq: f69e4720, nr: 1, load: 1024
 systemd-1       0d..1. 2070795us : __print_runqueue:  curr: (null)
 systemd-1       0d..1. 2070796us : __print_runqueue:  se: f6a8eb4c, comm: systemd-tmpfile/1243, load: 1024
 systemd-1       0d..1. 2070796us : _raw_spin_unlock_irqrestore <-sched_destroy_group
So somehow it manages to destroy a group with a task attached.
It's even weirder:
systemd-1       0d..1. 1663489us : sched_destroy_group: se: f69e7360, load: 1024
 systemd-1       0d..1. 1663489us : sched_destroy_group: cfs_rq: f69e72a0, nr: 1, load: 1024
 systemd-1       0d..1. 1663491us : __print_runqueue:  cfs_rq: f69e72a0, nr: 1, load: 1024, cgroup: /system/systemd-sysctl.service
 systemd-1       0d..1. 1663491us : __print_runqueue:  curr: (null)
 systemd-1       0d..1. 1663493us : __print_runqueue:  se: f69d95bc, comm: systemd-sysctl/1209, load: 1024, cgroup: /
 systemd-1       0d..1. 1663496us : do_invalid_op <-error_code
The task enqueued to the cfs_rq doesn't match the cgroup. The thing is,
I don't see a cpu_cgroup_attach/sched_move_task call in the log, and a
BUG_ON() in account_entity_enqueue() that validates the task's cgroup
against the cfs_rq's cgroup doesn't trigger either.
So it looks like a task changes cgroup without passing through the
cgroup_subsys::attach method, which afaict isn't supposed to happen.
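
The mismatch described above can also be spotted mechanically in a saved
trace. What follows is only a rough sketch, assuming the line layout of
the __print_runqueue output quoted above (which comes from a debugging
patch and may differ elsewhere); the function name and the regular
expressions are made up for illustration and are not part of any kernel
tooling.

```python
import re

# Assumed line shapes, modeled on the __print_runqueue lines quoted above.
CFS_RQ_RE = re.compile(
    r"__print_runqueue:\s+cfs_rq: (?P<rq>[0-9a-f]+),.*?cgroup: (?P<cgroup>\S+)"
)
SE_RE = re.compile(
    r"__print_runqueue:\s+se: (?P<se>[0-9a-f]+), comm: (?P<comm>[^,]+),"
    r".*?cgroup: (?P<cgroup>\S+)"
)


def find_cgroup_mismatches(lines):
    """Return (comm, se_cgroup, rq_cgroup) for every se line whose cgroup
    differs from the cgroup of the most recently seen cfs_rq line."""
    mismatches = []
    rq_cgroup = None
    for line in lines:
        m = CFS_RQ_RE.search(line)
        if m:
            # Remember which cgroup the runqueue being dumped belongs to.
            rq_cgroup = m.group("cgroup")
            continue
        m = SE_RE.search(line)
        if m and rq_cgroup is not None and m.group("cgroup") != rq_cgroup:
            mismatches.append((m.group("comm"), m.group("cgroup"), rq_cgroup))
    return mismatches


# Sample input copied from the second trace above.
TRACE = """\
 systemd-1       0d..1. 1663491us : __print_runqueue:  cfs_rq: f69e72a0, nr: 1, load: 1024, cgroup: /system/systemd-sysctl.service
 systemd-1       0d..1. 1663491us : __print_runqueue:  curr: (null)
 systemd-1       0d..1. 1663493us : __print_runqueue:  se: f69d95bc, comm: systemd-sysctl/1209, load: 1024, cgroup: /
""".splitlines()

for comm, se_cg, rq_cg in find_cgroup_mismatches(TRACE):
    print(f"{comm}: task cgroup {se_cg!r} != cfs_rq cgroup {rq_cg!r}")
```

Lines without a cgroup field, such as those in the first trace, are
simply skipped, so this only helps once the cgroup path is being printed.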
- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org/

iEYEARECAAYFAk0U6UMACgkQLPWxlyuTD7IBcQCfZFsaNG0N9HxKxPRwjbyydKxc
XqIAniqZ7HKSAF72pWeM8D0bmT2YtT3E
=LUzP
-----END PGP SIGNATURE-----
