Mailinglist Archive: opensuse-factory (508 mails)

Re: [opensuse-factory] systemd
  • From: Jeff Mahoney <jeffm@xxxxxxxx>
  • Date: Fri, 24 Dec 2010 13:41:08 -0500
  • Message-id: <4D14E944.6000009@xxxxxxxx>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/24/2010 12:05 AM, Mike Galbraith wrote:

FYI, this isn't limited to openSUSE factory. Peterz has a repeatable
testcase now (kvm image), and is tracing through it. Systemd is
triggering a strange cgroups use-after-free problem.

Yep, but we knew that already. I was able to reproduce it with a vanilla
kernel with the desktop config. CONFIG_PREEMPT seemed to have caused the
difference.

In about 12 hours, I should have a copy of the thing to play with.
Hopefully, Peter will have it all figured out before that, as cgroup.c
is hard to read.

Even better.

Thanks for looking into this.

- -Jeff

-Mike
https://lkml.org/lkml/2010/6/29/22

On Thu, 2010-12-23 at 13:33 +0100, Peter Zijlstra wrote:

systemd-1 0d..1. 2070793us : sched_destroy_group: se: f69e43c0, load: 1024
systemd-1 0d..1. 2070794us : sched_destroy_group: cfs_rq: f69e4720, nr: 1, load: 1024
systemd-1 0d..1. 2070794us : __print_runqueue: cfs_rq: f69e4720, nr: 1, load: 1024
systemd-1 0d..1. 2070795us : __print_runqueue: curr: (null)
systemd-1 0d..1. 2070796us : __print_runqueue: se: f6a8eb4c, comm: systemd-tmpfile/1243, load: 1024
systemd-1 0d..1. 2070796us : _raw_spin_unlock_irqrestore <-sched_destroy_group

So somehow it manages to destroy a group with a task attached.
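
For anyone sifting through a longer capture, a minimal sketch of how the "destroyed while still populated" case could be picked out of the trace. It assumes the debug output keeps the format shown above (that format comes from ad-hoc debugging prints, not a stable interface), and the names DESTROY_RE and busy_destroyed_groups are hypothetical helpers, not part of any tool.

```python
import re

# Sample lines copied from the trace in this mail, with wrapped lines rejoined.
TRACE = """\
systemd-1 0d..1. 2070793us : sched_destroy_group: se: f69e43c0, load: 1024
systemd-1 0d..1. 2070794us : sched_destroy_group: cfs_rq: f69e4720, nr: 1, load: 1024
"""

# Assumed line format: "sched_destroy_group: cfs_rq: <addr>, nr: <count>, ..."
DESTROY_RE = re.compile(
    r"sched_destroy_group: cfs_rq: (?P<rq>[0-9a-f]+), nr: (?P<nr>\d+)"
)


def busy_destroyed_groups(trace):
    """Return (cfs_rq address, nr) for each destroyed group with entities still queued."""
    hits = []
    for line in trace.splitlines():
        match = DESTROY_RE.search(line)
        if match and int(match.group("nr")) > 0:
            hits.append((match.group("rq"), int(match.group("nr"))))
    return hits


if __name__ == "__main__":
    for rq, nr in busy_destroyed_groups(TRACE):
        print(f"cfs_rq {rq} destroyed with nr={nr}")
```

A non-empty result means a group's cfs_rq was torn down while something was still enqueued on it, which is the symptom described above.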

It's even weirder:

systemd-1 0d..1. 1663489us : sched_destroy_group: se: f69e7360, load: 1024
systemd-1 0d..1. 1663489us : sched_destroy_group: cfs_rq: f69e72a0, nr: 1, load: 1024
systemd-1 0d..1. 1663491us : __print_runqueue: cfs_rq: f69e72a0, nr: 1, load: 1024, cgroup: /system/systemd-sysctl.service
systemd-1 0d..1. 1663491us : __print_runqueue: curr: (null)
systemd-1 0d..1. 1663493us : __print_runqueue: se: f69d95bc, comm: systemd-sysctl/1209, load: 1024, cgroup: /
systemd-1 0d..1. 1663496us : do_invalid_op <-error_code

The task enqueued on the cfs_rq doesn't match the cfs_rq's cgroup. The
thing is, I don't see a cpu_cgroup_attach/sched_move_task call in the
log, nor does a BUG_ON() that validates the task's cgroup against the
cfs_rq's cgroup in account_entity_enqueue() trigger.

So it looks like a task changes cgroup without passing through the
cgroup_subsys::attach method, which afaict isn't supposed to happen.
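
The mismatch can also be checked mechanically over a full log. This is only a sketch: it assumes the __print_runqueue debug lines keep the layout shown above, with a cfs_rq line followed by the se lines queued on it, and RQ_RE, SE_RE and cgroup_mismatches are hypothetical names made up for the illustration.

```python
import re

# Sample lines copied from the trace in this mail, with wrapped lines rejoined.
TRACE = """\
systemd-1 0d..1. 1663491us : __print_runqueue: cfs_rq: f69e72a0, nr: 1, load: 1024, cgroup: /system/systemd-sysctl.service
systemd-1 0d..1. 1663491us : __print_runqueue: curr: (null)
systemd-1 0d..1. 1663493us : __print_runqueue: se: f69d95bc, comm: systemd-sysctl/1209, load: 1024, cgroup: /
"""

# Assumed formats of the runqueue header line and of each queued entity line.
RQ_RE = re.compile(
    r"__print_runqueue: cfs_rq: (?P<rq>[0-9a-f]+),.*cgroup: (?P<cg>\S+)$"
)
SE_RE = re.compile(
    r"__print_runqueue: se: (?P<se>[0-9a-f]+), comm: (?P<comm>\S+),.*cgroup: (?P<cg>\S+)$"
)


def cgroup_mismatches(trace):
    """Return (comm, task cgroup, cfs_rq cgroup) for entities queued on a foreign cfs_rq."""
    mismatches = []
    rq_cgroup = None  # cgroup of the most recently printed cfs_rq
    for line in trace.splitlines():
        line = line.rstrip()
        match = RQ_RE.search(line)
        if match:
            rq_cgroup = match.group("cg")
            continue
        match = SE_RE.search(line)
        if match and rq_cgroup is not None and match.group("cg") != rq_cgroup:
            mismatches.append((match.group("comm"), match.group("cg"), rq_cgroup))
    return mismatches


if __name__ == "__main__":
    for comm, task_cg, rq_cg in cgroup_mismatches(TRACE):
        print(f"{comm}: task cgroup {task_cg} but queued on cfs_rq of {rq_cg}")
```

On the sample it flags systemd-sysctl/1209, whose cgroup is "/" while it sits on the cfs_rq of /system/systemd-sysctl.service. It says nothing about how the task got there, only that the two disagree at the time of the print.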



- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org/

iEYEARECAAYFAk0U6UMACgkQLPWxlyuTD7IBcQCfZFsaNG0N9HxKxPRwjbyydKxc
XqIAniqZ7HKSAF72pWeM8D0bmT2YtT3E
=LUzP
-----END PGP SIGNATURE-----
