http://bugzilla.suse.com/show_bug.cgi?id=954765
http://bugzilla.suse.com/show_bug.cgi?id=954765#c2

--- Comment #2 from Mel Gorman <mel.gorman@microfocus.com> ---

(In reply to Dr. Werner Fink from comment #1)
> Maybe it's worth to watch the talk about control groups at systemd.conf 2015 on
I watched it, but I still don't see why the activation of the resource controllers via Delegate=yes is necessary. To be clear, I see no problem with grouping related processes together; it's the controller activation I'm confused by. The talk opens by saying that cgroups are a means of hierarchically labelling processes and that systemd uses this to manage service lifetime. But cgroups are primarily about resource control and fairness guarantees, which is what the controllers provide. As a side-effect, the hierarchy maps PIDs to services, and systemd uses that for service management and for notification of service shutdown. While I can see why grouping processes is desirable for cleanly starting up and shutting down services, I cannot see why the controllers get activated even when full resource control is not required.

Resource-control enforcement is known to incur an 83% penalty on a scheduler microbenchmark due to the cpuacct controller. I myself observed a case where dbench4 regressed 80% due to the blkio controller, because the journalling kernel thread and the IO submitter were in separate cgroups. Even if the unified hierarchy were fully in place (and I understand why unifying the separate cgroup hierarchies is difficult), it would not remove the overhead. For example, any process in the user slice with the cpuacct controller activated forces all processes to update what is essentially global data: the cumulative CPU usage of all processes in the user slice. Multiple processes updating the same data incurs a high penalty due to cache misses.

Even if the overhead were zero (and I don't know of anyone working specifically on eliminating it), there is a semantic difference when controllers are enabled. For example, the memory controller creates per-memcg LRU lists. Under global memory pressure, those lists are reclaimed proportionally to each other.
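To make the "controller activation" point concrete, here is a minimal sketch of how one can check which controllers systemd has actually activated for the user slice. It assumes the split (v1) hierarchies that systemd mounts under /sys/fs/cgroup and the default slice name user.slice; the paths may differ on other setups.

```shell
# Sketch: report which resource controllers have a cgroup created for
# user.slice. A directory under a controller's hierarchy means that
# controller is active (and incurring its overhead) for that slice.
for ctrl in cpuacct blkio memory; do
    dir="/sys/fs/cgroup/$ctrl/user.slice"
    if [ -d "$dir" ]; then
        echo "$ctrl controller is active for user.slice"
    else
        echo "$ctrl controller is not active for user.slice"
    fi
done
```

On an affected system this shows cpuacct, blkio and memory all active for ordinary user sessions, even though no resource limits were requested.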
Their existence alters the order in which memory is reclaimed, so a relatively new process can be reclaimed prematurely because it is being aged relative to other, older cgroups that are idle.

I can see why enabling resource control would be a good idea in some cases. For example, virtual machines or containers could justify being resource-controlled to avoid interfering with each other, but that is a special case. It seems like a very bad idea to incur the same overhead and semantic differences for a single user on a single machine running basic workloads that do not require strict resource control.

--
You are receiving this mail because:
You are on the CC list for the bug.
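For what it's worth, one way to keep the process grouping while avoiding the accounting overhead would be a drop-in that switches the per-unit accounting off. This is only a sketch using the CPUAccounting=, MemoryAccounting= and BlockIOAccounting= options documented in systemd.resource-control(5); whether a drop-in on user@.service is the right place for it here is an assumption on my part:

```ini
# /etc/systemd/system/user@.service.d/no-accounting.conf (hypothetical path)
# Keeps the cgroup grouping for lifecycle tracking, but disables the
# accounting controllers that cause the overhead described above.
[Service]
CPUAccounting=no
MemoryAccounting=no
BlockIOAccounting=no
```

After `systemctl daemon-reload`, new user sessions would still be grouped, but without cpuacct/memory/blkio enforcement.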