[opensuse-buildservice] Linux scheduler bugs
Just an FYI.
This is an interesting/relevant paper review. I'd be curious to know how much OBS performance could be improved by implementing all of the fixes described.
https://blog.acolyer.org/2016/04/26/the-linux-scheduler-a-decade-of-wasted-c...
-Archie
--
Archie L. Cobbs
--
To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org
On Tuesday 2016-04-26 15:27, Archie Cobbs wrote:
This is an interesting/relevant paper review. I'd be curious to know how much OBS performance could be improved by implementing all of the fixes described.
https://blog.acolyer.org/2016/04/26/the-linux-scheduler-a-decade-of-wasted-c...
Hm why don't you take the patch for a spin and try it out yourself? :)
On Tue, Apr 26, 2016 at 9:02 AM, Jan Engelhardt <jengelh@inai.de> wrote:
On Tuesday 2016-04-26 15:27, Archie Cobbs wrote:
This is an interesting/relevant paper review. I'd be curious to know how much OBS performance could be improved by implementing all of the fixes described.
https://blog.acolyer.org/2016/04/26/the-linux-scheduler-a-decade-of-wasted-c...
Hm why don't you take the patch for a spin and try it out yourself? :)
If only I didn't have a zillion other things to do ... :)
-AC
On Tue, Apr 26, 2016 at 10:02 AM, Jan Engelhardt <jengelh@inai.de> wrote:
On Tuesday 2016-04-26 15:27, Archie Cobbs wrote:
This is an interesting/relevant paper review. I'd be curious to know how much OBS performance could be improved by implementing all of the fixes described.
https://blog.acolyer.org/2016/04/26/the-linux-scheduler-a-decade-of-wasted-c...
Hm why don't you take the patch for a spin and try it out yourself? :)
I finally took a look at the link. It seems most of the bugs are NUMA related. I don't have any NUMA machines. (Single-socket 6-core CPUs are SMP, not NUMA.)
Does the OBS hardware include systems with more than one NUMA node? (lscpu | grep NUMA)
I believe in the Intel world at least, all 6-core (12-thread) single-socket CPUs are SMP, not NUMA.
If you have multiple sockets (CPUs) you are definitely into NUMA (per my understanding).
I'm not sure when NUMA kicks in for an individual Intel CPU (socket), but I know that it does eventually.
Greg
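The `lscpu | grep NUMA` check mentioned above can also be scripted. The following is a minimal sketch, assuming the usual `NUMA node(s):` label in the plain-text output of lscpu; the function names and the sample text are illustrative only, and the sample values are made up rather than taken from any OBS host.

```python
import os
import re
import subprocess
from typing import Optional


def numa_node_count(lscpu_output: str) -> Optional[int]:
    """Return the node count from lscpu text, or None if no such line exists."""
    match = re.search(r"^NUMA node\(s\):\s*(\d+)\s*$", lscpu_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def local_numa_node_count() -> Optional[int]:
    """Run lscpu on this machine (assumes util-linux is installed) and parse it."""
    # LC_ALL=C keeps the field labels in English so the regex above matches.
    env = {**os.environ, "LC_ALL": "C"}
    result = subprocess.run(
        ["lscpu"], capture_output=True, text=True, check=True, env=env
    )
    return numa_node_count(result.stdout)


# Made-up excerpt in the shape lscpu prints on a two-node machine.
SAMPLE = """\
Socket(s):           2
NUMA node(s):        2
NUMA node0 CPU(s):   0-5,12-17
NUMA node1 CPU(s):   6-11,18-23
"""
```

Calling `numa_node_count(SAMPLE)` yields 2 for the excerpt above; a result of 1 from `local_numa_node_count()` would mean the scheduler sees a single NUMA node on that machine.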
On Friday 2016-04-29 01:23, Greg Freemyer wrote:
I believe in the Intel world at least, all 6-core (12-thread) single-socket CPUs are SMP, not NUMA.
You make it sound like NUMA machines are not SMP, which is almost universally not the case. Anyhow, SMP and NUMA are separate concepts.
If you have multiple sockets (CPUs) you are definitely into NUMA (per my understanding).
I would question that for FSB generations (era Pentium 2), but it is hard to find information on those long gone processors.
I'm not sure when NUMA kicks in for an individual Intel CPU (socket), but I know that it does eventually.
It is there whenever the machine description says there is. (See ACPI tables.) You can play around with that in VirtualBox IIRC.
On Mon, May 2, 2016 at 7:05 AM, Jan Engelhardt <jengelh@inai.de> wrote:
On Friday 2016-04-29 01:23, Greg Freemyer wrote:
I believe in the Intel world at least, all 6-core (12-thread) single-socket CPUs are SMP, not NUMA.
You make it sound like NUMA machines are not SMP, which is almost universally not the case. Anyhow, SMP and NUMA are separate concepts.
Hmm.. I am surprised. I guess I assumed that "symmetric" and "non-uniform" were contradictions. But it is just a matter of terminology and I'm not the one defining them, so I won't argue. (I will go do some reading.)
If you have multiple sockets (CPUs) you are definitely into NUMA (per my understanding).
I would question that for FSB generations (era Pentium 2), but it is hard to find information on those long gone processors.
I'm happy to restrict my comment to Pentium 4 class CPUs or newer.
I'm not sure when NUMA kicks in for an individual Intel CPU (socket), but I know that it does eventually.
It is there whenever the machine description says there is. (See ACPI tables.) You can play around with that in VirtualBox IIRC.
But you didn't answer the fundamental question.
The scheduler bugs seem to affect systems using the NUMA feature of the scheduler more than those that don't.
lscpu | grep NUMA
provides information about NUMA. Are the servers in the public instance of OBS using NUMA-based hardware, i.e. do they have more than one NUMA node?
Greg
On Monday 2016-05-02 16:41, Greg Freemyer wrote:
The scheduler bugs seem to affect systems using the NUMA feature of the scheduler more than those that don't.
lscpu | grep NUMA
provides information about NUMA. Are the servers in the public instance of OBS using NUMA-based hardware, i.e. do they have more than one NUMA node?
Theoretical approach: A worker pool is something that is commonly scaled horizontally rather than vertically, usually because of cost per work and redundancy feature set. Finding a system with more than one node is therefore a bit unlikely, and one with more than two is virtually nonexistent in such a setting.
Practical approach: Run a build job with lscpu in it, and find out. (Though, virtualization may mask the real value and always yield 1.)
On Mon, May 2, 2016 at 11:54 AM, Jan Engelhardt <jengelh@inai.de> wrote:
On Monday 2016-05-02 16:41, Greg Freemyer wrote:
The scheduler bugs seem to affect systems using the NUMA feature of the scheduler more than those that don't.
lscpu | grep NUMA
provides information about NUMA. Are the servers in the public instance of OBS using NUMA-based hardware, i.e. do they have more than one NUMA node?
Theoretical approach: A worker pool is something that is commonly scaled horizontally rather than vertically, usually because of cost per work and redundancy feature set. Finding a system with more than one node is therefore a bit unlikely, and one with more than two is virtually nonexistent in such a setting.
But, many of the physical machines are running 12 or more VMs. Not likely small machines. A few Intel based servers (build21, build24, and build27) currently are configured to run 16 VMs (per the status monitor display). PPC machines build67, build91 and build92 also have 16 VMs configured. I would not at all be surprised if those 6 machines had a NUMA architecture.
Practical approach: Run a build job with lscpu in it, and find out. (Though, virtualization may mask the real value and always yield 1.)
I'm assuming the virtualization would indeed mask the real value. Someone with SSH access to the physical machines would need to run the command "lscpu | grep NUMA".
Greg
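For anyone who does have shell access to a physical host, the node list can also be read directly from sysfs instead of going through lscpu. This is only a sketch: it assumes the Linux layout where each NUMA node appears as a `node<N>` directory under `/sys/devices/system/node`, and the function name and the overridable path parameter are illustrative. Inside a VM it reports the guest topology, not the host's.

```python
import re
from pathlib import Path
from typing import List


def list_numa_nodes(sysfs_node_dir: str = "/sys/devices/system/node") -> List[int]:
    """Return the sorted NUMA node IDs found under the given sysfs directory.

    Returns an empty list when the directory does not exist, e.g. on a
    kernel without NUMA support or on a non-Linux system.
    """
    base = Path(sysfs_node_dir)
    if not base.is_dir():
        return []
    node_ids = []
    for entry in base.iterdir():
        # Only directories named node0, node1, ... count; files such as
        # "online" or "possible" in the same directory are skipped.
        match = re.fullmatch(r"node(\d+)", entry.name)
        if match and entry.is_dir():
            node_ids.append(int(match.group(1)))
    return sorted(node_ids)
```

A host where `len(list_numa_nodes())` is greater than 1 has more than one NUMA node and is the kind of machine the question in this thread is about.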
Hi,
On Tue, 2016-04-26 at 08:27 -0500, Archie Cobbs wrote:
Just an FYI.
This is an interesting/relevant paper review. I'd be curious to know how much OBS performance could be improved by implementing all of the fixes described.
https://blog.acolyer.org/2016/04/26/the-linux-scheduler-a-decade-of-wasted-cores/
From the GitHub project that hosts the patches:
The main point of our paper is to raise awareness about issues in the Linux scheduler. The provided patches fix the issues encountered with our workloads, but they are not intended as generic bug fixes. They may have unwanted side effects and result in performance loss or energy waste on your machine.
I read that some distributions are already making changes to the scheduler to fix some of these problems [1]. Thus, maybe the openSUSE community should also start discussing it. However, these patches, as they are, do not seem ready for production yet.
----
[1] https://www.phoronix.com/scan.php?page=news_item&px=Clear-Linux-Looks-At-Scheduler
Regards,
Ronan Arraes Jardim Chagas
On 26.04.2016 16:27, Archie Cobbs wrote:
Just an FYI.
This is an interesting/relevant paper review. I'd be curious to know how much OBS performance could be improved by implementing all of the fixes described.
https://blog.acolyer.org/2016/04/26/the-linux-scheduler-a-decade-of-wasted-c...
-Archie
Related: https://github.com/yandex/smart
participants (5)
- Archie Cobbs
- Greg Freemyer
- Jan Engelhardt
- Matwey V. Kornilov
- Ronan Arraes Jardim Chagas