[opensuse] Re: Jump: merge Leap and SLE kernel configs
On 2020/09/03 10:14, Takashi Iwai wrote:
Actually, it's a topic we've been evaluating and discussing internally over the last few weeks.
The current plan (or hope) is that we're going to unify the kconfigs. Most of the performance-related configs will be aligned with the SLE setup, while enabling the missing features on top of it.
e.g. about the preemption, we can follow the SLE pattern, namely, kernel-default = PREEMPT_NONE (typically server usage) and kernel-preempt = PREEMPT_FULL (typically desktop usage) instead of a single kernel flavor for all.
---- What is the single flavor now?
What flavor would you use on a server that serves files (I would not think that an uncommon usage for a server). I use my server to serve ALL of my files except programs. Looking through my videos, the most demanding I have now, at 1080p with DTS HD 5.1 w/ multiaudio, needs 38Mbps. It is an 8-bit encoding. I'd say that works comfortably with PREEMPT_VOLUNTARY; however, about 7-10 years ago, **PREEMPT_NONE** **failed** **frequently**, even with the much less demanding streams available then. Of note -- I'm not sure how this interacts with the above, though I run a tickless clock, it still wants a clock Hz which I have set at 1000Hz.
I don't have any of the newer 4K vids which would need about 4X the BW, not to mention the 8K video that Nvidia was just demoing with their latest 3xxx series cards. Additionally many newer videos have 10-12 bit color with 48-bits of color -- I'd expect that to go to 16bit within the next few-several years as a round-up of the 14bit range for the human eye.
I really don't think the NONE config is going to be useful for most server configs unless they are pure "batch"-job machines with no interactive or user dependencies.
As for PREEMPT_FULL -- I'm sure some scientific, maybe even gaming platform boxes might find a use for that -- but I'd tend to think PREEMPT_VOLUNTARY would work for most people in most cases.
- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
On Mon, 07 Sep 2020 00:49:49 +0200, L A Walsh wrote:
On 2020/09/03 10:14, Takashi Iwai wrote:
Actually, it's a topic we've been evaluating and discussing internally over the last few weeks.
The current plan (or hope) is that we're going to unify the kconfigs. Most of the performance-related configs will be aligned with the SLE setup, while enabling the missing features on top of it.
e.g. about the preemption, we can follow the SLE pattern, namely, kernel-default = PREEMPT_NONE (typically server usage) and kernel-preempt = PREEMPT_FULL (typically desktop usage) instead of a single kernel flavor for all.
---- What is the single flavor now?
kernel-default with PREEMPT_VOLUNTARY.
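For anyone wanting to check which model their own kernel was built with, here is a minimal Python sketch. It assumes the usual Kconfig symbol names for kernels of this era (CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_VOLUNTARY, and plain CONFIG_PREEMPT for what this thread calls PREEMPT_FULL). The function name preempt_model is made up for illustration, and this is not an official tool.

```python
import re

# Kconfig symbols for the three classic preemption models. The fully
# preemptible model is spelled CONFIG_PREEMPT in the config file;
# "PREEMPT_FULL" is the shorthand used in this thread.
MODELS = {
    "CONFIG_PREEMPT_NONE": "PREEMPT_NONE",
    "CONFIG_PREEMPT_VOLUNTARY": "PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT": "PREEMPT_FULL",
}

# Match only the three model symbols, not helpers such as
# CONFIG_PREEMPT_COUNT or CONFIG_PREEMPT_RCU.
_MODEL_RE = re.compile(r"(CONFIG_PREEMPT(?:_NONE|_VOLUNTARY)?)=y")


def preempt_model(config_text):
    """Return the preemption model enabled in a kernel config, or None."""
    for line in config_text.splitlines():
        match = _MODEL_RE.fullmatch(line.strip())
        if match:
            return MODELS[match.group(1)]
    return None
```

The config text would typically come from a file such as /boot/config-<release> or from /proc/config.gz where the kernel provides it; both locations are assumptions about the installed system, so adjust as needed.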
What flavor would you use on a server that serves files (I would not think that an uncommon usage for a server). I use my server to serve ALL of my files except programs. Looking through my videos, the most demanding I have now, at 1080p with DTS HD 5.1 w/ multiaudio, needs 38Mbps. It is an 8-bit encoding. I'd say that works comfortably with PREEMPT_VOLUNTARY; however, about 7-10 years ago, **PREEMPT_NONE** **failed** **frequently**, even with the much less demanding streams available then. Of note -- I'm not sure how this interacts with the above, though I run a tickless clock, it still wants a clock Hz which I have set at 1000Hz.
If PREEMPT_NONE gave lower throughput for a pure server task (i.e. not watching video on that machine or doing interactive tasks there), something must have been broken there and it should have been fixed.
I don't have any of the newer 4K vids which would need about 4X the BW, not to mention the 8K video that Nvidia was just demoing with their latest 3xxx series cards. Additionally many newer videos have 10-12 bit color with 48-bits of color -- I'd expect that to go to 16bit within the next few-several years as a round-up of the 14bit range for the human eye.
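To put rough numbers on the bandwidth estimate above, a small Python sketch of the arithmetic follows. The 38 Mbps figure and the "about 4X" factor are taken from the message; the 10 Gb/s link speed is an assumption for illustration, and protocol overhead is ignored. Raw bandwidth alone says nothing about scheduling latency, so this only sizes the streams.

```python
def streams_per_link(stream_mbps, link_mbps):
    """How many streams of a given bitrate fit on a link, ignoring overhead."""
    return int(link_mbps // stream_mbps)


HD_MBPS = 38             # 1080p stream bitrate quoted in the message
UHD_MBPS = HD_MBPS * 4   # the "about 4X" estimate for 4K, i.e. 152 Mbps
LINK_MBPS = 10_000       # assumed 10 Gb/s link, for illustration only

print(UHD_MBPS / 8)                            # 4K estimate in MB/s
print(streams_per_link(HD_MBPS, LINK_MBPS))    # 1080p streams per link
print(streams_per_link(UHD_MBPS, LINK_MBPS))   # 4K streams per link
```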
I really don't think the NONE config is going to be useful for most server configs unless they are pure "batch"-job machines with no interactive or user dependencies.
PREEMPT_VOLUNTARY actually has a significant negative impact on throughput as measured by the benchmarks. That's the very reason SLE uses PREEMPT_NONE for kernel-default.
As for PREEMPT_FULL -- I'm sure some scientific, maybe even gaming platform boxes might find a use for that -- but I'd tend to think PREEMPT_VOLUNTARY would work for most people in most cases.
If you have only one choice, yes. But you'll have two choices, then the situation changes pretty much. Again, if anything goes wrong with PREEMPT_NONE for the typical server usage, it must be fixed. PREEMPT_VOLUNTARY is not meant as a solution for such a problem. thanks, Takashi
On 9/6/2020 11:52 PM, Takashi Iwai wrote:
If PREEMPT_NONE gave lower throughput for a pure server task (i.e. not watching video on that machine or doing interactive tasks there), something must have been broken there and it should have been fixed.
I think you misunderstand.
What is probably the most common thing a server does? My guess would be that it serves disk space.
It's not that someone is watching video ON the MACHINE, but that multi-media content is likely to be *stored* or *hosted* on the server. That and web-pages, dns-lookups for example.
What does a server do, in your view? In my view it "serves". It does tasks for users who wouldn't have media or web-sites on the machine. Ideally dns-lookups will take <10ms. If you load a busy webpage, it can easily have 20-30 DNS lookups or more. Hopefully most would be in cache for commonly viewed sites -- but I've noticed a lot of short-timeout DNS values that keep that content from being cached for long periods of time.
But the main place I'd see non-preempt making a difference was in large video streams hosted on the server (not being watched on the server).
PREEMPT_VOLUNTARY actually has a significant negative impact on throughput as measured by the benchmarks. That's the very reason SLE uses PREEMPT_NONE for kernel-default.
--- I'm sure it depends on your workload. If you are doing pure batch and really not doing "server" tasks (serving resources to a user), and just computing...I can see PREEMPT_NONE being best. What type of workloads are you using for servers -- ones that serve multiple users doing interactive tasks or ones that are mostly compute-bound doing long and large running jobs?
Again, if anything goes wrong with PREEMPT_NONE for the typical server usage, it must be fixed. PREEMPT_VOLUNTARY is not meant as a solution for such a problem.
--- It may be down to definitions and how you define a server and its workload. A server performing the functions of serving several or many interactive users needs more interactivity than compute or long-running data-processing jobs.
Perhaps using VOLUNTARY, but reserve some servers for compute tasks that won't service user requests.
Specifically using the "isolcpus=" boot param to isolate some of the CPUs from "interruption noise" -- specifically the kernel configurator says:
CONFIG_CPU_ISOLATION:
Make sure that CPUs running critical tasks are not disturbed by any source of "noise" such as unbound workqueues, timers, kthreads... Unbound jobs get offloaded to housekeeping CPUs. This is driven by the "isolcpus=" boot parameter.
Perhaps reserving half for critical tasks so they are not disturbed and the remainder for interactive serving might serve the purpose.
It sounds like not having compute-jobs interrupted is a key requirement of what you are looking for? In 'xconfig', it's right below the Preemption model and has the benefit of being configurable per machine.
Do you have any links to the impact on throughput of the Preemption Model? Vs. effects of cpu isolation and/or changes to the timeslice values?
Odd -- I thought there was better support for changing timeslice values, but all I see seems to have to do with a graphics chip (DRM_I915_TIMESLICE_DURATION). Must be confusing it with the NT kernel knobs for such.
Do you know of a test case that fails using PREEMPT VOLUNTARY vs. NONE or where it's seen that one performs significantly better under some workload? How typical is the workload?
Note: PREEMPT NONE is for kernels used for scientific computation that needs the raw processing power of the kernel, regardless of scheduling latencies. I'm not doing such straight computing work, but instead use the server, mostly, for serving incoming network requests. Depending on packet and request size, one (1) 10Gb connection can saturate a single cpu.
Supposedly RedHat disabled various C-States if you use 9K network packets, as long latencies imposed by deeper power saving states can theoretically cause some packet loss. If the cpu is not preemptable, wouldn't the same situation exist? Conversely I see some disable Jumbo packets, which, when compared to standard, causes network throughput to drop from 400-600MB/s down to 110MB/s. That's unacceptable for servers serving internal network requests.
How much does 'Preempt none' benefit which workload over voluntary?
On Thu, 10 Sep 2020 01:47:53 +0200, L A Walsh wrote:
On 9/6/2020 11:52 PM, Takashi Iwai wrote:
If PREEMPT_NONE gave lower throughput for a pure server task (i.e. not watching video on that machine or doing interactive tasks there), something must have been broken there and it should have been fixed.
I think you misunderstand.
What is probably the most common thing a server does? My guess would be that it serves disk space.
It's not that someone is watching video ON the MACHINE, but that multi-media content is likely to be *stored* or *hosted* on the server. That and web-pages, dns-lookups for example.
What does a server do, in your view? In my view it "serves". It does tasks for users who wouldn't have media or web-sites on the machine. Ideally dns-lookups will take <10ms. If you load a busy webpage, it can easily have 20-30 DNS lookups or more. Hopefully most would be in cache for commonly viewed sites -- but I've noticed a lot of short-timeout DNS values that keep that content from being cached for long periods of time.
But the main place I'd see non-preempt making a difference was in large video streams hosted on the server (not being watched on the server).
Well, that's of course a typical use case of servers, and I understand it well, too. The point is that PREEMPT_NONE came from the SLE *Server* product, and that's the business SUSE has been doing for decades, and the config was chosen by the evaluation of various benchmarks and QAs to cover these kinds of use cases. If you really have a performance problem with that setup and it works with PREEMPT_VOLUNTARY, the problem should be fixed; PREEMPT_VOLUNTARY was never meant as a solution for that, from the very beginning of its introduction. That's the difference from a full RT.
PREEMPT_VOLUNTARY actually has a significant negative impact on throughput as measured by the benchmarks. That's the very reason SLE uses PREEMPT_NONE for kernel-default.
I'm sure it depends on your workload. If you are doing pure batch and really not doing "server" tasks (serving resources to a user), and just computing...I can see PREEMPT_NONE being best.
What type of workloads are you using for servers -- ones that serve multiple users doing interactive tasks or ones that are mostly compute-bound doing long and large running jobs?
Again, if anything goes wrong with PREEMPT_NONE for the typical server usage, it must be fixed. PREEMPT_VOLUNTARY is not meant as a solution for such a problem.
--- It may be down to definitions and how you define a server and its workload. A server performing the functions of serving several or many interactive users needs more interactivity than compute or long-running data-processing jobs.
Perhaps using VOLUNTARY, but reserve some servers for compute tasks that won't service user requests.
Specifically using the "isolcpus=" boot param to isolate some of the cpu's from "interruption noise" -- specifically the kernel configurator says:
CONFIG_CPU_ISOLATION:
Make sure that CPUs running critical tasks are not disturbed by any source of "noise" such as unbound workqueues, timers, kthreads... Unbound jobs get offloaded to housekeeping CPUs. This is driven by the "isolcpus=" boot parameter.
Perhaps reserving half for critical tasks so they are not disturbed and the remainder for interactive serving might serve the purpose.
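As a rough illustration of the "reserve half" idea, here is a minimal Python sketch that builds an isolcpus= value for the upper half of the CPUs and reads one back from a kernel command line. The helper names (parse_cpu_list, isolated_cpus, upper_half) are made up for this example. It handles only plain comma-separated CPU numbers and ranges plus the optional nohz, domain and managed_irq flags, not the full CPU-list syntax the kernel accepts, and it assumes at least two CPUs.

```python
# Optional flags that may precede the CPU list in isolcpus=.
KNOWN_FLAGS = {"nohz", "domain", "managed_irq"}


def parse_cpu_list(spec):
    """Expand a CPU list such as "1,4-7" into a sorted list of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        elif part:
            cpus.add(int(part))
    return sorted(cpus)


def isolated_cpus(cmdline):
    """Return the CPUs named by isolcpus= on a kernel command line."""
    for token in cmdline.split():
        if token.startswith("isolcpus="):
            fields = token[len("isolcpus="):].split(",")
            cpu_fields = [f for f in fields if f not in KNOWN_FLAGS]
            return parse_cpu_list(",".join(cpu_fields))
    return []


def upper_half(n_cpus):
    """Build an isolcpus= value reserving the upper half of n_cpus CPUs."""
    return f"isolcpus={n_cpus // 2}-{n_cpus - 1}"
```

On a running system the command line could be read from /proc/cmdline and passed to isolated_cpus(); on an 8-CPU box, upper_half(8) gives "isolcpus=4-7", leaving CPUs 0-3 for housekeeping and interactive serving.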
It sounds like not having compute-jobs interrupted is a key requirement of what you are looking for? In 'xconfig', it's right below the Preemption model and has the benefit of being configurable per machine.
Do you have any links to the impact on throughput of the Preemption Model? Vs. effects of cpu isolation and/or changes to the timeslice values?
Odd -- I thought there was better support for changing timeslice values, but all I see seems to have to do with a graphics chip (DRM_I915_TIMESLICE_DURATION). Must be confusing it with the NT kernel knobs for such.
Do you know of a test case that fails using PREEMPT VOLUNTARY vs. NONE or where it's seen that one performs significantly better under some workload? How typical is the workload?
Note: PREEMPT NONE is for kernels used for scientific computation that needs the raw processing power of the kernel, regardless of scheduling latencies. I'm not doing such straight computing work, but instead use the server, mostly, for serving incoming network requests. Depending on packet and request size, one (1) 10Gb connection can saturate a single cpu.
Supposedly RedHat disabled various C-States if you use 9K network packets, as long latencies imposed by deeper power saving states can theoretically cause some packet loss. If the cpu is not preemptable, wouldn't the same situation exist? Conversely I see some disable Jumbo packets, which, when compared to standard, causes network throughput to drop from 400-600MB/s down to 110MB/s. That's unacceptable for servers serving internal network requests.
How much does 'Preempt none' benefit which workload over voluntary?
This can be answered best by the performance team, who have been carrying out the tests and evaluations. And I'm off from today, so any further reply will be delayed. thanks, Takashi
participants (2)
- L A Walsh
- Takashi Iwai