On Thu, 10 Sep 2020 01:47:53 +0200, L A Walsh wrote:
On 9/6/2020 11:52 PM, Takashi Iwai wrote:
If PREEMPT_NONE gave lower throughput for a pure server task (i.e. not watching video on that machine or doing interactive tasks there), something must have been broken and it should have been fixed.
I think you misunderstand.
What is probably the most common thing a server does? My guess would be that it serves disk space.
It's not that someone is watching video ON the MACHINE, but that multi-media content is likely to be *stored* or *hosted* on the server. That, and web pages and DNS lookups, for example.
What does a server do, in your view? In my view, it "serves". It does tasks for users who don't have the media or web sites on their own machines. Ideally, DNS lookups will take <10ms. If you load a busy web page, it can easily trigger 20-30 DNS lookups or more. Hopefully most would be in cache for commonly viewed sites -- but I've noticed a lot of short-TTL DNS records that keep those answers from being cached for long.
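(You can see how long a record stays cacheable from the TTL column of a dig answer; the hostname and numbers here are just illustrative:

    $ dig +noall +answer www.example.com
    www.example.com.  30  IN  A  93.184.216.34

that "30" means resolvers may only cache the answer for 30 seconds.)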
But the main place I'd see non-preempt making a difference would be in large video streams hosted on the server (not being watched on the server).
Well, that's of course a typical use case of servers, and I understand it well, too. The point is that PREEMPT_NONE came from the SLE *Server* product -- the business SUSE has been doing for decades -- and the config was chosen through evaluation of various benchmarks and QA to cover exactly this kind of use case. If you really have a performance problem with that setup and work around it with PREEMPT_VOLUNTARY, the underlying problem should be fixed; PREEMPT_VOLUNTARY was never meant as a solution, from the beginning of its introduction. That's a difference from a full RT.
PREEMPT_VOLUNTARY actually has a significant negative performance impact wrt throughput, as measured by the benchmarks. That's the very reason SLE doesn't use this option for kernel-default.
I'm sure it depends on your workload. If you are doing pure batch work and really not doing "server" tasks (serving resources to a user), just computing... I can see PREEMPT_NONE being best.
What type of workloads are you using for servers -- ones that serve multiple users doing interactive tasks, or ones that are mostly compute-bound, running long, large jobs?
Again, if anything goes wrong with PREEMPT_NONE for the typical server usage, it must be fixed. PREEMPT_VOLUNTARY is not meant as a solution for such a problem.
--- It may come down to definitions: how you define a server and its workload. A server whose function is serving several or many interactive users needs more responsiveness than one running compute-bound or long-running data-processing jobs.
Perhaps use VOLUNTARY, but reserve some CPUs for compute tasks that won't service user requests -- specifically, using the "isolcpus=" boot param to isolate some of the CPUs from "interruption noise". The kernel config help says:
CONFIG_CPU_ISOLATION:
Make sure that CPUs running critical tasks are not disturbed by any source of "noise" such as unbound workqueues, timers, kthreads... Unbound jobs get offloaded to housekeeping CPUs. This is driven by the "isolcpus=" boot parameter.
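Something like the following is what I have in mind -- an untested sketch; the CPU numbers are just an example for an 8-core box, and nohz_full=/rcu_nocbs= assume the matching config options (NO_HZ_FULL, RCU callback offloading) are enabled:

    # /etc/default/grub: keep cores 4-7 free of general scheduler noise
    GRUB_CMDLINE_LINUX_DEFAULT="... isolcpus=4-7 nohz_full=4-7 rcu_nocbs=4-7"

    # regenerate the grub config, reboot, then pin the compute job there:
    taskset -c 4-7 ./compute_job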
Perhaps reserving half the CPUs for critical tasks, so they are not disturbed, and the remainder for interactive serving might serve the purpose.
It sounds like not having compute jobs interrupted is a key requirement of what you are looking for? In 'xconfig', the option is right below the Preemption Model, and since the isolation itself is a boot parameter, it has the benefit of being configurable per machine.
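For reference, you can check what a running kernel was built with, no rebuild needed -- the output below is what I'd expect from a PREEMPT_NONE build:

    $ grep PREEMPT /boot/config-$(uname -r)
    CONFIG_PREEMPT_NONE=y
    # CONFIG_PREEMPT_VOLUNTARY is not set
    # CONFIG_PREEMPT is not set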
Do you have any links on the throughput impact of the preemption model, versus the effects of CPU isolation and/or changes to the timeslice values?
Odd -- I thought there was better support for changing timeslice values, but all I see in the kernel config seems to relate to a graphics chip (DRM_I915_TIMESLICE_DURATION). I must be confusing it with the NT kernel's knobs for that.
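The closest thing I can find in mainline are the CFS granularity sysctls, available when CONFIG_SCHED_DEBUG is on (the values below are illustrative defaults, not recommendations):

    $ sysctl kernel.sched_latency_ns kernel.sched_min_granularity_ns kernel.sched_wakeup_granularity_ns
    kernel.sched_latency_ns = 24000000
    kernel.sched_min_granularity_ns = 3000000
    kernel.sched_wakeup_granularity_ns = 4000000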
Do you know of a test case that fails with PREEMPT_VOLUNTARY vs. PREEMPT_NONE, or where one is seen to perform significantly better under some workload? How typical is that workload?
Note: PREEMPT_NONE is for kernels used for scientific computation that needs the raw processing power of the machine, regardless of scheduling latencies. I'm not doing that kind of straight computing; instead I use the server mostly for serving incoming network requests. Depending on packet and request size, one 10Gb connection can saturate a single CPU.
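Back-of-the-envelope, for anyone checking that claim:

    10 Gb/s ~ 1.25 GB/s on the wire
    1.25 GB/s / 1500 bytes per frame ~ 830,000 packets/sec
    1.25 GB/s / 9000 bytes per frame ~ 140,000 packets/sec

so standard frames mean roughly 6x the per-packet processing load of jumbo frames.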
Supposedly RedHat disables various C-states when 9K network packets are in use, as the long wakeup latencies imposed by deeper power-saving states can theoretically cause packet loss. If the CPU is not preemptible, wouldn't the same situation exist? Conversely, I see some people disable jumbo packets altogether, which, compared to standard frames, causes network throughput to drop from 400-600MB/s down to 110MB/s. That's unacceptable for servers serving internal network requests.
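If someone wants to try either side of that trade-off, a rough sketch (the interface name is an example, and the max_cstate parameters are Intel-specific):

    # boot params to keep CPUs out of deep power-saving states
    intel_idle.max_cstate=1 processor.max_cstate=1

    # enable jumbo frames on an interface
    ip link set dev eth0 mtu 9000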
How much does PREEMPT_NONE benefit which workloads over VOLUNTARY?
This can best be answered by the performance team, who have been carrying out the tests and evaluations. And I'm off from today, so any further reply will be delayed.

thanks,

Takashi