On Thu, 10 Sep 2020 01:47:53 +0200, L A Walsh wrote:
On 9/6/2020 11:52 PM, Takashi Iwai wrote:
If PREEMPT_NONE gave lower throughput for a pure server task (i.e. not watching video on that machine or doing interactive tasks there), something must have been broken and it should have been fixed.
I think you misunderstand.
What is probably the most common thing a server does? My guess would be that it serves disk space.
It's not that someone is watching video ON the MACHINE, but that multi-media content is likely to be *stored* or *hosted* on the server. That, and web pages and DNS lookups, for example.
What does a server do, in your view? In my view, it "serves". It does tasks for users who don't have the media or web sites on their own machines. Ideally, DNS lookups will take <10ms. If you load a busy web page, it can easily trigger 20-30 DNS lookups or more. Hopefully most would be in cache for commonly viewed sites -- but I've noticed a lot of short-TTL DNS records that keep those answers from being cached for long.
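(You can see how long a record stays cacheable from the TTL column of a dig answer; the hostname and numbers here are just illustrative:

    $ dig +noall +answer www.example.com
    www.example.com.  30  IN  A  93.184.216.34

that "30" means resolvers may only cache the answer for 30 seconds.)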
But the main place I'd see non-preempt making a difference would be in large video streams hosted on the server (not being watched on the server).
Well, that's of course a typical use case of servers, and I understand it well, too. The point is that PREEMPT_NONE came from the SLE *Server* product -- the business SUSE has been doing for decades -- and the config was chosen through evaluation of various benchmarks and QA to cover exactly this kind of use case. If you really have a performance problem with that setup and work around it with PREEMPT_VOLUNTARY, the underlying problem should be fixed; PREEMPT_VOLUNTARY was never meant as a solution, from the beginning of its introduction. That's a difference from a full RT.
PREEMPT_VOLUNTARY actually has a significant negative performance impact wrt throughput, as measured by the benchmarks. That's the very reason SLE doesn't use this option for kernel-default.
I'm sure it depends on your workload. If you are doing pure batch work and really not doing "server" tasks (serving resources to a user), just computing... I can see PREEMPT_NONE being best.
What type of workloads are you using for servers -- ones that serve multiple users doing interactive tasks, or ones that are mostly compute-bound, running long, large jobs?
Again, if anything goes wrong with PREEMPT_NONE for the typical server usage, it must be fixed. PREEMPT_VOLUNTARY is not meant as a solution for such a problem.
--- It may come down to definitions: how you define a server and its workload. A server whose function is serving several or many interactive users needs more responsiveness than one running compute-bound or long-running data-processing jobs.
Perhaps use VOLUNTARY, but reserve some CPUs for compute tasks that won't service user requests -- specifically, using the "isolcpus=" boot param to isolate some of the CPUs from "interruption noise". The kernel config help says:
CONFIG_CPU_ISOLATION:
Make sure that CPUs running critical tasks are not disturbed by any source of "noise" such as unbound workqueues, timers, kthreads... Unbound jobs get offloaded to housekeeping CPUs. This is driven by the "isolcpus=" boot parameter.
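Something like the following is what I have in mind -- an untested sketch; the CPU numbers are just an example for an 8-core box, and nohz_full=/rcu_nocbs= assume the matching config options (NO_HZ_FULL, RCU callback offloading) are enabled:

    # /etc/default/grub: keep cores 4-7 free of general scheduler noise
    GRUB_CMDLINE_LINUX_DEFAULT="... isolcpus=4-7 nohz_full=4-7 rcu_nocbs=4-7"

    # regenerate the grub config, reboot, then pin the compute job there:
    taskset -c 4-7 ./compute_job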
Perhaps reserving half the CPUs for critical tasks, so they are not disturbed, and the remainder for interactive serving might serve the purpose.
It sounds like not having compute jobs interrupted is a key requirement of what you are looking for? In 'xconfig', the option is right below the Preemption Model, and since the isolation itself is a boot parameter, it has the benefit of being configurable per machine.
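For reference, you can check what a running kernel was built with, no rebuild needed -- the output below is what I'd expect from a PREEMPT_NONE build:

    $ grep PREEMPT /boot/config-$(uname -r)
    CONFIG_PREEMPT_NONE=y
    # CONFIG_PREEMPT_VOLUNTARY is not set
    # CONFIG_PREEMPT is not set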
Do you have any links on the throughput impact of the preemption model, versus the effects of CPU isolation and/or changes to the timeslice values?
Odd -- I thought there was better support for changing timeslice values, but all I see in the kernel config seems to relate to a graphics chip (DRM_I915_TIMESLICE_DURATION). I must be confusing it with the NT kernel's knobs for that.
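The closest thing I can find in mainline are the CFS granularity sysctls, available when CONFIG_SCHED_DEBUG is on (the values below are illustrative defaults, not recommendations):

    $ sysctl kernel.sched_latency_ns kernel.sched_min_granularity_ns kernel.sched_wakeup_granularity_ns
    kernel.sched_latency_ns = 24000000
    kernel.sched_min_granularity_ns = 3000000
    kernel.sched_wakeup_granularity_ns = 4000000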
Do you know of a test case that fails with PREEMPT_VOLUNTARY vs. PREEMPT_NONE, or where one is seen to perform significantly better under some workload? How typical is that workload?
Note: PREEMPT_NONE is for kernels used for scientific computation that needs the raw processing power of the machine, regardless of scheduling latencies. I'm not doing that kind of straight computing; instead I use the server mostly for serving incoming network requests. Depending on packet and request size, one 10Gb connection can saturate a single CPU.
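Back-of-the-envelope, for anyone checking that claim:

    10 Gb/s ~ 1.25 GB/s on the wire
    1.25 GB/s / 1500 bytes per frame ~ 830,000 packets/sec
    1.25 GB/s / 9000 bytes per frame ~ 140,000 packets/sec

so standard frames mean roughly 6x the per-packet processing load of jumbo frames.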
Supposedly RedHat disables various C-states when 9K network packets are in use, as the long wakeup latencies imposed by deeper power-saving states can theoretically cause packet loss. If the CPU is not preemptible, wouldn't the same situation exist? Conversely, I see some people disable jumbo packets altogether, which, compared to standard frames, causes network throughput to drop from 400-600MB/s down to 110MB/s. That's unacceptable for servers serving internal network requests.
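If someone wants to try either side of that trade-off, a rough sketch (the interface name is an example, and the max_cstate parameters are Intel-specific):

    # boot params to keep CPUs out of deep power-saving states
    intel_idle.max_cstate=1 processor.max_cstate=1

    # enable jumbo frames on an interface
    ip link set dev eth0 mtu 9000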
How much does PREEMPT_NONE benefit which workloads over VOLUNTARY?
This can best be answered by the performance team, who have been carrying out the tests and evaluations. And I'm off from today, so any further reply will be delayed.

thanks,

Takashi