On Wednesday, 16 June 2021, 11:16:50 CEST, Thomas Hartwig wrote:
thank you very much for the information and your work, I tested the
preempt kernel 5.12 and it is performing very well.
Glad to hear that. I tend to delay the kernel version bumps a bit, but please
use keeppackages=1 on that repo, e.g. "zypper mr -k repo".
I think I can use these in production environments as
well. This is good.
Rest assured that when I know someone is using these builds in production, I
take extra care with them. My usual workflow is fetching from the opensuse
kernel, rebasing, and pushing to this project. The packages in my local OBS
are linked from there.
It would be nice to have some for distribution releases 15.2/15.3, but as
long as the repository is isolated for the kernels only I can use the
Tumbleweed version for sure.
The Leap builds have different issues that are manageable, but if the TW
builds do for you, please go for it.
Meanwhile I want to give some more technical background to the issues
monitored. Please note I am neither a kernel developer nor an expert.
Instead I am an application developer with extended system knowledge and
sometimes vertical hardware diving, but mainly I am locked to application
development. This is why I am limited in time and in know-how about what is
going on in specific kernel versions.
The systems I work with are specialized video capturing systems working
with industrial high speed cameras based on GigE (Basler/Pylon software
driver). All this is TCP/IP stack based, so IMHO this is the most
critical section. The systems are storing video frames in memory and
these are further processed in complex multi-threaded applications.
Just to give some numbers here, which show impressively what a
Linux system is capable of:
6 cameras are streaming at 4000 frames per second each with a bandwidth
of 100 MB/s which is handled by a fiber optic network card X710. So
roughly 600 MB/s are handled. The CPU is an Intel Xeon Silver 4216 16 core.
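A quick back-of-the-envelope check of these figures, as a minimal sketch. It assumes that the 100 MB/s is per camera (as the 600 MB/s total implies) and that 1 MB means 10**6 bytes; the variable names are purely illustrative.

```python
# Figures quoted in the mail; the interpretation of "MB" is an assumption.
CAMERAS = 6
FPS_PER_CAMERA = 4000
MB_PER_S_PER_CAMERA = 100          # assumed: 1 MB = 10**6 bytes

# Aggregate data rate over all cameras.
total_mb_per_s = CAMERAS * MB_PER_S_PER_CAMERA

# Average payload per frame and time between two frames of one camera.
bytes_per_frame = MB_PER_S_PER_CAMERA * 10**6 // FPS_PER_CAMERA
frame_interval_us = 10**6 / FPS_PER_CAMERA

# Frames per second the host has to take in across all cameras.
total_frames_per_s = CAMERAS * FPS_PER_CAMERA

print("total rate        :", total_mb_per_s, "MB/s")
print("bytes per frame   :", bytes_per_frame)
print("frame interval    :", frame_interval_us, "us")
print("frames per second :", total_frames_per_s)
```

Under these assumptions a single camera delivers a new frame every 250 microseconds, which is the time budget any per-frame handling has to fit into.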
In the end the number of cameras does not matter for the problems we
witnessed; it can happen with just 1 camera. The problems are not related to our
application since it can be reproduced with vendor original test
software (even without storing anything). So it is rather a latency than
a bandwidth problem. From time to time there are single frames lost when
the system is not configured for high speed aka CONFIG_HZ. Unfortunately
we cannot track it easily and have no possibility to influence the
driver itself. But from our observation it is like the Linux system
takes some breaks from time to time to manage interrupt handlers for the
network stack. Then the camera buffers cannot be processed by the
driver in time and are lost. I cannot explain why 1000 Hz instead of
250 Hz is making this difference. We have tried to optimize almost all
other things imaginable, like TCP buffers, the network card driver and so on.
Maybe IOWAIT is an issue to be considered further, but we really are not
storing anything at all when we test...
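To put numbers on the two CONFIG_HZ settings mentioned above: the timer tick period is 1/HZ, so it can be compared with the frame rate of one camera. The sketch below is only this arithmetic, with hypothetical helper names; it does not claim that the tick is the actual cause of the lost frames.

```python
def tick_period_ms(hz):
    """Length of one timer tick in milliseconds for a given CONFIG_HZ."""
    return 1000 / hz

def frames_per_tick(hz, fps):
    """How many frames one camera delivers during a single timer tick."""
    return fps / hz

FPS_PER_CAMERA = 4000  # figure quoted in the mail

for hz in (250, 1000):
    print("HZ=%d: tick %.1f ms, %.0f frames per camera per tick"
          % (hz, tick_period_ms(hz), frames_per_tick(hz, FPS_PER_CAMERA)))
```

If some piece of work is only picked up on the next tick, then at 250 Hz up to 16 frames per camera can arrive in the meantime, versus 4 at 1000 Hz. That would be consistent with a latency rather than a bandwidth problem, but it remains an assumption.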
Wow, that sounds like real fun. And bit dance.
This asks for playing with qdiscs and NIC offloading of course.
Probably not applicable, but wasn't SCTP invented for this area?
As I said, I am limited in time to set up a complete test environment and
dig deep. All I can say: the preempt kernel at 1000 Hz is
This is valuable feedback of course, even if not everyone likes it.
While preemption is on the way of getting a dynamic boot option, HZ isn't as
far as I know. Cannot imagine giving up on const'ness of HZ.
This is a strong vote for a separate preempt flavor.
However, you should test the official default (TW) kernel with the preempt
option, once it appears, as well, if time permits.
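When comparing kernels, it may help to confirm which tick rate and preemption setting a given build was compiled with. A minimal sketch, assuming the kernel config is installed as /boot/config-<release> (common, but not guaranteed); the helper names are hypothetical, and a preemption mode chosen at boot time would not show up in this file.

```python
import os
import platform

def parse_kernel_config(text):
    """Return {option: value} for each 'CONFIG_X=value' line; comments are skipped."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.startswith("CONFIG_"):
            options[key] = value
    return options

def tick_and_preempt(options):
    """Extract (CONFIG_HZ as int or None, True if CONFIG_PREEMPT=y)."""
    hz = int(options["CONFIG_HZ"]) if "CONFIG_HZ" in options else None
    return hz, options.get("CONFIG_PREEMPT") == "y"

# Assumed location of the config of the running kernel.
config_path = "/boot/config-" + platform.release()
if os.path.exists(config_path):
    try:
        with open(config_path) as fh:
            print(tick_and_preempt(parse_kernel_config(fh.read())))
    except OSError as err:
        print("could not read", config_path, err)
```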
I hope this gives some insight and can be used as feedback from the
field to all who have worked on Linux systems. Thanks for this, and
thanks Pete for making this kernel.
My pleasure and good luck, Thomas!