[my previous email to the mailing list got stuck waiting for moderator
approval, so I am changing my From to suse.cz]
On Tue 28-08-18 08:52:58, Stefan Priebe - Profihost AG wrote:
Am 28.08.2018 um 08:25 schrieb Michal Hocko:
On Sat 25-08-18 09:54:34, Stefan Priebe - Profihost AG wrote:
Not sure if related, but since upgrading from kernel 4.4 to the 4.12-based
SLES15 kernel I have twice seen the following traces while the system was
[245513.362669] kvm: page allocation stalls for 194572ms, order:9,
This is a THP allocation and from __GFP_DIRECT_RECLAIM I assume it is
within a MADV_HUGEPAGE mapping. What is your defrag setting?
Spending 194s reclaiming/compacting is definitely way too much for THP.
This looks like an issue reported by Andrea recently
We do not have a proper solution for these pathological cases yet
Yes, this sounds pretty much exactly like the issue I'm seeing on multiple KVM
hosts. So this is a regression in SLES15?
Well, spending so much time in the allocation is clearly undesirable.
# cat /sys/kernel/mm/transparent_hugepage/defrag
always defer defer+madvise [madvise] never
OK, so the immediate workaround would be to weaken the defrag mode to
defer. You will likely not get as many THPs but the stalls should be gone.
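
For reference, a minimal sketch of how that workaround could be applied from
a root shell. The helper names (thp_active_mode, thp_set_defrag) and the
THP_DEFRAG variable are made up for illustration; only the sysfs path and
the mode names come from the output quoted above.

```shell
# Path of the THP defrag knob (same file as in the cat output above).
THP_DEFRAG=${THP_DEFRAG:-/sys/kernel/mm/transparent_hugepage/defrag}

# Hypothetical helper: print the active mode, i.e. the word in brackets,
# from a line such as "always defer defer+madvise [madvise] never".
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Hypothetical helper: write a mode to the knob.
# $1 = mode, $2 = optional file (defaults to the sysfs path, needs root).
thp_set_defrag() {
    printf '%s\n' "$1" > "${2:-$THP_DEFRAG}"
}

# Usage, as root:
#   thp_active_mode "$(cat "$THP_DEFRAG")"
#   thp_set_defrag defer
```

A value written to sysfs this way does not survive a reboot, so it would
have to be reapplied at boot if the workaround helps.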
If you are brave enough and willing to play a bit then you can try to
apply the patch I was proposing https://marc.info/?l=linux-mm&m=153493606221018
I guess we will end up with a different solution in the end but seeing
how this behaves if you have a workload which triggers the bad behavior
would be valuable.