Hi,

On 28.08.2018 at 08:25, Michal Hocko wrote:
On Sat 25-08-18 09:54:34, Stefan Priebe - Profihost AG wrote:
Hello,
not sure if this is related, but since upgrading from kernel 4.4 to the 4.12-based SLES15 kernel I have twice seen the following trace while the system was nearly unusable:

[245513.362669] kvm: page allocation stalls for 194572ms, order:9, mode:0x4740ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
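For anyone reading along, the numbers in that stall line can be pulled apart with a small Python sketch. It assumes 4 KiB base pages (as on x86-64), and the names parse_stall and PAGE_SIZE are made up for illustration; nothing here comes from the kernel sources.

```python
import re

# The stall line as reported above, split only for readability.
LINE = (
    "[245513.362669] kvm: page allocation stalls for 194572ms, order:9, "
    "mode:0x4740ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|"
    "__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|__GFP_MOVABLE|"
    "__GFP_DIRECT_RECLAIM), nodemask=(null)"
)

PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def parse_stall(line):
    """Extract stall duration and allocation order from a stall warning."""
    m = re.search(r"page allocation stalls for (\d+)ms, order:(\d+)", line)
    if m is None:
        return None
    stall_ms = int(m.group(1))
    order = int(m.group(2))
    return {
        "stall_s": stall_ms / 1000.0,   # milliseconds to seconds
        "order": order,
        "bytes": PAGE_SIZE << order,    # 2**order contiguous pages
    }


info = parse_stall(LINE)
print(info)
```

Under the 4 KiB assumption an order-9 request is 512 contiguous pages, i.e. 2 MiB, which matches the size of a transparent huge page on x86-64.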
This is a THP allocation, and from __GFP_DIRECT_RECLAIM I assume it is within a MADV_HUGEPAGE mapping. What is your defrag setting?
cat /sys/kernel/mm/transparent_hugepage/defrag
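To check the setting on several hosts without reading the output by eye, a minimal Python sketch could look like the following. The kernel marks the active choice in this sysfs file with square brackets; the helper names active_setting and read_defrag are hypothetical and only for illustration.

```python
import re
from pathlib import Path

# sysfs file named in the question above
DEFRAG_PATH = Path("/sys/kernel/mm/transparent_hugepage/defrag")


def active_setting(text):
    """Return the bracketed (currently active) value, or None if absent."""
    m = re.search(r"\[([^\]]+)\]", text)
    return m.group(1) if m else None


def read_defrag(path=DEFRAG_PATH):
    """Read the sysfs file; return None if it is missing or unreadable."""
    try:
        return active_setting(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_defrag())
```

On a host without THP support the file does not exist, so read_defrag returns None rather than raising.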
Spending 194s reclaiming/compacting is definitely way too much for THP. This looks like an issue reported by Andrea recently:
http://lkml.kernel.org/r/20180820032204.9591-1-aarcange@redhat.com
We do not have a proper solution for these pathological cases yet though.
Yes, this sounds pretty much exactly like the issue I'm seeing on multiple KVM hosts. So is this a regression in SLES15?

# cat /sys/kernel/mm/transparent_hugepage/defrag
always defer defer+madvise [madvise] never

Greets,
Stefan