[Bug 1206848] khugepaged at 100% CPU
https://bugzilla.suse.com/show_bug.cgi?id=1206848
https://bugzilla.suse.com/show_bug.cgi?id=1206848#c10

Vlastimil Babka <vbabka@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mgorman@suse.com
           Assignee|kernel-bugs@opensuse.org    |vbabka@suse.com
              Flags|needinfo?(vbabka@suse.com)  |

--- Comment #10 from Vlastimil Babka <vbabka@suse.com> ---
Thanks, I have found that compaction triggered from khugepaged to allocate a
huge page is stuck in a loop of isolating the same ranges for migration over
and over:

khugepaged-106 [001] 35328.677790: mm_compaction_isolate_migratepages: range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=15
khugepaged-106 [001] 35328.677800: mm_compaction_isolate_migratepages: range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=5
khugepaged-106 [001] 35328.677805: mm_compaction_isolate_migratepages: range=(0x14ee00 ~ 0x14f000) nr_scanned=36 nr_taken=16
khugepaged-106 [001] 35328.677817: mm_compaction_isolate_migratepages: range=(0x14ee00 ~ 0x14f000) nr_scanned=36 nr_taken=16
khugepaged-106 [001] 35328.677828: mm_compaction_isolate_migratepages: range=(0x140600 ~ 0x140800) nr_scanned=56 nr_taken=24
khugepaged-106 [001] 35328.677845: mm_compaction_isolate_migratepages: range=(0x149800 ~ 0x149958) nr_scanned=51 nr_taken=32
khugepaged-106 [001] 35328.677867: mm_compaction_isolate_migratepages: range=(0x149958 ~ 0x149a00) nr_scanned=25 nr_taken=16
khugepaged-106 [001] 35328.677879: mm_compaction_isolate_migratepages: range=(0x146000 ~ 0x146200) nr_scanned=47 nr_taken=12
khugepaged-106 [001] 35328.677887: mm_compaction_isolate_migratepages: range=(0x151e00 ~ 0x152000) nr_scanned=34 nr_taken=1
khugepaged-106 [001] 35328.677890: mm_compaction_isolate_migratepages: range=(0x158400 ~ 0x158460) nr_scanned=39 nr_taken=32
khugepaged-106 [001] 35328.677913: mm_compaction_isolate_migratepages: range=(0x158460 ~ 0x1584d0) nr_scanned=36 nr_taken=32
khugepaged-106 [001] 35328.677935: mm_compaction_isolate_migratepages: range=(0x1584d0 ~ 0x158600) nr_scanned=22 nr_taken=16
khugepaged-106 [001] 35328.677947: mm_compaction_isolate_migratepages: range=(0x165200 ~ 0x165258) nr_scanned=45 nr_taken=32
khugepaged-106 [001] 35328.677969: mm_compaction_isolate_migratepages: range=(0x165258 ~ 0x165400) nr_scanned=43 nr_taken=25
khugepaged-106 [001] 35328.677986: mm_compaction_isolate_migratepages: range=(0x15b600 ~ 0x15b800) nr_scanned=50 nr_taken=8
khugepaged-106 [001] 35328.677992: mm_compaction_isolate_migratepages: range=(0x163400 ~ 0x163464) nr_scanned=42 nr_taken=32
khugepaged-106 [001] 35328.678015: mm_compaction_isolate_migratepages: range=(0x163464 ~ 0x163514) nr_scanned=34 nr_taken=32
khugepaged-106 [001] 35328.678039: mm_compaction_isolate_migratepages: range=(0x163514 ~ 0x163600) nr_scanned=106 nr_taken=92
khugepaged-106 [001] 35328.678099: mm_compaction_isolate_migratepages: range=(0x134000 ~ 0x1340c8) nr_scanned=40 nr_taken=32
khugepaged-106 [001] 35328.678121: mm_compaction_isolate_migratepages: range=(0x1340c8 ~ 0x134200) nr_scanned=54 nr_taken=25
khugepaged-106 [001] 35328.678139: mm_compaction_isolate_migratepages: range=(0x12a200 ~ 0x12a400) nr_scanned=40 nr_taken=25
khugepaged-106 [001] 35328.678155: mm_compaction_isolate_migratepages: range=(0x12a200 ~ 0x12a400) nr_scanned=40 nr_taken=15
khugepaged-106 [001] 35328.678168: mm_compaction_isolate_migratepages: range=(0x16de00 ~ 0x16e000) nr_scanned=75 nr_taken=47
khugepaged-106 [001] 35328.678200: mm_compaction_isolate_migratepages: range=(0x16a200 ~ 0x16a400) nr_scanned=88 nr_taken=29
khugepaged-106 [001] 35328.678219: mm_compaction_isolate_migratepages: range=(0x174a00 ~ 0x174c00) nr_scanned=32 nr_taken=8
khugepaged-106 [001] 35328.678227: mm_compaction_isolate_migratepages: range=(0x167400 ~ 0x1674e0) nr_scanned=55 nr_taken=32
khugepaged-106 [001] 35328.678250: mm_compaction_isolate_migratepages: range=(0x1674e0 ~ 0x167500) nr_scanned=32 nr_taken=32
khugepaged-106 [001] 35328.678271: mm_compaction_isolate_migratepages: range=(0x167500 ~ 0x167600) nr_scanned=7 nr_taken=0
khugepaged-106 [001] 35328.678272: mm_compaction_isolate_migratepages: range=(0x165000 ~ 0x165100) nr_scanned=48 nr_taken=32
khugepaged-106 [001] 35328.678295: mm_compaction_isolate_migratepages: range=(0x165100 ~ 0x1651f8) nr_scanned=43 nr_taken=32
khugepaged-106 [001] 35328.678316: mm_compaction_isolate_migratepages: range=(0x1651f8 ~ 0x165200) nr_scanned=8 nr_taken=8
khugepaged-106 [001] 35328.678323: mm_compaction_isolate_migratepages: range=(0x156e00 ~ 0x157000) nr_scanned=52 nr_taken=16
khugepaged-106 [001] 35328.678334: mm_compaction_isolate_migratepages: range=(0x148800 ~ 0x148a00) nr_scanned=33 nr_taken=9
khugepaged-106 [001] 35328.678341: mm_compaction_isolate_migratepages: range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=15
khugepaged-106 [001] 35328.678351: mm_compaction_isolate_migratepages: range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=5

All of the isolated pages fail migration. Normally the range should be
linearly increasing, but here it's taking the ranges via
fast_find_migrateblock(). This should not be looping forever as each range
should be tried only once. However there's a commit in v6.1 - 7efc3b726103
("mm/compaction: fix set skip in fast_find_migrateblock") that removes
recording the information to not repeat a range from one place, and looks like
it's possible under certain circumstances to avoid the recording in the other
places, and that's probably happened for you.

I'm building a kernel with that commit reverted so you can test and confirm
whether that was it. I will post a link soon.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
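For anyone wanting to check a longer capture for the same pattern, here is a minimal sketch that tallies how often each migrate range shows up in mm_compaction_isolate_migratepages events. It is not part of any kernel tooling: the function name repeated_ranges and the regular expression are illustrative assumptions, and the expression is modelled only on the line format quoted in the comment above. The sample lines are copied from that trace.

```python
import re
from collections import defaultdict

# Assumed line shape, taken only from the trace quoted in this comment.
TRACE_RE = re.compile(
    r"(?P<ts>\d+\.\d+): mm_compaction_isolate_migratepages: "
    r"range=\((?P<start>0x[0-9a-f]+) ~ (?P<end>0x[0-9a-f]+)\) "
    r"nr_scanned=(?P<scanned>\d+) nr_taken=(?P<taken>\d+)"
)


def repeated_ranges(lines):
    """Return {(start_pfn, end_pfn): [timestamps]} for ranges seen more than once.

    Hypothetical helper for illustration; lines that do not match are skipped.
    """
    seen = defaultdict(list)
    for line in lines:
        m = TRACE_RE.search(line)
        if m is None:
            continue  # not an isolate_migratepages event
        key = (int(m["start"], 16), int(m["end"], 16))
        seen[key].append(m["ts"])
    # Keep only ranges that were isolated at least twice.
    return {k: v for k, v in seen.items() if len(v) > 1}


# A few lines copied verbatim from the trace above.
SAMPLE = [
    "khugepaged-106 [001] 35328.677790: mm_compaction_isolate_migratepages: range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=15",
    "khugepaged-106 [001] 35328.677800: mm_compaction_isolate_migratepages: range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=5",
    "khugepaged-106 [001] 35328.677828: mm_compaction_isolate_migratepages: range=(0x140600 ~ 0x140800) nr_scanned=56 nr_taken=24",
    "khugepaged-106 [001] 35328.678341: mm_compaction_isolate_migratepages: range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=15",
    "khugepaged-106 [001] 35328.678351: mm_compaction_isolate_migratepages: range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=5",
]

if __name__ == "__main__":
    for (start, end), stamps in repeated_ranges(SAMPLE).items():
        print(f"range {start:#x} ~ {end:#x} isolated {len(stamps)} times: {stamps}")
```

A migrate scanner that advances linearly would not be expected to revisit a range within one pass, so any range reported here is a candidate for the loop described above. A split range such as 0x149800 ~ 0x149958 followed by 0x149958 ~ 0x149a00 is keyed separately, which is intentional: those are consecutive pieces of one block rather than a repeat.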