Vlastimil Babka changed bug 1206848
What      Removed                      Added
CC                                     mgorman@suse.com
Assignee  kernel-bugs@opensuse.org     vbabka@suse.com
Flags     needinfo?(vbabka@suse.com)

Comment # 10 on bug 1206848 from
Thanks, I have found that compaction triggered from khugepaged to allocate a
huge page is stuck in a loop of isolating the same ranges for migration over
and over:

      khugepaged-106   [001] 35328.677790: mm_compaction_isolate_migratepages:
range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=15
      khugepaged-106   [001] 35328.677800: mm_compaction_isolate_migratepages:
range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=5
      khugepaged-106   [001] 35328.677805: mm_compaction_isolate_migratepages:
range=(0x14ee00 ~ 0x14f000) nr_scanned=36 nr_taken=16
      khugepaged-106   [001] 35328.677817: mm_compaction_isolate_migratepages:
range=(0x14ee00 ~ 0x14f000) nr_scanned=36 nr_taken=16
      khugepaged-106   [001] 35328.677828: mm_compaction_isolate_migratepages:
range=(0x140600 ~ 0x140800) nr_scanned=56 nr_taken=24
      khugepaged-106   [001] 35328.677845: mm_compaction_isolate_migratepages:
range=(0x149800 ~ 0x149958) nr_scanned=51 nr_taken=32
      khugepaged-106   [001] 35328.677867: mm_compaction_isolate_migratepages:
range=(0x149958 ~ 0x149a00) nr_scanned=25 nr_taken=16
      khugepaged-106   [001] 35328.677879: mm_compaction_isolate_migratepages:
range=(0x146000 ~ 0x146200) nr_scanned=47 nr_taken=12
      khugepaged-106   [001] 35328.677887: mm_compaction_isolate_migratepages:
range=(0x151e00 ~ 0x152000) nr_scanned=34 nr_taken=1
      khugepaged-106   [001] 35328.677890: mm_compaction_isolate_migratepages:
range=(0x158400 ~ 0x158460) nr_scanned=39 nr_taken=32
      khugepaged-106   [001] 35328.677913: mm_compaction_isolate_migratepages:
range=(0x158460 ~ 0x1584d0) nr_scanned=36 nr_taken=32
      khugepaged-106   [001] 35328.677935: mm_compaction_isolate_migratepages:
range=(0x1584d0 ~ 0x158600) nr_scanned=22 nr_taken=16
      khugepaged-106   [001] 35328.677947: mm_compaction_isolate_migratepages:
range=(0x165200 ~ 0x165258) nr_scanned=45 nr_taken=32
      khugepaged-106   [001] 35328.677969: mm_compaction_isolate_migratepages:
range=(0x165258 ~ 0x165400) nr_scanned=43 nr_taken=25
      khugepaged-106   [001] 35328.677986: mm_compaction_isolate_migratepages:
range=(0x15b600 ~ 0x15b800) nr_scanned=50 nr_taken=8
      khugepaged-106   [001] 35328.677992: mm_compaction_isolate_migratepages:
range=(0x163400 ~ 0x163464) nr_scanned=42 nr_taken=32
      khugepaged-106   [001] 35328.678015: mm_compaction_isolate_migratepages:
range=(0x163464 ~ 0x163514) nr_scanned=34 nr_taken=32
      khugepaged-106   [001] 35328.678039: mm_compaction_isolate_migratepages:
range=(0x163514 ~ 0x163600) nr_scanned=106 nr_taken=92
      khugepaged-106   [001] 35328.678099: mm_compaction_isolate_migratepages:
range=(0x134000 ~ 0x1340c8) nr_scanned=40 nr_taken=32
      khugepaged-106   [001] 35328.678121: mm_compaction_isolate_migratepages:
range=(0x1340c8 ~ 0x134200) nr_scanned=54 nr_taken=25
      khugepaged-106   [001] 35328.678139: mm_compaction_isolate_migratepages:
range=(0x12a200 ~ 0x12a400) nr_scanned=40 nr_taken=25
      khugepaged-106   [001] 35328.678155: mm_compaction_isolate_migratepages:
range=(0x12a200 ~ 0x12a400) nr_scanned=40 nr_taken=15
      khugepaged-106   [001] 35328.678168: mm_compaction_isolate_migratepages:
range=(0x16de00 ~ 0x16e000) nr_scanned=75 nr_taken=47
      khugepaged-106   [001] 35328.678200: mm_compaction_isolate_migratepages:
range=(0x16a200 ~ 0x16a400) nr_scanned=88 nr_taken=29
      khugepaged-106   [001] 35328.678219: mm_compaction_isolate_migratepages:
range=(0x174a00 ~ 0x174c00) nr_scanned=32 nr_taken=8
      khugepaged-106   [001] 35328.678227: mm_compaction_isolate_migratepages:
range=(0x167400 ~ 0x1674e0) nr_scanned=55 nr_taken=32
      khugepaged-106   [001] 35328.678250: mm_compaction_isolate_migratepages:
range=(0x1674e0 ~ 0x167500) nr_scanned=32 nr_taken=32
      khugepaged-106   [001] 35328.678271: mm_compaction_isolate_migratepages:
range=(0x167500 ~ 0x167600) nr_scanned=7 nr_taken=0
      khugepaged-106   [001] 35328.678272: mm_compaction_isolate_migratepages:
range=(0x165000 ~ 0x165100) nr_scanned=48 nr_taken=32
      khugepaged-106   [001] 35328.678295: mm_compaction_isolate_migratepages:
range=(0x165100 ~ 0x1651f8) nr_scanned=43 nr_taken=32
      khugepaged-106   [001] 35328.678316: mm_compaction_isolate_migratepages:
range=(0x1651f8 ~ 0x165200) nr_scanned=8 nr_taken=8
      khugepaged-106   [001] 35328.678323: mm_compaction_isolate_migratepages:
range=(0x156e00 ~ 0x157000) nr_scanned=52 nr_taken=16
      khugepaged-106   [001] 35328.678334: mm_compaction_isolate_migratepages:
range=(0x148800 ~ 0x148a00) nr_scanned=33 nr_taken=9
      khugepaged-106   [001] 35328.678341: mm_compaction_isolate_migratepages:
range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=15
      khugepaged-106   [001] 35328.678351: mm_compaction_isolate_migratepages:
range=(0x11ce00 ~ 0x11d000) nr_scanned=41 nr_taken=5

All of the isolated pages fail migration. Normally the scanned ranges would
increase linearly, but here they are being picked via fast_find_migrateblock().
Even so, this should not loop forever, because each range should be tried only
once. However, v6.1 contains commit 7efc3b726103 ("mm/compaction: fix set skip
in fast_find_migrateblock"), which removes one of the places that record that
a range should not be repeated. It looks like under certain circumstances the
recording in the other places can be avoided as well, and that is probably what
happened for you.
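
For anyone who wants to check their own trace for the same pattern, here is a
minimal sketch in Python that counts how often each range shows up in
mm_compaction_isolate_migratepages events. The helper name repeated_ranges()
and the regular expression are my own assumptions, written against the text
format quoted above; this is not an existing tool.

```python
import re
from collections import Counter

# Matches one mm_compaction_isolate_migratepages event as quoted above.
# \s+ tolerates the line wrap that mail clients insert before "range=".
EVENT_RE = re.compile(
    r"mm_compaction_isolate_migratepages:\s+"
    r"range=\((0x[0-9a-f]+) ~ (0x[0-9a-f]+)\)\s+"
    r"nr_scanned=(\d+)\s+nr_taken=(\d+)"
)


def repeated_ranges(trace_text):
    """Return {(start, end): count} for every range isolated more than once.

    start and end are the two hex values from "range=(... ~ ...)",
    converted to integers. (Hypothetical helper, for illustration only.)
    """
    counts = Counter(
        (int(m.group(1), 16), int(m.group(2), 16))
        for m in EVENT_RE.finditer(trace_text)
    )
    return {rng: n for rng, n in counts.items() if n > 1}
```

Calling repeated_ranges() on the saved trace text and printing the result
lists the ranges that were isolated more than once. In the excerpt above,
range 0x11ce00 ~ 0x11d000 appears both at the start and again at the end.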

I'm building a kernel with that commit reverted so you can test and confirm
whether that was it. I will post a link soon.

