Comment # 84 on bug 1159882 from Michal Hocko

Sorry for the late reply. I was busy with other issues.

(In reply to Robert Delahunt from comment #75)
> In about five minutes:
> 
> This is with vmstats with your kernel with cgroups enabled:
> 
> http://www.puresimplicity.net/~delahunt/vmstat/mhocko4/
Diff between the first and last snapshot:
           1602004484   1602004607[diff]
pswpin          0       0
pswpout         0       48703
pgscan_kswapd   0       7177411
pgscan_direct   0       0
pgsteal_kswapd  0       7115294
pgsteal_direct  0       0

> This is with vmstats with your kernel with cgroups disabled:
> 
> http://www.puresimplicity.net/~delahunt/vmstat/mhocko5/
           1602004764   1602004958[diff]
pswpin          0       0
pswpout         0       75781
pgscan_kswapd   0       13728122
pgscan_direct   0       0
pgsteal_kswapd  0       13664819
pgsteal_direct  0       0

Both do swap out. The latter covers a longer time period (194s vs. 123s) and
scans roughly twice as many pages, which results in roughly twice as many pages
reclaimed and 55% more swapout.
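
The comparison above can be re-derived from the two diff columns. A minimal
sketch in Python, with the numbers copied from the tables above; the `runs`
dictionary and the helper names are only illustrative and not part of any tool
used in this bug:

```python
# Counter deltas copied from the two tables above (first vs. last snapshot).
runs = {
    "cgroups_enabled": {
        "start": 1602004484, "end": 1602004607,
        "pswpout": 48703,
        "pgscan_kswapd": 7177411,
        "pgsteal_kswapd": 7115294,
    },
    "cgroups_disabled": {
        "start": 1602004764, "end": 1602004958,
        "pswpout": 75781,
        "pgscan_kswapd": 13728122,
        "pgsteal_kswapd": 13664819,
    },
}


def duration(run):
    """Length of the observed window in seconds (snapshot names are epoch times)."""
    return run["end"] - run["start"]


def ratio(counter):
    """How much larger the counter delta is in the cgroups-disabled run."""
    return runs["cgroups_disabled"][counter] / runs["cgroups_enabled"][counter]


for name, run in runs.items():
    print(name, duration(run), "s")

for counter in ("pgscan_kswapd", "pgsteal_kswapd", "pswpout"):
    print(counter, round(ratio(counter), 2))
```

The scan and steal ratios come out at about 1.9 and the swapout ratio at about
1.56, which is where the "twice as many" and "55% more" figures come from.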

From that we can conclude (at a high level) that the swapout reflects the
overall reclaim and that having cgroups enabled or disabled doesn't play any
major role here. That is a good confirmation, because it would be really curious
to see a difference in behavior just from having cgroups enabled without being
used.

So let's focus on the cgroups enabled case for now. Let's have a look at
                  1602004840   1602004841[diff]
pswpin                 0           0
pswpout                3103        6801
pgsteal_kswapd         206006      148891
pgscan_kswapd          251736      148891
nr_active_anon         339953      -1840
nr_active_file         147150      7
nr_inactive_anon       38139       9382
nr_inactive_file       7263922     -2703
workingset_activate    170         0
workingset_refault     170         57
workingset_restore     0           0

From this we can conclude that
- some active anonymous pages have been rotated to the inactive list, which has
grown by much more than that, though, even when we consider the swapout. So
there must be some process allocating a nontrivial amount of anonymous memory,
and there is more going on than just the IO test case
- There is a ton of inactive page cache to reclaim from
- refaults are quite marginal
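
The first point can be made concrete with a bit of arithmetic on the [diff]
column. This is only a sketch of one way to read the numbers: it assumes that
all swapped-out pages were anonymous pages leaving the LRU lists, that nothing
was swapped back in (pswpin is 0), and a 4 KiB page size. Anonymous pages freed
in the same interval are not visible here, so the result is a lower bound.

```python
# Deltas copied from the [diff] column of the snapshot above (in pages).
delta = {
    "pswpout": 6801,
    "nr_active_anon": -1840,
    "nr_inactive_anon": 9382,
}

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def net_new_anon_pages(d):
    """Change in resident anon memory plus the pages that left via swapout."""
    resident_change = d["nr_active_anon"] + d["nr_inactive_anon"]
    return resident_change + d["pswpout"]


pages = net_new_anon_pages(delta)
print(pages, "pages, about", round(pages * PAGE_SIZE / 2**20, 1), "MiB")
```

Under those assumptions roughly 14343 pages (about 56 MiB) of new anonymous
memory showed up in that interval, far more than the 1840 pages that left the
active list.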

So this is in line with previous observations. I am inclined to drop the two
patches mentioned earlier (comment 60) as they are known to contribute
considerably. Unless Vlastimil or Mel speak up.

At this moment I am not sure how much more time I can spend on this, so I would
recommend using a more recent kernel.

Btw., regarding stalls: the data I have seen so far doesn't indicate any
reclaim-induced source of a potential stall. There is neither swap-in nor
direct reclaim, so the existing reclaim decisions do not look like the cause.
Maybe in your regular workload there is considerable swap-in (pswpin) going on.
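
For checking a regular workload the same way, here is a small sketch that flags
the counters named above. The function name and the choice of counters are
illustrative assumptions; the input is a dictionary of vmstat counter deltas
between two snapshots.

```python
def reclaim_stall_indicators(delta):
    """Return the counters that moved and could point at a reclaim-induced stall.

    Only swap-in and direct reclaim are checked, following the reasoning above;
    kswapd activity and swapout alone are not treated as stall indicators.
    """
    suspects = ("pswpin", "pgscan_direct", "pgsteal_direct")
    return [name for name in suspects if delta.get(name, 0) > 0]


# Deltas from the cgroups-enabled run above.
enabled = {
    "pswpin": 0,
    "pswpout": 48703,
    "pgscan_kswapd": 7177411,
    "pgscan_direct": 0,
    "pgsteal_kswapd": 7115294,
    "pgsteal_direct": 0,
}

print(reclaim_stall_indicators(enabled))
```

For the run above the list is empty. A non-zero pswpin or pgscan_direct delta
in the regular workload would be the thing to look at next.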