[opensuse-kernel] "SLAB: Unable to allocate memory" vs. vm.swappiness & vm/vfs_cache_pressure?

28 Feb 2013

      On a 4GB RAM edge server running 12.2 w/ kernel 3.4.28/x86_64 I'm seeing
an increasing frequency of "SLAB: Unable to allocate memory on node 0"
(http://pastebin.com/xDPgtut8) errors in syslog, e.g.:

	cat /var/log/messages
		...
		Feb 28 09:24:30 core kernel: [777611.883853] The
		following is only an harmless informational message.
		Feb 28 09:24:30 core kernel: [777611.883860] Unless you
		get a _continuous_flood_ of these messages it means
		Feb 28 09:24:30 core kernel: [777611.883864] everything
		is working fine. Allocations from irqs cannot be
		Feb 28 09:24:30 core kernel: [777611.883868] perfectly
		reliable and the kernel is designed to handle that.
		Feb 28 09:24:30 core kernel: [777611.883873] swapper/2:
		page allocation failure. order:4, mode:0x20
		Feb 28 09:24:30 core kernel: [777611.883880] Pid: 0,
		comm: swapper/2 Tainted: P        W  O
		3.4.28-2.20-default #1
		Feb 28 09:24:30 core kernel: [777611.883885] Call Trace:
		Feb 28 09:24:30 core kernel: [777611.883908] 
		[<ffffffff81004598>] dump_trace+0x78/0x2c0
		Feb 28 09:24:30 core kernel: [777611.883921] 
		[<ffffffff81532e6a>] dump_stack+0x69/0x6f
		Feb 28 09:24:30 core kernel: [777611.883933] 
		[<ffffffff810fdf02>] warn_alloc_failed+0x102/0x1b0
		Feb 28 09:24:30 core kernel: [777611.883946] 
		[<ffffffff81101cfb>] __alloc_pages_nodemask+0x68b/0x9a0
		Feb 28 09:24:30 core kernel: [777611.883959] 
		[<ffffffff811401e2>] kmem_getpages+0x62/0x190
		Feb 28 09:24:30 core kernel: [777611.883969] 
		[<ffffffff81140ddf>] fallback_alloc+0x21f/0x270
		Feb 28 09:24:30 core kernel: [777611.883981] 
		[<ffffffff8114153b>] __kmalloc+0x14b/0x1d0
		Feb 28 09:24:30 core kernel: [777611.883994] 
		[<ffffffff8143e87c>] pskb_expand_head+0x8c/0x310
		Feb 28 09:24:30 core kernel: [777611.884006] 
		[<ffffffff814d9355>] ip6_forward+0x515/0x770
		Feb 28 09:24:30 core kernel: [777611.884027] 
		[<ffffffff81449712>] __netif_receive_skb+0x532/0x600
		Feb 28 09:24:30 core kernel: [777611.884039] 
		[<ffffffff81449881>] process_backlog+0xa1/0x180
		Feb 28 09:24:30 core kernel: [777611.884049] 
		[<ffffffff8144a280>] net_rx_action+0x140/0x230
		Feb 28 09:24:30 core kernel: [777611.884062] 
		[<ffffffff810459be>] __do_softirq+0xae/0x1c0
		Feb 28 09:24:30 core kernel: [777611.884073] 
		[<ffffffff8154724c>] call_softirq+0x1c/0x30
		Feb 28 09:24:30 core kernel: [777611.884082] 
		[<ffffffff810044e5>] do_softirq+0x75/0xb0
		Feb 28 09:24:30 core kernel: [777611.884094] 
		[<ffffffff81045d55>] irq_exit+0xa5/0xb0
		Feb 28 09:24:30 core kernel: [777611.884105] 
		[<ffffffff8100413b>] do_IRQ+0x5b/0xd0
		Feb 28 09:24:30 core kernel: [777611.884115] 
		[<ffffffff8153e5ea>] common_interrupt+0x6a/0x6a
		Feb 28 09:24:30 core kernel: [777611.884129] 
		[<ffffffff8131861c>] intel_idle+0xec/0x160
		Feb 28 09:24:30 core kernel: [777611.884141] 
		[<ffffffff81412bfd>] cpuidle_idle_call+0x9d/0x240
		Feb 28 09:24:30 core kernel: [777611.884152] 
		[<ffffffff8100b5f5>] cpu_idle+0x65/0xd0
		Feb 28 09:24:30 core kernel: [777611.884163] 
		[<ffffffff8152be3d>] start_secondary+0x208/0x20d
		Feb 28 09:24:30 core kernel: [777611.884171] SLAB:
		Unable to allocate memory on node 0 (gfp=0x20)
		Feb 28 09:24:30 core kernel: [777611.884176]   cache:
		size-65536, object size: 65536, order: 4
		Feb 28 09:24:30 core kernel: [777611.884184]   node 0:
		slabs: 19/19, objs: 19/19, free: 0
		Feb 28 09:24:30 core kernel: [777611.886007] The
		following is only an harmless informational message.
		Feb 28 09:24:30 core kernel: [777611.886015] Unless you
		get a _continuous_flood_ of these messages it means
		Feb 28 09:24:30 core kernel: [777611.886018] everything
		is working fine. Allocations from irqs cannot be
		Feb 28 09:24:30 core kernel: [777611.886022] perfectly
		reliable and the kernel is designed to handle that.
		Feb 28 09:24:30 core kernel: [777611.886027] kswapd0:
		page allocation failure. order:4, mode:0x20
		Feb 28 09:24:30 core kernel: [777611.886034] Pid: 35,
		comm: kswapd0 Tainted: P        W  O 3.4.28-2.20-default
		#1
		Feb 28 09:24:30 core kernel: [777611.886038] Call Trace:
		Feb 28 09:24:30 core kernel: [777611.886061] 
		[<ffffffff81004598>] dump_trace+0x78/0x2c0
		Feb 28 09:24:30 core kernel: [777611.886073] 
		[<ffffffff81532e6a>] dump_stack+0x69/0x6f
		Feb 28 09:24:30 core kernel: [777611.886085] 
		[<ffffffff810fdf02>] warn_alloc_failed+0x102/0x1b0
		Feb 28 09:24:30 core kernel: [777611.886097] 
		[<ffffffff81101cfb>] __alloc_pages_nodemask+0x68b/0x9a0
		Feb 28 09:24:30 core kernel: [777611.886110] 
		[<ffffffff811401e2>] kmem_getpages+0x62/0x190
		Feb 28 09:24:30 core kernel: [777611.886120] 
		[<ffffffff81140ddf>] fallback_alloc+0x21f/0x270
		Feb 28 09:24:30 core kernel: [777611.886132] 
		[<ffffffff8114153b>] __kmalloc+0x14b/0x1d0
		Feb 28 09:24:30 core kernel: [777611.886145] 
		[<ffffffff8143e87c>] pskb_expand_head+0x8c/0x310
		Feb 28 09:24:30 core kernel: [777611.886157] 
		[<ffffffff814d9355>] ip6_forward+0x515/0x770
		Feb 28 09:24:30 core kernel: [777611.886178] 
		[<ffffffff81449712>] __netif_receive_skb+0x532/0x600
		Feb 28 09:24:30 core kernel: [777611.886194] 
		[<ffffffff81449881>] process_backlog+0xa1/0x180
		Feb 28 09:24:30 core kernel: [777611.886210] 
		[<ffffffff8144a280>] net_rx_action+0x140/0x230
		Feb 28 09:24:30 core kernel: [777611.886229] 
		[<ffffffff810459be>] __do_softirq+0xae/0x1c0
		Feb 28 09:24:30 core kernel: [777611.886245] 
		[<ffffffff8154724c>] call_softirq+0x1c/0x30
		Feb 28 09:24:30 core kernel: [777611.886259] 
		[<ffffffff810044e5>] do_softirq+0x75/0xb0
		Feb 28 09:24:30 core kernel: [777611.886276] 
		[<ffffffff81045d55>] irq_exit+0xa5/0xb0
		Feb 28 09:24:30 core kernel: [777611.886291] 
		[<ffffffff8100413b>] do_IRQ+0x5b/0xd0
		Feb 28 09:24:30 core kernel: [777611.886304] 
		[<ffffffff8153e5ea>] common_interrupt+0x6a/0x6a
		Feb 28 09:24:30 core kernel: [777611.886319] 
		[<ffffffff8153e1d9>]
		_raw_spin_unlock_irqrestore+0x9/0x30
		Feb 28 09:24:30 core kernel: [777611.886329] 
		[<ffffffff8111a1b8>] compaction_alloc+0x288/0x2b0
		Feb 28 09:24:30 core kernel: [777611.886342] 
		[<ffffffff81144d83>] migrate_pages+0xb3/0x500
		Feb 28 09:24:30 core kernel: [777611.886352] 
		[<ffffffff8111ad47>] compact_zone+0x1d7/0x320
		Feb 28 09:24:30 core kernel: [777611.886364] 
		[<ffffffff8111af3b>] __compact_pgdat+0xab/0x170
		Feb 28 09:24:30 core kernel: [777611.886374] 
		[<ffffffff8111b237>] compact_pgdat+0x27/0x30
		Feb 28 09:24:30 core kernel: [777611.886384] 
		[<ffffffff8110dc8c>] balance_pgdat+0x6cc/0x710
		Feb 28 09:24:30 core kernel: [777611.886396] 
		[<ffffffff8110de18>] kswapd+0x148/0x370
		Feb 28 09:24:30 core kernel: [777611.886407] 
		[<ffffffff8105ec55>] kthread+0x85/0x90
		Feb 28 09:24:30 core kernel: [777611.886417] 
		[<ffffffff81547154>] kernel_thread_helper+0x4/0x10
		Feb 28 09:24:30 core kernel: [777611.886426] SLAB:
		Unable to allocate memory on node 0 (gfp=0x20)
		Feb 28 09:24:30 core kernel: [777611.886430]   cache:
		size-65536, object size: 65536, order: 4
		Feb 28 09:24:30 core kernel: [777611.886439]   node 0:
		slabs: 15/19, objs: 15/19, free: 4
		Feb 28 09:24:38 core kernel: [777620.441021] The
		following is only an harmless informational message.
		Feb 28 09:24:38 core kernel: [777620.441029] Unless you
		get a _continuous_flood_ of these messages it means
		Feb 28 09:24:38 core kernel: [777620.441033] everything
		is working fine. Allocations from irqs cannot be
		Feb 28 09:24:38 core kernel: [777620.441036] perfectly
		reliable and the kernel is designed to handle that.
		Feb 28 09:24:38 core kernel: [777620.441042] swapper/1:
		page allocation failure. order:4, mode:0x20
		Feb 28 09:24:38 core kernel: [777620.441048] Pid: 0,
		comm: swapper/1 Tainted: P        W  O
		3.4.28-2.20-default #1
		Feb 28 09:24:38 core kernel: [777620.441053] Call Trace:
		Feb 28 09:24:38 core kernel: [777620.441076] 
		[<ffffffff81004598>] dump_trace+0x78/0x2c0
		Feb 28 09:24:38 core kernel: [777620.441089] 
		[<ffffffff81532e6a>] dump_stack+0x69/0x6f
		...

Atm, `slabtop` shows 99% usage for "ext4_inode_cache":

	slabtop
		 Active / Total Objects (% used)    : 1275578 / 1419483
		 (89.9%)
		 Active / Total Slabs (% used)      : 187764 / 187786
		 (100.0%)
		 Active / Total Caches (% used)     : 109 / 232 (47.0%)
		 Active / Total Size (% used)       : 639586.50K /
		 660294.86K (96.9%)
		 Minimum / Average / Maximum Object : 0.02K / 0.46K /
		 4096.00K

		  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE
		  NAME                  
		606252 601064  99%    0.85K 151563        4    606252K
		ext4_inode_cache
		380397 283799  74%    0.10K  10281       37     41124K
		buffer_head
		241600 231794  95%    0.19K  12080       20     48320K
		dentry
		 34986  34973  99%    0.54K   4998        7     19992K
		 inode_cache
		 27435  12674  46%    0.06K    465       59      1860K
		 size-64
		 21952  21115  96%    0.03K    196      112       784K
		 size-32
		 18734  18690  99%    0.11K    551       34      2204K
		 sysfs_dir_cache
		 16086  14359  89%    0.55K   2298        7      9192K
		 radix_tree_node
		 13475  11758  87%    0.05K    175       77       700K
		 jbd2_inode
		...

This is with default values of:

	cat /proc/sys/vm/vfs_cache_pressure
		100
	cat /proc/sys/vm/swappiness 
		60

Reading @
http://doc.opensuse.org/documentation/html/openSUSE/opensuse-tuning/cha.tuni...,
and taking a random stab at changing

	echo 200 > /proc/sys/vm/vfs_cache_pressure
	echo 25  > /proc/sys/vm/swappiness 

Seems to reduce -- thought not completely eliminate -- the syslog
errors.  `slabtop` shows no change.

After dropping cache

	sync && echo 3 > /proc/sys/vm/drop_caches

`slabtop` usage for "ext4-inode_cache" seems to have dropped to 74%,
although other items are still ~99%:

	slabtop
	    Active / Total Objects (% used)    : 130790 / 216154 (60.5%)
	     Active / Total Slabs (% used)      : 12482 / 12506 (99.8%)
	     Active / Total Caches (% used)     : 109 / 232 (47.0%)
	     Active / Total Size (% used)       : 40954.73K / 54620.83K
	     (75.0%)
	     Minimum / Average / Maximum Object : 0.02K / 0.25K /
	     4096.00K

	      OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME  
	     60740  22518  37%    0.19K   3037       20     12148K
	     dentry
	     23423   8006  34%    0.06K    397       59      1588K
	     size-64
!!           21952  21119  96%    0.03K    196      112       784K
size-32
!!           19397  19126  98%    0.54K   2771        7     11084K
inode_cache
!!           18734  18690  99%    0.11K    551       34      2204K
sysfs_dir_cache
	     16687   4514  27%    0.10K    451       37      1804K
	     buffer_head
	      4543   2173  47%    0.05K     59       77       236K
	      anon_vma_chain
	      4200   3013  71%    0.12K    140       30       560K
	      size-128
!!            4088   4072  99%    0.13K    146       28       584K
ext4_groupinfo_4k
!!            4084   4079  99%    1.00K   1021        4      4084K
size-1024
	      3784   2956  78%    0.17K    172       22       688K
	      vm_area_struct
	      3283    962  29%    0.55K    469        7      1876K
	      radix_tree_node
	      2596   1484  57%    0.06K     44       59       176K
	      anon_vma  
	      2160   1798  83%    0.02K     15      144        60K
	      dm_target_io
	      2116   1820  86%    0.04K     23       92        92K dm_io
	      2020   1687  83%    0.19K    101       20       404K
	      size-192
	      1925     45   2%    0.05K     25       77       100K
	      jbd2_inode
	      1845   1108  60%    0.25K    123       15       492K filp
	      1431   1344  93%    0.07K     27       53       108K
	      Acpi-Operand
-->           1408   1042  74%    0.85K    352        4      1408K
ext4_inode_cache
		...

With this config, post clear, for ~ 20 minutes, no more SLAB errors in
syslog.

My understanding of the effect/beahvior of Memory Subsytem tuning is
rudimentary at best.

Chatting in #opensuse-server, IIUC, it seems that it's not leaking, but
not automatically freeing up enough memory.

(1) Are the SLAB errors I'm seeing indicative of a problem in the sysctl
setting for swappiness & vfs_cache_pressure, particularly for server
use?  Or something else?

(2) What, if any, specific/initial guidance or recommendation exists for
these values for Opensuse server, not desktop, usage?

Thanks.

/C
-- 
To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-kernel+owner@opensuse.org

chrysippus＠operamail.com

tags

participants (4)