[Bug 444597] New: order-0 page allocation error
https://bugzilla.novell.com/show_bug.cgi?id=444597
User tiwai@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c1

          Summary: order-0 page allocation error
          Product: openSUSE 11.1
          Version: Factory
         Platform: x86-64
       OS/Version: Other
           Status: NEW
         Severity: Normal
         Priority: P5 - None
        Component: Kernel
       AssignedTo: bnc-team-screening@forge.provo.novell.com
       ReportedBy: tiwai@novell.com
        QAContact: qa@suse.de
         Found By: ---

The 2.6.27.5-1.1-default kernel (IIRC the beta4 kernel, too) often gives order-0 page allocation errors after about a day of running:

swapper: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x10200042
Pid: 0, comm: swapper Not tainted 2.6.27.5-1.1-default #1
Call Trace:
 [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
 [<ffffffff804a8fd5>] dump_stack+0x69/0x6f
 [<ffffffff8028f246>] __alloc_pages_internal+0x3eb/0x40b
 [<ffffffff802b42d0>] kmem_getpages+0x6f/0x12a
 [<ffffffff802b4c3e>] fallback_alloc+0x15a/0x20a
 [<ffffffff802b514d>] kmem_cache_alloc+0x137/0x16c
 [<ffffffffa0042de4>] scsi_pool_alloc_command+0x14/0x5b [scsi_mod]
 [<ffffffffa0042e3d>] scsi_host_alloc_command+0x12/0x56 [scsi_mod]
 [<ffffffffa0042f36>] __scsi_get_command+0xc/0x7a [scsi_mod]
 [<ffffffffa0042fd4>] scsi_get_command+0x30/0x98 [scsi_mod]
 [<ffffffffa0048673>] scsi_setup_fs_cmnd+0x69/0xb8 [scsi_mod]
 [<ffffffffa0131155>] sd_prep_fn+0x65/0x860 [sd_mod]
 [<ffffffff80345e81>] elv_next_request+0x153/0x20c
 [<ffffffffa0047ac2>] scsi_request_fn+0x88/0x52b [scsi_mod]
 [<ffffffff8034799d>] blk_invoke_request_fn+0x70/0x157
 [<ffffffff80347f94>] blk_run_queue+0x21/0x35
 [<ffffffffa00481d5>] scsi_next_command+0x2d/0x39 [scsi_mod]
 [<ffffffffa0048396>] scsi_end_request+0x86/0x98 [scsi_mod]
 [<ffffffffa00489ac>] scsi_io_completion+0x1a1/0x3ac [scsi_mod]
 [<ffffffff8034b9fc>] blk_done_softirq+0x70/0x7d
 [<ffffffff80246ba9>] __do_softirq+0x84/0x115
 [<ffffffff8020ddac>] call_softirq+0x1c/0x28
DWARF2 unwinder stuck at call_softirq+0x1c/0x28

-- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
--- Comment #1 from Takashi Iwai (tiwai@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c1
--- Comment #2 from Takashi Iwai (tiwai@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c2
Cyril Hrubis
Greg Kroah-Hartman
--- Comment #3 from Hannes Reinecke (hare@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c3
--- Comment #4 from Andreas Jaeger (aj@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c4
--- Comment #6 from Takashi Iwai (tiwai@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c6
--- Comment #7 from Takashi Iwai (tiwai@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c7
--- Comment #8 from Takashi Iwai (tiwai@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c8
--- Comment #9 from Tejun Heo (teheo@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c9
--- Comment #10 from Dan Elder (delder@novacoast.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c10
--- Comment #11 from Takashi Iwai (tiwai@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c11
Stephan Kulow
Lars Marowsky-Bree
--- Comment #12 from Jiri Kosina (jkosina@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c12
Yes, there is a memory leak, as can be confirmed by running
while true; do free; sleep 1; done
on an otherwise idle machine.

             total       used       free     shared    buffers     cached
Mem:       8180024     208180    7971844          0       6588      72896
-/+ buffers/cache:     128696    8051328
Swap:      8393952          0    8393952
             total       used       free     shared    buffers     cached
Mem:       8180024     208384    7971640          0       6588      72896
-/+ buffers/cache:     128900    8051124
Swap:      8393952          0    8393952
             total       used       free     shared    buffers     cached
Mem:       8180024     208204    7971820          0       6588      72900
-/+ buffers/cache:     128716    8051308
Swap:      8393952          0    8393952
             total       used       free     shared    buffers     cached
Mem:       8180024     208296    7971728          0       6588      72900
-/+ buffers/cache:     128808    8051216
Swap:      8393952          0    8393952
             total       used       free     shared    buffers     cached
Mem:       8180024     208296    7971728          0       6588      72900
-/+ buffers/cache:     128808    8051216
Swap:      8393952          0    8393952
             total       used       free     shared    buffers     cached
Mem:       8180024     208296    7971728          0       6596      72904
-/+ buffers/cache:     128796    8051228
Swap:      8393952          0    8393952
             total       used       free     shared    buffers     cached
Mem:       8180024     208340    7971684          0       6596      72904
-/+ buffers/cache:     128840    8051184
Swap:      8393952          0    8393952
             total       used       free     shared    buffers     cached
Mem:       8180024     208436    7971588          0       6596      72904
-/+ buffers/cache:     128936    8051088
Swap:      8393952          0    8393952
             total       used       free     shared    buffers     cached
Mem:       8180024     208520    7971504          0       6596      72904
-/+ buffers/cache:     129020    8051004
Swap:      8393952          0    8393952
Please note, the machine is _idle_, and all SCSI modules are unloaded. So I _really_ doubt it's a SCSI issue. Upgrading to blocker.
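To put a number on the drift instead of eyeballing the columns, one could fit a slope to the "used" values. A minimal sketch in plain Python (not from the bug report; the samples are the nine `free` readings quoted above, taken one second apart):

```python
# "used" column (kB) from the nine `free` samples above, 1 second apart.
used_kb = [208180, 208384, 208204, 208296, 208296,
           208296, 208340, 208436, 208520]

def leak_rate_kb_per_s(samples):
    """Least-squares slope of used memory vs. time (samples 1 s apart)."""
    n = len(samples)
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

print(f"used memory grows by ~{leak_rate_kb_per_s(used_kb):.0f} kB/s")
```

On these nine samples the fitted slope is roughly 30 kB/s; sustained over a day (86400 s) that would be on the order of 2.5 GB, which is at least consistent with failures showing up after about a day of uptime.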
I will try to reproduce it here. Meanwhile, does anyone who is able to reproduce the bug immediately have the output of

while true; do cat /proc/slabinfo; sleep 10; done

It would be interesting to see which slab grows excessively before the OOM triggers. Thanks.
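For anyone collecting those snapshots, here is a hypothetical helper (plain Python, not part of the bug report) that diffs two /proc/slabinfo captures and reports which caches grew the most. It assumes the slabinfo 2.x layout, where the third column is num_objs:

```python
def parse_slabinfo(text):
    """Map cache name -> num_objs from /proc/slabinfo-style text.

    Assumes the slabinfo 2.x layout (name, active_objs, num_objs, ...);
    the version line and column-header comment are skipped.
    """
    caches = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        caches[fields[0]] = int(fields[2])  # num_objs
    return caches

def top_growers(before_text, after_text, n=3):
    """Caches whose num_objs grew the most between two snapshots."""
    before = parse_slabinfo(before_text)
    after = parse_slabinfo(after_text)
    growth = {name: objs - before.get(name, 0) for name, objs in after.items()}
    return sorted(growth.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Feeding it two captures taken ten minutes apart would immediately single out a runaway cache such as the size-128 one discussed later in this thread.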
--- Comment #13 from Takashi Iwai (tiwai@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c13
--- Comment #14 from Miklos Szeredi (mszeredi@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c14
--- Comment #15 from Takashi Iwai (tiwai@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c15
--- Comment #16 from Uwe Menges (uwe.menges@sap.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c16
--- Comment #17 from Miklos Szeredi (mszeredi@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c17
Note that this happens not only in the SCSI code, as found in comments #4, #7 and #10.
Well, but how often do these happen? An occasional GFP_ATOMIC failure is perfectly fine; in fact, we had a patch clarifying this: "patches.fixes/oom-warning". Unfortunately that got reverted by "patches.suse/SoN-11-mm-page_alloc-emerg.patch", probably accidentally. That said, it's not good if we have a sudden increase in this kind of allocation failure in recent kernels.
--- Comment #18 from Takashi Iwai (tiwai@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c18
--- Comment #19 from Miklos Szeredi (mszeredi@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c19
As far as I've seen, the other code paths occur more often than the SCSI one does. And in those other code paths it's not GFP_ATOMIC.
Hmm, you're right. But they _are_ non-waiting allocations. The "mode=0x10000" seems to be common, which is __GFP_NOMEMALLOC, an allocation even less likely to succeed than GFP_ATOMIC. Some of those come from mempool allocations, and mempool adds __GFP_NOWARN to the flags, but that seems to get cleared somewhere. The problem is that there are so many reports in this one bug, and there's not much commonality between them, except that all of them are atomic allocations. I'll try to sort them out and find some pattern.
--- Comment #20 from Miklos Szeredi (mszeredi@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c20
--- Comment #21 from Miklos Szeredi (mszeredi@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c21
--- Comment #22 from Stephan Kulow (coolo@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c22
--- Comment #23 from Suresh Jayaraman (sjayaraman@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c23
This is a summary of the different cases from all 3 bugs:
mode:0x20     scsi_setup_fs_cmnd
mode:0x20     packet_rcv_spkt
mode:0x10020  e1000_alloc_rx_buffers
mode:0x10020  mempool_alloc/__sg_alloc_table
mode:0x10000  mempool_alloc/bvec_alloc_bs
mode:0x10080  mempool_alloc/bvec_alloc_bs
mode:0x10000  mempool_alloc/rpc_malloc
The SCSI one is found in all 3 bug reports, so that seems to be fairly common. And the mempool one is common as well.
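As a reading aid for the mode values in this summary, the bits can be decoded against the GFP flag definitions. A sketch in plain Python; the bit values below are my transcription of 2.6.27's include/linux/gfp.h and should be verified against the exact tree in question:

```python
# GFP flag bits as defined in 2.6.27 include/linux/gfp.h (an assumption;
# verify against the exact kernel tree before relying on this).
GFP_BITS = {
    0x01: "__GFP_DMA",      0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",     0x40: "__GFP_IO",
    0x80: "__GFP_FS",       0x200: "__GFP_NOWARN",
    0x10000: "__GFP_NOMEMALLOC",
}

def decode_gfp(mode):
    """List the names of the flag bits set in an allocation mode value."""
    return [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]

print(decode_gfp(0x20))     # GFP_ATOMIC == __GFP_HIGH: the SCSI case
print(decode_gfp(0x10020))  # atomic, and not allowed to dip into reserves
print(decode_gfp(0x10000))  # __GFP_NOMEMALLOC alone: the mempool cases
```

Note that none of the decoded modes contain __GFP_WAIT, which matches the later observation in this thread that these are all non-waiting allocations.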
There are several questions:
1) Why did these allocation failures start appearing in recent 11.1/SLES11 kernels?
2) Why is the __GFP_NOWARN flag not being propagated from mempool_alloc() down to __alloc_pages_internal()?
3) Why did the notice in patches.fixes/oom-warning get reverted?
It was unintentional. Since the patch was not present in either upstream or -mm, I thought it was just a bunch of printk's that were no longer required. Sorry about that.
For 3) the swap over NFS patches are responsible, and it might be that they are for the others as well, since they do touch these code paths.
One way to narrow down is that if this is not reproducible in beta1 or beta2 and reproducible only starting from beta3 (when swap over nfs went in), it's possible that it was introduced by SoN patches.
Suresh, Neil, do you have any ideas?
--- Comment #24 from Suresh Jayaraman (sjayaraman@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c24
--- Comment #25 from Miklos Szeredi (mszeredi@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c25
One way to narrow down is that if this is not reproducible in beta1 or beta2 and reproducible only starting from beta3 (when swap over nfs went in), it's possible that it was introduced by SoN patches.
The earliest report is exactly SLES11-beta3, so I think it's a definite possibility that the SoN patches did something that made the atomic allocations less likely to succeed. I'll look through the patches with this perspective.
--- Comment #26 from Miklos Szeredi (mszeredi@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c26
--- Comment #27 from Andreas Jaeger (aj@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c27
From dmesg:
The following is only an harmless informational message. Unless you get a _continuous_flood_ of these messages it means everything is working fine. Allocations from irqs cannot be perfectly reliable and the kernel is designed to handle that. as: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x400000 Pid: 16579, comm: as Tainted: G 2.6.27.7-HEAD_20081126094648-default #1 Call Trace: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58 [<ffffffff804a8d37>] dump_stack+0x69/0x6f [<ffffffff8028ee09>] __alloc_pages_internal+0x422/0x442 [<ffffffff802b404c>] kmem_getpages+0x6f/0x12a [<ffffffff802b49af>] fallback_alloc+0x15a/0x20a [<ffffffff802b4ea6>] kmem_cache_alloc+0x12a/0x15f [<ffffffffa0021de4>] scsi_pool_alloc_command+0x14/0x5b [scsi_mod] [<ffffffffa0021e3d>] scsi_host_alloc_command+0x12/0x56 [scsi_mod] [<ffffffffa0021f36>] __scsi_get_command+0xc/0x7a [scsi_mod] [<ffffffffa0021fd4>] scsi_get_command+0x30/0x98 [scsi_mod] [<ffffffffa0027815>] scsi_setup_fs_cmnd+0x69/0xb8 [scsi_mod] [<ffffffffa0126155>] sd_prep_fn+0x65/0x860 [sd_mod] [<ffffffff80345d41>] elv_next_request+0x153/0x20c [<ffffffffa0026c6a>] scsi_request_fn+0x88/0x52b [scsi_mod] [<ffffffff80345fdf>] elv_insert+0x1e5/0x2b2 [<ffffffff80348b1b>] __make_request+0x40e/0x48d [<ffffffff803471a8>] generic_make_request+0x399/0x3dc [<ffffffff803472a8>] submit_bio+0xbd/0xc4 [<ffffffff802e471e>] mpage_bio_submit+0x22/0x26 [<ffffffff802e523c>] mpage_readpages+0xe1/0xf5 [<ffffffff80290904>] __do_page_cache_readahead+0xff/0x18c [<ffffffff802893a1>] filemap_fault+0x162/0x326 [<ffffffff80298294>] __do_fault+0x56/0x450 [<ffffffff8029acd8>] handle_mm_fault+0x391/0x53d [<ffffffff804ad408>] do_page_fault+0x1ef/0x5f0 [<ffffffff804ab10a>] error_exit+0x0/0x70 DWARF2 unwinder stuck at error_exit+0x0/0x70 Leftover inexact backtrace: Mem-Info: Node 0 DMA per-cpu: CPU 0: hi: 0, btch: 1 usd: 0 CPU 1: hi: 0, btch: 1 usd: 0 CPU 2: hi: 0, btch: 1 usd: 0 CPU 3: hi: 0, btch: 1 usd: 0 Node 0 DMA32 per-cpu: CPU 0: hi: 186, 
btch: 31 usd: 173 CPU 1: hi: 186, btch: 31 usd: 175 CPU 2: hi: 186, btch: 31 usd: 187 CPU 3: hi: 186, btch: 31 usd: 180 Node 0 Normal per-cpu: CPU 0: hi: 186, btch: 31 usd: 133 CPU 1: hi: 186, btch: 31 usd: 147 CPU 2: hi: 186, btch: 31 usd: 103 CPU 3: hi: 186, btch: 31 usd: 151 Active:176859 inactive:7386 dirty:2525 writeback:154 unstable:0 free:3036 slab:815852 mapped:8748 pagetables:1658 bounce:0 Node 0 DMA free:6996kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:5016kB pages_scanned:0 all_unreclaimable? yes lowmem_reserve[]: 0 3376 3943 3943 Node 0 DMA32 free:4764kB min:6872kB low:8588kB high:10308kB active:405576kB inactive:13408kB present:3457216kB pages_scanned:771 all_unreclaimable? no lowmem_reserve[]: 0 0 567 567 Node 0 Normal free:384kB min:1152kB low:1440kB high:1728kB active:301860kB inactive:16136kB present:580608kB pages_scanned:108731 all_unreclaimable? no lowmem_reserve[]: 0 0 0 0 Node 0 DMA: 5*4kB 6*8kB 3*16kB 3*32kB 4*64kB 3*128kB 2*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 6996kB Node 0 DMA32: 376*4kB 3*8kB 1*16kB 2*32kB 0*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 4808kB Node 0 Normal: 1*4kB 1*8kB 1*16kB 1*32kB 0*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 316kB 19604 total pagecache pages 0 pages in swap cache Swap cache stats: add 33, delete 33, find 0/0 Free swap = 2104356kB Total swap = 2104472kB 1032176 pages RAM 18897 pages reserved 39002 pages shared 987947 pages non-shared The following is only an harmless informational message. Unless you get a _continuous_flood_ of these messages it means everything is working fine. Allocations from irqs cannot be perfectly reliable and the kernel is designed to handle that. umount.nfs: page allocation failure. 
order:0, mode:0x20, alloc_flags:0x7, pflags:0x400100 Pid: 16607, comm: umount.nfs Tainted: G 2.6.27.7-HEAD_20081126094648-default #1 Call Trace: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58 [<ffffffff804a8d37>] dump_stack+0x69/0x6f [<ffffffff8028ee09>] __alloc_pages_internal+0x422/0x442 [<ffffffff802b404c>] kmem_getpages+0x6f/0x12a [<ffffffff802b49af>] fallback_alloc+0x15a/0x20a [<ffffffff802b4ea6>] kmem_cache_alloc+0x12a/0x15f [<ffffffff8045fda3>] inet_twsk_alloc+0x25/0xdc [<ffffffff80473d25>] tcp_time_wait+0x59/0x1b6 [<ffffffff80468d67>] tcp_fin+0x7a/0x183 [<ffffffff804698b4>] tcp_data_queue+0x30f/0xbf8 [<ffffffff8046be3f>] tcp_rcv_state_process+0x67f/0x81c [<ffffffff80471f16>] tcp_v4_do_rcv+0xaf/0xfb [<ffffffff80472420>] tcp_v4_rcv+0x4be/0x740 [<ffffffff804587cb>] ip_local_deliver_finish+0x11c/0x1ee [<ffffffff80458423>] ip_rcv_finish+0x30b/0x325 [<ffffffff80437ae6>] netif_receive_skb+0x485/0x4df [<ffffffffa021fbba>] tg3_rx+0x3fc/0x51f [tg3] [<ffffffffa021fd95>] tg3_poll_work+0xb8/0xc6 [tg3] [<ffffffffa021fdcd>] tg3_poll+0x2a/0x1ae [tg3] [<ffffffff80435e99>] net_rx_action+0xb5/0x208 [<ffffffff8024696d>] __do_softirq+0x84/0x115 [<ffffffff8020ddac>] call_softirq+0x1c/0x28 DWARF2 unwinder stuck at call_softirq+0x1c/0x28 Leftover inexact backtrace: <IRQ> [<ffffffff8020f177>] do_softirq+0x3c/0x81 [<ffffffff80246684>] irq_exit+0x3f/0x83 [<ffffffff8020f464>] do_IRQ+0xbd/0xda [<ffffffff8020caae>] ret_from_intr+0x0/0x29 <EOI> [<ffffffff802927d0>] shrink_active_list+0x417/0x4e2 [<ffffffff802927ca>] shrink_active_list+0x411/0x4e2 [<ffffffff80293502>] shrink_zone+0xe5/0x126 [<ffffffff804aad51>] _spin_lock_irqsave+0x2e/0x35 [<ffffffff80293873>] shrink_zones+0xe2/0x119 [<ffffffff802944dd>] do_try_to_free_pages+0x150/0x2a3 [<ffffffff80294714>] try_to_free_pages+0x60/0x65 [<ffffffff8029222c>] isolate_pages_global+0x0/0x2f [<ffffffff8028ec40>] __alloc_pages_internal+0x259/0x442 [<ffffffff802b404c>] kmem_getpages+0x6f/0x12a [<ffffffff802b49af>] fallback_alloc+0x15a/0x20a 
[<ffffffff802b4ea6>] kmem_cache_alloc+0x12a/0x15f [<ffffffff8042e8a7>] sk_prot_alloc+0x24/0xa6 [<ffffffff8042e946>] sk_alloc+0x1d/0x53 [<ffffffff8047dcf6>] inet_create+0x165/0x29e [<ffffffff8042bf49>] __sock_create+0x14a/0x1df [<ffffffff8042c263>] sys_socket+0x24/0x56 [<ffffffff802b9983>] sys_close+0xad/0xfb [<ffffffff8020c3fa>] system_call_fastpath+0x16/0x1b Mem-Info: Node 0 DMA per-cpu: CPU 0: hi: 0, btch: 1 usd: 0 CPU 1: hi: 0, btch: 1 usd: 0 CPU 2: hi: 0, btch: 1 usd: 0 CPU 3: hi: 0, btch: 1 usd: 0 Node 0 DMA32 per-cpu: CPU 0: hi: 186, btch: 31 usd: 164 CPU 1: hi: 186, btch: 31 usd: 160 CPU 2: hi: 186, btch: 31 usd: 35 CPU 3: hi: 186, btch: 31 usd: 0 Node 0 Normal per-cpu: CPU 0: hi: 186, btch: 31 usd: 141 CPU 1: hi: 186, btch: 31 usd: 173 CPU 2: hi: 186, btch: 31 usd: 152 CPU 3: hi: 186, btch: 31 usd: 58 Active:169104 inactive:4596 dirty:0 writeback:10 unstable:0 free:3090 slab:826318 mapped:2370 pagetables:1893 bounce:0 Node 0 DMA free:6996kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:5016kB pages_scanned:0 all_unreclaimable? yes lowmem_reserve[]: 0 3376 3943 3943 Node 0 DMA32 free:4944kB min:6872kB low:8588kB high:10308kB active:382868kB inactive:11540kB present:3457216kB pages_scanned:96 all_unreclaimable? no lowmem_reserve[]: 0 0 567 567 Node 0 Normal free:420kB min:1152kB low:1440kB high:1728kB active:293768kB inactive:6844kB present:580608kB pages_scanned:5864 all_unreclaimable? 
no lowmem_reserve[]: 0 0 0 0 Node 0 DMA: 5*4kB 6*8kB 3*16kB 3*32kB 4*64kB 3*128kB 2*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 6996kB Node 0 DMA32: 319*4kB 4*8kB 2*16kB 9*32kB 1*64kB 4*128kB 5*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 5020kB Node 0 Normal: 0*4kB 1*8kB 4*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 424kB 19372 total pagecache pages 10451 pages in swap cache Swap cache stats: add 165399, delete 154948, find 4029/9239 Free swap = 1601264kB Total swap = 2104472kB 1032176 pages RAM 18897 pages reserved 30040 pages shared 991819 pages non-shared TCP: time wait bucket table overflow The following is only an harmless informational message. Unless you get a _continuous_flood_ of these messages it means everything is working fine. Allocations from irqs cannot be perfectly reliable and the kernel is designed to handle that. swapper: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x10200042 Pid: 0, comm: swapper Tainted: G 2.6.27.7-HEAD_20081126094648-default #1 Call Trace: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58 [<ffffffff804a8d37>] dump_stack+0x69/0x6f [<ffffffff8028ee09>] __alloc_pages_internal+0x422/0x442 [<ffffffff802b404c>] kmem_getpages+0x6f/0x12a [<ffffffff802b49af>] fallback_alloc+0x15a/0x20a [<ffffffff802b4ea6>] kmem_cache_alloc+0x12a/0x15f [<ffffffff8043b83f>] dst_alloc+0x29/0x6e [<ffffffffa0325428>] icmp6_dst_alloc+0x44/0x165 [ipv6] [<ffffffffa0328d4f>] __ndisc_send+0x69/0x45a [ipv6] [<ffffffffa0329251>] ndisc_send_na+0x111/0x120 [ipv6] [<ffffffffa03296cb>] ndisc_recv_ns+0x433/0x496 [ipv6] [<ffffffffa032ac15>] ndisc_rcv+0x8d/0xbc [ipv6] [<ffffffffa032f7a3>] icmpv6_rcv+0x494/0x568 [ipv6] [<ffffffffa031bb87>] ip6_input_finish+0x1d7/0x367 [ipv6] [<ffffffff80437ae6>] netif_receive_skb+0x485/0x4df [<ffffffffa021fbba>] tg3_rx+0x3fc/0x51f [tg3] [<ffffffffa021fd95>] tg3_poll_work+0xb8/0xc6 [tg3] [<ffffffffa021fdcd>] tg3_poll+0x2a/0x1ae [tg3] [<ffffffff80435e99>] net_rx_action+0xb5/0x208 
[<ffffffff8024696d>] __do_softirq+0x84/0x115 [<ffffffff8020ddac>] call_softirq+0x1c/0x28 DWARF2 unwinder stuck at call_softirq+0x1c/0x28 Leftover inexact backtrace: <IRQ> [<ffffffff8020f177>] do_softirq+0x3c/0x81 [<ffffffff80246684>] irq_exit+0x3f/0x83 [<ffffffff8020f464>] do_IRQ+0xbd/0xda [<ffffffff8020caae>] ret_from_intr+0x0/0x29 <EOI> [<ffffffff80221f1d>] native_safe_halt+0x2/0x3 [<ffffffff80221f1d>] native_safe_halt+0x2/0x3 [<ffffffff80213575>] default_idle+0x38/0x54 [<ffffffff8020b3b5>] cpu_idle+0xa9/0xf1 Mem-Info: Node 0 DMA per-cpu: CPU 0: hi: 0, btch: 1 usd: 0 CPU 1: hi: 0, btch: 1 usd: 0 CPU 2: hi: 0, btch: 1 usd: 0 CPU 3: hi: 0, btch: 1 usd: 0 Node 0 DMA32 per-cpu: CPU 0: hi: 186, btch: 31 usd: 18 CPU 1: hi: 186, btch: 31 usd: 17 CPU 2: hi: 186, btch: 31 usd: 30 CPU 3: hi: 186, btch: 31 usd: 0 Node 0 Normal per-cpu: CPU 0: hi: 186, btch: 31 usd: 78 CPU 1: hi: 186, btch: 31 usd: 37 CPU 2: hi: 186, btch: 31 usd: 30 CPU 3: hi: 186, btch: 31 usd: 0 Active:169272 inactive:4487 dirty:0 writeback:0 unstable:0 free:2996 slab:827401 mapped:2272 pagetables:1860 bounce:0 Node 0 DMA free:6996kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:5016kB pages_scanned:0 all_unreclaimable? yes lowmem_reserve[]: 0 3376 3943 3943 Node 0 DMA32 free:4732kB min:6872kB low:8588kB high:10308kB active:378468kB inactive:13992kB present:3457216kB pages_scanned:23812 all_unreclaimable? no lowmem_reserve[]: 0 0 567 567 Node 0 Normal free:256kB min:1152kB low:1440kB high:1728kB active:298620kB inactive:3956kB present:580608kB pages_scanned:19243 all_unreclaimable? 
no lowmem_reserve[]: 0 0 0 0 Node 0 DMA: 5*4kB 6*8kB 3*16kB 3*32kB 4*64kB 3*128kB 2*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 6996kB Node 0 DMA32: 371*4kB 4*8kB 4*16kB 8*32kB 1*64kB 0*128kB 5*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4716kB Node 0 Normal: 20*4kB 8*8kB 4*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 272kB 19387 total pagecache pages 10730 pages in swap cache Swap cache stats: add 188295, delete 177565, find 5964/13331 Free swap = 1572932kB Total swap = 2104472kB 1032176 pages RAM 18897 pages reserved 30550 pages shared 990625 pages non-shared The following is only an harmless informational message. Unless you get a _continuous_flood_ of these messages it means everything is working fine. Allocations from irqs cannot be perfectly reliable and the kernel is designed to handle that. cc1: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x402000 Pid: 16566, comm: cc1 Tainted: G 2.6.27.7-HEAD_20081126094648-default #1 Call Trace: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58 [<ffffffff804a8d37>] dump_stack+0x69/0x6f [<ffffffff8028ee09>] __alloc_pages_internal+0x422/0x442 [<ffffffff802b404c>] kmem_getpages+0x6f/0x12a [<ffffffff802b49af>] fallback_alloc+0x15a/0x20a [<ffffffff802b4ea6>] kmem_cache_alloc+0x12a/0x15f [<ffffffff8043b83f>] dst_alloc+0x29/0x6e [<ffffffffa0325428>] icmp6_dst_alloc+0x44/0x165 [ipv6] [<ffffffffa0328d4f>] __ndisc_send+0x69/0x45a [ipv6] [<ffffffffa0329251>] ndisc_send_na+0x111/0x120 [ipv6] [<ffffffffa03296cb>] ndisc_recv_ns+0x433/0x496 [ipv6] [<ffffffffa032ac15>] ndisc_rcv+0x8d/0xbc [ipv6] [<ffffffffa032f7a3>] icmpv6_rcv+0x494/0x568 [ipv6] [<ffffffffa031bb87>] ip6_input_finish+0x1d7/0x367 [ipv6] [<ffffffff80437ae6>] netif_receive_skb+0x485/0x4df [<ffffffffa021fbba>] tg3_rx+0x3fc/0x51f [tg3] [<ffffffffa021fd95>] tg3_poll_work+0xb8/0xc6 [tg3] [<ffffffffa021fdcd>] tg3_poll+0x2a/0x1ae [tg3] [<ffffffff80435e99>] net_rx_action+0xb5/0x208 [<ffffffff8024696d>] __do_softirq+0x84/0x115 
[<ffffffff8020ddac>] call_softirq+0x1c/0x28 DWARF2 unwinder stuck at call_softirq+0x1c/0x28 Leftover inexact backtrace: <IRQ> [<ffffffff8020f177>] do_softirq+0x3c/0x81 [<ffffffff80246684>] irq_exit+0x3f/0x83 [<ffffffff8020f464>] do_IRQ+0xbd/0xda [<ffffffff8020caae>] ret_from_intr+0x0/0x29 <EOI> [<ffffffff80292875>] shrink_active_list+0x4bc/0x4e2 [<ffffffff8029286f>] shrink_active_list+0x4b6/0x4e2 [<ffffffff80293502>] shrink_zone+0xe5/0x126 [<ffffffff80293873>] shrink_zones+0xe2/0x119 [<ffffffff802944dd>] do_try_to_free_pages+0x150/0x2a3 [<ffffffff80294714>] try_to_free_pages+0x60/0x65 [<ffffffff8029222c>] isolate_pages_global+0x0/0x2f [<ffffffff8028ec40>] __alloc_pages_internal+0x259/0x442 [<ffffffff802a622e>] read_swap_cache_async+0x58/0xcb [<ffffffff802a62f8>] swapin_readahead+0x57/0x98 [<ffffffff80298e28>] do_swap_page+0xf6/0x502 [<ffffffff802771cd>] res_counter_charge+0x4b/0x52 [<ffffffff804aad51>] _spin_lock_irqsave+0x2e/0x35 [<ffffffff8029acfc>] handle_mm_fault+0x3b5/0x53d [<ffffffff804aae12>] _spin_lock+0x13/0x15 [<ffffffff8029820b>] do_anonymous_page+0x211/0x244 [<ffffffff804aad51>] _spin_lock_irqsave+0x2e/0x35 [<ffffffff804ad408>] do_page_fault+0x1ef/0x5f0 [<ffffffff802a092e>] mmap_region+0x3fa/0x4fc [<ffffffff802a0d3d>] do_mmap_pgoff+0x30d/0x370 [<ffffffff804aad51>] _spin_lock_irqsave+0x2e/0x35 [<ffffffff803711dc>] __up_write+0x21/0x10f [<ffffffff804ab10a>] error_exit+0x0/0x70 Mem-Info: Node 0 DMA per-cpu: CPU 0: hi: 0, btch: 1 usd: 0 CPU 1: hi: 0, btch: 1 usd: 0 CPU 2: hi: 0, btch: 1 usd: 0 CPU 3: hi: 0, btch: 1 usd: 0 Node 0 DMA32 per-cpu: CPU 0: hi: 186, btch: 31 usd: 44 CPU 1: hi: 186, btch: 31 usd: 74 CPU 2: hi: 186, btch: 31 usd: 38 CPU 3: hi: 186, btch: 31 usd: 110 Node 0 Normal per-cpu: CPU 0: hi: 186, btch: 31 usd: 105 CPU 1: hi: 186, btch: 31 usd: 133 CPU 2: hi: 186, btch: 31 usd: 47 CPU 3: hi: 186, btch: 31 usd: 49 Active:168468 inactive:3580 dirty:0 writeback:1 unstable:0 free:3048 slab:828659 mapped:2280 pagetables:1860 bounce:0 Node 0 DMA 
free:6996kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:5016kB pages_scanned:0 all_unreclaimable? yes lowmem_reserve[]: 0 3376 3943 3943
Node 0 DMA32 free:4800kB min:6872kB low:8588kB high:10308kB active:377856kB inactive:9488kB present:3457216kB pages_scanned:127384 all_unreclaimable? no lowmem_reserve[]: 0 0 567 567
Node 0 Normal free:396kB min:1152kB low:1440kB high:1728kB active:296016kB inactive:4832kB present:580608kB pages_scanned:97819 all_unreclaimable? no lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 5*4kB 6*8kB 3*16kB 3*32kB 4*64kB 3*128kB 2*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 6996kB
Node 0 DMA32: 322*4kB 4*8kB 4*16kB 8*32kB 1*64kB 0*128kB 5*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4520kB
Node 0 Normal: 19*4kB 8*8kB 4*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 268kB
17803 total pagecache pages
9242 pages in swap cache
Swap cache stats: add 188657, delete 179415, find 5968/13378
Free swap = 1572624kB
Total swap = 2104472kB
1032176 pages RAM
18897 pages reserved
30633 pages shared
990307 pages non-shared
--- Comment #29 from Dan Elder (delder@novacoast.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c29
--- Comment #30 from Dan Elder (delder@novacoast.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c30
--- Comment #31 from Andreas Jaeger (aj@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c31
--- Comment #32 from Hannes Reinecke (hare@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c32
--- Comment #33 from Miklos Szeredi (mszeredi@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c33
This is definitely still present in beta 6, and sometimes when it happens RAM just disappears from my system. I have 4 GB of RAM and nothing but a GNOME desktop with Firefox running, and the system is out of memory. Interestingly, when it started having problems the system was basically locked up in I/O until I was very slowly able to kill -9 firefox. I'm not sure what all the I/O was for, as I didn't have any swap at that time. After creating a 1 GB swap file the system immediately dug into it, although again I don't know what for:
This is interesting. The slabinfo might contain a few clues, the top three slab users are:
size-128 4899 8529210 128 30 1 : tunables 120 60 8 : slabdata 190 284307 0
(1110M total)
Acpi-ParseExt 3 14115543 72 53 1 : tunables 120 60 8 : slabdata 1 266331
(1040M total)
revoke_record 0 12968928 32 112 1 : tunables 120 60 8 : slabdata 0 115794 0
(452M total)

None of them are actually used much at the moment (active_objs vs. num_objs), and the VM hasn't yet gotten around to reclaiming those slab pages. But at some point there must have been a huge number of allocations of these slab objects, which doesn't really look normal. We can't tell what the size-128 allocations are for. Acpi-ParseExt: Thomas, any idea what this is? revoke_record is some JBD thing.
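The megabyte totals quoted in parentheses above can be reproduced from the slabdata columns (num_slabs and pagesperslab). A quick sketch in plain Python, assuming the x86-64 base page size of 4 kB:

```python
PAGE_KB = 4  # x86-64 base page size in kB (assumption)

def slab_footprint_mb(num_slabs, pages_per_slab):
    """Memory pinned by a slab cache in MiB: num_slabs * pagesperslab pages."""
    return num_slabs * pages_per_slab * PAGE_KB / 1024

# num_slabs taken from the slabdata column of the three caches quoted above:
for name, num_slabs in [("size-128", 284307),
                        ("Acpi-ParseExt", 266331),
                        ("revoke_record", 115794)]:
    print(f"{name}: ~{slab_footprint_mb(num_slabs, 1):.0f} MiB")
```

This reproduces the (1110M), (1040M) and (452M) annotations above to within rounding, which suggests those totals were computed the same way.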
--- Comment #34 from Miklos Szeredi (mszeredi@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c34
--- Comment #36 from Andreas Jaeger (aj@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c36
--- Comment #37 from Jeff Mahoney (jeffm@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c37
--- Comment #38 from Andreas Jaeger (aj@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c38
--- Comment #39 from Andreas Jaeger (aj@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c39
--- Comment #40 from Neil Brown (nfbrown@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c40
--- Comment #41 from Neil Brown (nfbrown@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c41
--- Comment #42 from Miklos Szeredi (mszeredi@novell.com) --- https://bugzilla.novell.com/show_bug.cgi?id=444597#c42
still running - far longer than before. Miklos, I think you're right.
Thanks for testing.
I suggest getting this patch in and submitting a new kernel for 11.1 RC3 asap.
The problem with this is that while it might fix the allocation problems, it will probably cause swap over NFS to go into a deadlock once in a while. I still don't fully understand what's happening. My original analysis can't be right, because the number of allocated slab pages is limited by the number of threads doing the allocation. So the huge slab usage is still not fully explained.
User coolo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c43
--- Comment #43 from Stephan Kulow
User mszeredi@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c44
--- Comment #44 from Miklos Szeredi
User coolo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c45
Stephan Kulow
User coolo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c46
--- Comment #46 from Stephan Kulow
User mszeredi@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c49
--- Comment #49 from Miklos Szeredi
from reading the kernel mailing list, I figure we already have it. So it sounds like a good approach.
I've tried to revert the whole SoN patchset, but it's not that simple, because it's in the middle of series.conf and some other patches depend on it. In the end I reverted only the patches that are causing these bugs and the NFS swapfile implementation itself:
patches.suse/SoN-08-reserve-slub.patch
patches.suse/SoN-26-mm-swapfile.patch
patches.suse/SoN-27-mm-page_file_methods.patch
patches.suse/SoN-28-nfs-swapcache.patch
patches.suse/SoN-29-nfs-swapper.patch
patches.suse/SoN-30-nfs-swap_ops.patch
If this sounds acceptable I will commit this onto SL111_BRANCH.
User aj@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c51
--- Comment #51 from Andreas Jaeger
User sjayaraman@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c52
--- Comment #52 from Suresh Jayaraman
(In reply to comment #46 from Stephan Kulow)
from reading the kernel mailing list, I figure we already have it. So it sounds like a good approach.
I've tried to revert the whole SoN patchset, but it's not that simple, because it's in the middle of series.conf and some other patches depend on it.
There is a conflict with patches.xen/xen3-auto-common.diff, I think, and no other dependencies AFAICT.
In the end I reverted only the patches that are causing these bugs and the NFS swapfile implementation itself:
patches.suse/SoN-08-reserve-slub.patch
patches.suse/SoN-26-mm-swapfile.patch
patches.suse/SoN-27-mm-page_file_methods.patch
patches.suse/SoN-28-nfs-swapcache.patch
patches.suse/SoN-29-nfs-swapper.patch
patches.suse/SoN-30-nfs-swap_ops.patch
If this sounds acceptable I will commit this onto SL111_BRANCH.
If disabling the entire patchset is OK, I could do that for SL111, probably by adding a tag.
User sjayaraman@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c53
--- Comment #53 from Suresh Jayaraman
User aj@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c54
--- Comment #54 from Andreas Jaeger
User aj@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c55
--- Comment #55 from Andreas Jaeger
User mszeredi@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c56
--- Comment #56 from Miklos Szeredi
User aj@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c57
--- Comment #57 from Andreas Jaeger
User aj@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c58
--- Comment #58 from Andreas Jaeger
User sjayaraman@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c59
--- Comment #59 from Suresh Jayaraman
User mszeredi@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c60
--- Comment #60 from Miklos Szeredi
User sjayaraman@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c61
--- Comment #61 from Suresh Jayaraman
User aj@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c62
Andreas Jaeger
User jeffm@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c63
--- Comment #63 from Jeff Mahoney
User mszeredi@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c64
--- Comment #64 from Miklos Szeredi
User mszeredi@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c65
Miklos Szeredi
User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c66
LTC BugProxy
From the /var/log/messages output, it looks like call traces were generated against all the file systems (ext2, ext3, xfs, reiserfs).
Thanks...
Manas
=Comment: #3=================================================
Manas K. Nayak
From /var/log/messages it looks like call traces were generated only against the ext3 and xfs filesystems. Here are some call traces:
Call trace against XFS:
-----------------------------
Nov 20 01:31:12 mhs21a kernel: Free swap = 9221188kB
Nov 20 01:31:12 mhs21a kernel: Total swap = 9221300kB
Nov 20 01:31:12 mhs21a kernel: 1048576 pages RAM
Nov 20 01:31:12 mhs21a kernel: 67987 pages reserved
Nov 20 01:31:12 mhs21a kernel: 410164 pages shared
Nov 20 01:31:12 mhs21a kernel: 586441 pages non-shared
Nov 20 01:31:12 mhs21a kernel: ln: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x420000
Nov 20 01:31:12 mhs21a kernel: Pid: 6441, comm: ln Not tainted 2.6.27.5-2-default #1
Nov 20 01:31:12 mhs21a kernel:
Nov 20 01:31:12 mhs21a kernel: Call Trace:
Nov 20 01:31:12 mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
Nov 20 01:31:12 mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
Nov 20 01:31:12 mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
Nov 20 01:31:12 mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
Nov 20 01:31:12 mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
Nov 20 01:31:12 mhs21a kernel: [<ffffffff802b47b1>] kmem_cache_alloc+0x137/0x16c
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa002dde2>] scsi_pool_alloc_command+0x14/0x5b [scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa002de3b>] scsi_host_alloc_command+0x12/0x56 [scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa002df34>] __scsi_get_command+0xc/0x7a [scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa002dfd2>] scsi_get_command+0x30/0x96 [scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa003366f>] scsi_setup_fs_cmnd+0x69/0xb8 [scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa0184155>] sd_prep_fn+0x65/0x860 [sd_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffff80345483>] elv_next_request+0x153/0x20c
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa0032abe>] scsi_request_fn+0x88/0x52b [scsi_mod]
Nov 20 01:31:12 mhs21a kernel: [<ffffffff80345639>] elv_insert+0xfd/0x2b2
Nov 20 01:31:12 mhs21a kernel: [<ffffffff80348265>] __make_request+0x41a/0x499
Nov 20 01:31:12 mhs21a kernel: [<ffffffff803468f2>] generic_make_request+0x39f/0x3e2
Nov 20 01:31:12 mhs21a kernel: [<ffffffff803469f2>] submit_bio+0xbd/0xc4
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04df1f0>] _xfs_buf_ioapply+0x200/0x22b [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04dff39>] xfs_buf_iorequest+0x36/0x61 [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04cb03a>] xlog_bdstrat_cb+0x16/0x3c [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04c8fa6>] xlog_sync+0x24a/0x3ec [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04ca26f>] xlog_state_sync+0x1a7/0x2e8 [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04ca41b>] _xfs_log_force+0x6b/0x70 [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04ca42b>] xfs_log_force+0xb/0x2a [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04997a7>] xfs_alloc_ag_vextent+0x92/0xfd [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa0499f72>] xfs_alloc_vextent+0x2c1/0x3f7 [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04a69d2>] xfs_bmap_btalloc+0x75a/0x9a8 [xfs]
Nov 20 01:31:12 mhs21a kernel: [<ffffffffa04a98bc>] xfs_bmapi+0x8c9/0x10c0 [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04b26d1>] xfs_dir2_grow_inode+0xde/0x2ff [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04b3213>] xfs_dir2_sf_to_block+0x9c/0x53d [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04b9ed1>] xfs_dir2_sf_addname+0x197/0x2f1 [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04b3052>] xfs_dir_createname+0xdf/0x157 [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04d91a4>] xfs_symlink+0x69f/0x886 [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffffa04e3a06>] xfs_vn_symlink+0x6e/0xba [xfs]
Nov 20 01:31:13 mhs21a kernel: [<ffffffff802c2cab>] vfs_symlink+0x127/0x1a1
Nov 20 01:31:13 mhs21a kernel: [<ffffffff802c577e>] sys_symlinkat+0x80/0xd6
Nov 20 01:31:13 mhs21a kernel: [<ffffffff8020c3fa>] system_call_fastpath+0x16/0x1b
Nov 20 01:31:13 mhs21a kernel: [<00007f8c1c223f67>] 0x7f8c1c223f67
Call traces against EXT3:
--------------------------------
Nov 19 11:32:24 mhs21a kernel: Total swap = 9221300kB
Nov 19 11:32:24 mhs21a kernel: 1048576 pages RAM
Nov 19 11:32:24 mhs21a kernel: 67987 pages reserved
Nov 19 11:32:24 mhs21a kernel: 525953 pages shared
Nov 19 11:32:24 mhs21a kernel: 464543 pages non-shared
Nov 19 11:32:24 mhs21a kernel: Neighbour table overflow.
Nov 19 11:45:40 mhs21a kernel: cat: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x400000
Nov 19 11:45:40 mhs21a kernel: Pid: 22650, comm: cat Not tainted 2.6.27.5-2-default #1
Nov 19 11:45:40 mhs21a kernel:
Nov 19 11:45:40 mhs21a kernel: Call Trace:
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
Nov 19 11:45:40 mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802b3acc>] kmem_cache_alloc_node+0xdd/0x113
Nov 19 11:45:40 mhs21a kernel: [<ffffffff80349dba>] alloc_io_context+0x16/0x83
Nov 19 11:45:40 mhs21a kernel: [<ffffffff80349e42>] current_io_context+0x1b/0x29
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8034794c>] get_request+0x71/0x3be
Nov 19 11:45:40 mhs21a kernel: [<ffffffff80347cc8>] get_request_wait+0x2f/0x1b2
Nov 19 11:45:40 mhs21a kernel: [<ffffffff803481c8>] __make_request+0x37d/0x499
Nov 19 11:45:40 mhs21a kernel: [<ffffffff803468f2>] generic_make_request+0x39f/0x3e2
Nov 19 11:45:40 mhs21a kernel: [<ffffffff803469f2>] submit_bio+0xbd/0xc4
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802dd0cd>] submit_bh+0xde/0xfe
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802df968>] __block_write_full_page+0x1ca/0x2b2
Nov 19 11:45:40 mhs21a kernel: [<ffffffffa014904d>] ext3_ordered_writepage+0xc0/0x134 [ext3]
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8028ea83>] __writepage+0xa/0x25
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8028f383>] write_cache_pages+0x179/0x2c3
Nov 19 11:45:41 mhs21a kernel: [<ffffffff8028f510>] do_writepages+0x27/0x2d
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802d8ef9>] __sync_single_inode+0x72/0x259
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802d9228>] __writeback_single_inode+0x148/0x155
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802d96ca>] generic_sync_sb_inodes+0x290/0x3f4
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802d9b14>] writeback_inodes+0xa0/0x108
Nov 19 11:45:41 mhs21a kernel: [<ffffffff8028fc82>] balance_dirty_pages+0x133/0x2bd
Nov 19 11:45:41 mhs21a kernel: [<ffffffff80287bfa>] generic_perform_write+0x178/0x1a8
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802897cb>] generic_file_buffered_write+0x82/0x12c
Nov 19 11:45:41 mhs21a kernel: [<ffffffff80289d72>] __generic_file_aio_write_nolock+0x349/0x37d
Nov 19 11:45:41 mhs21a kernel: [<ffffffff8028a114>] generic_file_aio_write+0x64/0xc4
Nov 19 11:45:41 mhs21a kernel: [<ffffffffa01465e5>] ext3_file_write+0x16/0x95 [ext3]
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802ba8e1>] do_sync_write+0xce/0x113
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802bb1b6>] vfs_write+0xad/0x156
Nov 19 11:45:41 mhs21a kernel: [<ffffffff802bb31b>] sys_write+0x45/0x6e
Nov 19 11:45:41 mhs21a kernel: [<ffffffff8020c3fa>] system_call_fastpath+0x16/0x1b
Nov 19 11:45:41 mhs21a kernel: [<00007f19deda5950>] 0x7f19deda5950
Nov 19 11:45:41 mhs21a kernel:
Nov 19 11:45:41 mhs21a kernel: Mem-Info:
Nov 19 11:45:41 mhs21a kernel: Node 0 DMA per-cpu:
Nov 19 11:45:41 mhs21a kernel: CPU 0: hi: 0, btch: 1 usd: 0
Nov 19 11:45:42 mhs21a kernel: CPU 1: hi: 0, btch: 1 usd: 0
Nov 19 11:45:42 mhs21a kernel: CPU 2: hi: 0, btch: 1 usd: 0
Nov 19 11:45:42 mhs21a kernel: CPU 3: hi: 0, btch: 1 usd: 0
Comparing with the earlier reported call traces, they are similar only up to the following:
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
Nov 19 11:45:40 mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
Nov 19 11:45:40 mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
Nov 19 11:45:40 mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
For more information, please see the attached /var/log/messages and dmesg output from the fsracer test runs.
Thanks...
Manas
=Comment: #4=================================================
Manas K. Nayak
From the mainline code, I would have expected this warning to be suppressed, but something different might be happening in the SLES kernel. What's the best way to get a look at the SLES kernel source again? I forget :(
I'm skipping the second unique failure because it's like the one above, except that the callback is in a different order, which is probably just a mistake.
entry 2 count 1
===============================
mhs21a kernel: fsstress: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x400040
mhs21a kernel: Pid comm: fsstress Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b3acc>] kmem_cache_alloc_node+0xdd/0x113
mhs21a kernel: [<ffffffff80349dba>] alloc_io_context+0x16/0x83
mhs21a kernel: [<ffffffff80349e42>] current_io_context+0x1b/0x29
mhs21a kernel: [<ffffffff8034794c>] get_request+0x71/0x3be
mhs21a kernel: [<ffffffff80347cc8>] get_request_wait+0x2f/0x1b2
mhs21a kernel: [<ffffffff803481c8>] __make_request+0x37d/0x499
mhs21a kernel: [<ffffffff803468f2>] generic_make_request+0x39f/0x3e2
mhs21a kernel: [<ffffffff803469f2>] submit_bio+0xbd/0xc4
mhs21a kernel: [<ffffffff802e401a>] mpage_bio_submit+0x22/0x26
mhs21a kernel: [<ffffffff802e4073>] mpage_writepages+0x55/0x5d
mhs21a kernel: [<ffffffff8028f509>] do_writepages+0x20/0x2d
mhs21a kernel: [<ffffffff802d8ef9>] __sync_single_inode+0x72/0x259
mhs21a kernel: [<ffffffff802d9228>] __writeback_single_inode+0x148/0x155
mhs21a kernel: [<ffffffff802d96ca>] generic_sync_sb_inodes+0x290/0x3f4
mhs21a kernel: [<ffffffff802d98d0>] sync_inodes_sb+0x8a/0x8f
<SNIP>
Another atomic allocation. Same type of deal: we are below the watermarks and there is not much the allocator can do other than fail. This time it's get_request_wait() that has the necessary smarts to go onto a wait-queue and wait for IO to complete until current_io_context() returns something useful. Same as above, basically: you wait around a bit, but you don't die. Again, not sure why this warning is not suppressed - possibly an oversight.
Next useful one;
entry 4 count 1
===============================
mhs21a kernel: fsstress: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x420140
mhs21a kernel: Pid comm: fsstress Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b47b1>] kmem_cache_alloc+0x137/0x16c
mhs21a kernel: [<ffffffffa002dde2>] scsi_pool_alloc_command+0x14/0x5b [scsi_mod]
mhs21a kernel: [<ffffffffa002de3b>] scsi_host_alloc_command+0x12/0x56 [scsi_mod]
mhs21a kernel: [<ffffffffa002df34>] __scsi_get_command+0xc/0x7a [scsi_mod]
mhs21a kernel: [<ffffffffa002dfd2>] scsi_get_command+0x30/0x96 [scsi_mod]
mhs21a kernel: [<ffffffffa003366f>] scsi_setup_fs_cmnd+0x69/0xb8 [scsi_mod]
mhs21a kernel: [<ffffffffa0184155>] sd_prep_fn+0x65/0x860 [sd_mod]
mhs21a kernel: [<ffffffff80345483>] elv_next_request+0x153/0x20c
mhs21a kernel: [<ffffffffa0032abe>] scsi_request_fn+0x88/0x52b [scsi_mod]
mhs21a kernel: [<ffffffff80348265>] __make_request+0x41a/0x499
mhs21a kernel: [<ffffffff803468f2>] generic_make_request+0x39f/0x3e2
mhs21a kernel: [<ffffffff803469f2>] submit_bio+0xbd/0xc4
mhs21a kernel: [<ffffffffa046b1f0>] _xfs_buf_ioapply+0x200/0x22b [xfs]
mhs21a kernel: [<ffffffffa046bf39>] xfs_buf_iorequest+0x36/0x61 [xfs]
mhs21a kernel: [<ffffffffa045703a>] xlog_bdstrat_cb+0x16/0x3c [xfs]
mhs21a kernel: [<ffffffffa0454fa6>] xlog_sync+0x24a/0x3ec [xfs]
mhs21a kernel: [<ffffffffa04568f8>] xlog_write+0x336/0x4a1 [xfs]
Same atomic request. Same job with watermarks. scsi_setup_fs_cmnd() returns BLKPREP_DEFER in this case, and you get replugged later. Blah blah blah, delayed, but you don't die.
Entry 5 is the same (scsi_setup_fs_cmnd path) as entry 4, except kblockd hit it.
Entry 6 is the same (scsi_setup_fs_cmnd path) except XFS was the source this time; same delaying end result.
Entry 7 is the same as entry 6 except we took a slightly different path and ended up in the same place.
Entry 8 is scsi_setup_fs_cmnd through yet another path - direct IO this time.
Entry 9 is the same idea.
Oh, entry 10 is interesting:
entry 10 count 5
===============================
mhs21a kernel: dbench: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x402000
mhs21a kernel: Pid comm: dbench Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b47b1>] kmem_cache_alloc+0x137/0x16c
mhs21a kernel: [<ffffffff8024c652>] send_signal+0xed/0x240
mhs21a kernel: [<ffffffff8024d0dc>] group_send_sig_info+0x48/0x6f
mhs21a kernel: [<ffffffff8024d20c>] __kill_pgrp_info+0x42/0x67
mhs21a kernel: [<ffffffff8024d2ee>] kill_something_info+0x84/0xef
mhs21a kernel: [<ffffffff8024d3c3>] sys_kill+0x6a/0x76
mhs21a kernel: [<ffffffff8022cab5>] sysenter_dispatch+0x7/0x46
mhs21a kernel:
Atomic again, but this time we can't send a signal... Userspace gets EAGAIN, which it should handle.
Entry 11 is interesting as well
entry 11 count 1
===============================
mhs21a kernel: fsstress: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x400140
mhs21a kernel: Pid comm: fsstress Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b47b1>] kmem_cache_alloc+0x137/0x16c
mhs21a kernel: [<ffffffff8024c652>] send_signal+0xed/0x240
mhs21a kernel: [<ffffffff8024d0dc>] group_send_sig_info+0x48/0x6f
mhs21a kernel: [<ffffffff8024d134>] kill_pid_info+0x31/0x3b
mhs21a kernel: [<ffffffff8024553d>] it_real_fn+0x17/0x1e
mhs21a kernel: [<ffffffff80256d67>] run_hrtimer_pending+0x78/0x13a
mhs21a kernel: [<ffffffff8024659d>] __do_softirq+0x84/0x115
mhs21a kernel: [<ffffffff8020ddac>] call_softirq+0x1c/0x28
mhs21a kernel: [<ffffffff8020f177>] do_softirq+0x3c/0x81
mhs21a kernel: [<ffffffff802462b4>] irq_exit+0x3f/0x83
mhs21a kernel: [<ffffffff8021cce0>] smp_apic_timer_interrupt+0x92/0xaa
mhs21a kernel: [<ffffffff8020d523>] apic_timer_interrupt+0x83/0x90
mhs21a kernel: [<ffffffff80292ee2>] shrink_inactive_list+0x459/0x481
mhs21a kernel: [<ffffffff8029300d>] shrink_zone+0x103/0x126
mhs21a kernel: [<ffffffff80293360>] shrink_zones+0xe2/0x119
mhs21a kernel: [<ffffffff80293fca>] do_try_to_free_pages+0x150/0x2a3
mhs21a kernel: [<ffffffff80294201>] try_to_free_pages+0x60/0x65
mhs21a kernel: [<ffffffff8028e74c>] __alloc_pages_internal+0x265/0x40b
mhs21a kernel: [<ffffffff802894d2>] find_or_create_page+0x32/0x71
mhs21a kernel: [<ffffffffa046ba01>] _xfs_buf_lookup_pages+0x10b/0x2e0 [xfs]
mhs21a kernel: [<ffffffffa046c5fb>] xfs_buf_get_flags+0x6d/0x147 [xfs]
mhs21a kernel: [<ffffffffa046c6e7>] xfs_buf_read_flags+0x12/0x81 [xfs]
mhs21a kernel: [<ffffffffa0461acd>] xfs_trans_read_buf+0x47/0x2af [xfs]
mhs21a kernel: [<ffffffffa044f057>] xfs_imap_to_bp+0x3f/0xfa [xfs]
mhs21a kernel: [<ffffffffa044f24c>] xfs_itobp+0xa0/0xe7 [xfs]
mhs21a kernel: [<ffffffffa04512ce>] xfs_iread+0x79/0x1eb [xfs]
mhs21a kernel: [<ffffffffa044c8e8>] xfs_iget_core+0x3b1/0x685 [xfs]
mhs21a kernel: [<ffffffffa044cc9e>] xfs_iget+0xe2/0x188 [xfs]
mhs21a kernel: [<ffffffffa0466734>] xfs_lookup+0x79/0xa5 [xfs]
mhs21a kernel: [<ffffffffa046f3d7>] xfs_vn_lookup+0x3c/0x78 [xfs]
Might have lost a clock event there, don't know for sure.
Entry 12 similar to 11.
Entry 13 similar to 11.
Entry 14 similar to 11.
Entry 15 similar to 11.
Entry 16 is in the network subsystem. It handles the failure, but I'm not sure what the consequences are. I would guess more delays, but that's based more on the belief that this sort of failure is handled everywhere else than on evidence. Maybe packets get dropped and resent later.
Entry 17 is a similar idea to entry 16.
entry 18 count 3
===============================
mhs21a kernel: fsstress: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x400140
mhs21a kernel: Pid comm: fsstress Not tainted 2.6.27.5-2-default #1
mhs21a kernel:
mhs21a kernel: Call Trace:
mhs21a kernel: [<ffffffff8020e53e>] show_trace_log_lvl+0x41/0x58
mhs21a kernel: [<ffffffff804a8247>] dump_stack+0x69/0x6f
mhs21a kernel: [<ffffffff8028e8d2>] __alloc_pages_internal+0x3eb/0x40b
mhs21a kernel: [<ffffffff802b3934>] kmem_getpages+0x6f/0x12a
mhs21a kernel: [<ffffffff802b42a2>] fallback_alloc+0x15a/0x20a
mhs21a kernel: [<ffffffff802b44be>] __kmalloc+0x16c/0x1a1
mhs21a kernel: [<ffffffffa04ae70d>] reiserfs_get_block+0xca1/0xf0e [reiserfs]
mhs21a kernel: [<ffffffffa04ae9bb>] reiserfs_get_blocks_direct_io+0x41/0x95 [reiserfs]
mhs21a kernel: [<ffffffff802e35a7>] do_direct_IO+0x147/0x369
mhs21a kernel: [<ffffffff802e3a81>] direct_io_worker+0x174/0x309
mhs21a kernel: [<ffffffff802e3e87>] __blockdev_direct_IO+0x271/0x2c3
mhs21a kernel: [<ffffffffa04aae27>] reiserfs_direct_IO+0x4c/0x51 [reiserfs]
mhs21a kernel: [<ffffffff80289976>] generic_file_direct_write+0x101/0x1b4
mhs21a kernel: [<ffffffff80289cbe>] __generic_file_aio_write_nolock+0x295/0x37d
mhs21a kernel: [<ffffffff8028a114>] generic_file_aio_write+0x64/0xc4
mhs21a kernel: [<ffffffff802ba8e1>] do_sync_write+0xce/0x113
mhs21a kernel: [<ffffffff802bb1b6>] vfs_write+0xad/0x156
mhs21a kernel: [<ffffffff802bb31b>] sys_write+0x45/0x6e
mhs21a kernel: [<ffffffff8022cab5>] sysenter_dispatch+0x7/0x46
mhs21a kernel: [<00000000ffffe430>] 0xffffe430
Ran out of time looking at this one. I suspect userspace gets a 0 and tries again, but I don't know for 100% sure.
Bottom line: I suspect most if not all of these are harmless (unless you view delays as not harmless), but they are alarming if you were reading the logs. We should probably get a filesystems expert to double-check this analysis, to be sure these really are harmless and to decide whether the warnings should be suppressed.
=Comment: #18=================================================
Richard A. Lary
User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c67
--- Comment #67 from LTC BugProxy
User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c68
--- Comment #68 from LTC BugProxy
User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c69
--- Comment #69 from LTC BugProxy
User coolo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c70
Stephan Kulow
User mszeredi@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c71
--- Comment #71 from Miklos Szeredi
User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c72
--- Comment #72 from LTC BugProxy
User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c73
Greg Kroah-Hartman
User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c74
--- Comment #74 from LTC BugProxy
User bugproxy@us.ibm.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c75
--- Comment #75 from LTC BugProxy
User sjayaraman@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c76
Suresh Jayaraman
User swamp@suse.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c78
Swamp Script User
User swamp@suse.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=444597#c79
Swamp Script User