[opensuse-kernel] Constant crash under a 12.1-default kernel
I get several of these per day:

[867184.522832] The following is only an harmless informational message.
[867184.522841] Unless you get a _continuous_flood_ of these messages it means
[867184.522846] everything is working fine. Allocations from irqs cannot be
[867184.522850] perfectly reliable and the kernel is designed to handle that.
[867184.522857] kworker/0:1: page allocation failure. order:1, mode:0x20
[867184.522865] Pid: 0, comm: kworker/0:1 Not tainted 3.1.10-1.16-default #1
[867184.522870] Call Trace:
[867184.522900] [<ffffffff810042fa>] dump_trace+0x9a/0x270
[867184.522915] [<ffffffff81527ce8>] dump_stack+0x69/0x6f
[867184.522929] [<ffffffff810f5d6e>] warn_alloc_failed+0xfe/0x1b0
[867184.522944] [<ffffffff810f9e96>] __alloc_pages_nodemask+0x6b6/0x900
[867184.522959] [<ffffffff81136a68>] kmem_getpages+0x48/0x150
[867184.522971] [<ffffffff8113773d>] fallback_alloc+0x17d/0x230
[867184.522984] [<ffffffff81137fa3>] kmem_cache_alloc+0x143/0x1a0
[867184.522999] [<ffffffff81415783>] sk_prot_alloc+0x43/0x200
[867184.523012] [<ffffffff8141650a>] sk_clone+0x1a/0x300
[867184.523023] [<ffffffff81461d50>] inet_csk_clone+0x10/0xc0
[867184.523037] [<ffffffff8147a211>] tcp_create_openreq_child+0x21/0x460
[867184.523050] [<ffffffff81478817>] tcp_v4_syn_recv_sock+0x47/0x280
[867184.523062] [<ffffffff8147adaf>] tcp_check_req+0x2ff/0x4c0
[867184.523075] [<ffffffff81477a8a>] tcp_v4_hnd_req+0x6a/0x130
[867184.523087] [<ffffffff81478513>] tcp_v4_do_rcv+0x123/0x1d0
[867184.523100] [<ffffffff81479cd9>] tcp_v4_rcv+0x559/0x810
[867184.523114] [<ffffffff814575ec>] ip_local_deliver_finish+0xcc/0x270
[867184.523125] [<ffffffff81426cb2>] __netif_receive_skb+0x482/0x520
[867184.523138] [<ffffffff814274d0>] netif_receive_skb+0x70/0x80
[867184.523149] [<ffffffff814275e8>] napi_skb_finish+0x38/0x50
[867184.523191] [<ffffffffa0215067>] e1000_clean_rx_irq+0x2a7/0x440 [e1000e]
[867184.523303] [<ffffffffa0217444>] e1000_clean+0x74/0x130 [e1000e]
[867184.523393] [<ffffffff81427cf9>] net_rx_action+0x119/0x260
[867184.523406] [<ffffffff8105927d>] __do_softirq+0x9d/0x1f0
[867184.523420] [<ffffffff8154a02c>] call_softirq+0x1c/0x30
[867184.523432] [<ffffffff81004175>] do_softirq+0x65/0xa0
[867184.523443] [<ffffffff81059656>] irq_exit+0x96/0xc0
[867184.523456] [<ffffffff81003d9c>] do_IRQ+0x5c/0xd0
[867184.523468] [<ffffffff815403ee>] common_interrupt+0x6e/0x6e
[867184.523485] [<ffffffff81029af2>] native_safe_halt+0x2/0x10
[867184.523497] [<ffffffff8100a21d>] default_idle+0x4d/0x1d0
[867184.523509] [<ffffffff8100a42b>] amd_e400_idle+0x8b/0x110
[867184.523522] [<ffffffff8100119f>] cpu_idle+0x7f/0xd0
[867184.523542] The following is only an harmless informational message.
[867184.523547] Unless you get a _continuous_flood_ of these messages it means
[867184.523552] everything is working fine. Allocations from irqs cannot be
[867184.523557] perfectly reliable and the kernel is designed to handle that.
[867184.523563] kworker/0:1: page allocation failure. order:1, mode:0x20
[867184.523571] Pid: 0, comm: kworker/0:1 Not tainted 3.1.10-1.16-default #1
[867184.523576] Call Trace:
[867184.523589] [<ffffffff810042fa>] dump_trace+0x9a/0x270
[867184.523601] [<ffffffff81527ce8>] dump_stack+0x69/0x6f
[867184.523612] [<ffffffff810f5d6e>] warn_alloc_failed+0xfe/0x1b0
[867184.523625] [<ffffffff810f9e96>] __alloc_pages_nodemask+0x6b6/0x900
[867184.523639] [<ffffffff81136a68>] kmem_getpages+0x48/0x150
[867184.523650] [<ffffffff8113773d>] fallback_alloc+0x17d/0x230
[867184.523664] [<ffffffff81137fa3>] kmem_cache_alloc+0x143/0x1a0
[867184.523677] [<ffffffff81415783>] sk_prot_alloc+0x43/0x200
[867184.523690] [<ffffffff8141650a>] sk_clone+0x1a/0x300
[867184.523701] [<ffffffff81461d50>] inet_csk_clone+0x10/0xc0
[867184.523713] [<ffffffff8147a211>] tcp_create_openreq_child+0x21/0x460
[867184.523727] [<ffffffff81478817>] tcp_v4_syn_recv_sock+0x47/0x280
[867184.523740] [<ffffffff8147adaf>] tcp_check_req+0x2ff/0x4c0
[867184.523753] [<ffffffff81477a8a>] tcp_v4_hnd_req+0x6a/0x130
[867184.523766] [<ffffffff81478513>] tcp_v4_do_rcv+0x123/0x1d0
[867184.523779] [<ffffffff81479cd9>] tcp_v4_rcv+0x559/0x810
[867184.523792] [<ffffffff814575ec>] ip_local_deliver_finish+0xcc/0x270
[867184.523803] [<ffffffff81426cb2>] __netif_receive_skb+0x482/0x520
[867184.523817] [<ffffffff814274d0>] netif_receive_skb+0x70/0x80
[867184.523828] [<ffffffff814275e8>] napi_skb_finish+0x38/0x50
[867184.523857] [<ffffffffa0215067>] e1000_clean_rx_irq+0x2a7/0x440 [e1000e]
[867184.523959] [<ffffffffa0217444>] e1000_clean+0x74/0x130 [e1000e]
[867184.524048] [<ffffffff81427cf9>] net_rx_action+0x119/0x260
[867184.524061] [<ffffffff8105927d>] __do_softirq+0x9d/0x1f0
[867184.524073] [<ffffffff8154a02c>] call_softirq+0x1c/0x30
[867184.524085] [<ffffffff81004175>] do_softirq+0x65/0xa0
[867184.524096] [<ffffffff81059656>] irq_exit+0x96/0xc0
[867184.524109] [<ffffffff81003d9c>] do_IRQ+0x5c/0xd0
[867184.524120] [<ffffffff815403ee>] common_interrupt+0x6e/0x6e
[867184.524135] [<ffffffff81029af2>] native_safe_halt+0x2/0x10
[867184.524146] [<ffffffff8100a21d>] default_idle+0x4d/0x1d0
[867184.524158] [<ffffffff8100a42b>] amd_e400_idle+0x8b/0x110
[867184.524170] [<ffffffff8100119f>] cpu_idle+0x7f/0xd0

What would be the best approach to start digging into these crashes? Unfortunately, this is on a production server and thus not that easy to stop or run in full debug mode.

Thanks for any input

--
Bruno Friedmann
Ioda-Net Sàrl www.ioda-net.ch
openSUSE Member & Ambassador
GPG KEY : D5C9B751C4653227
irc: tigerfoot

--
To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-kernel+owner@opensuse.org
On Tue 27-11-12 13:27:08, Bruno Friedmann wrote:
I get several of these per day
[...]
What would be the best approach to start digging into these crashes?
As the message suggests, this is not a crash. It is just a warning that something (the networking stack, in this case) tried to allocate a high-order page (order-1 here) from interrupt (i.e. ATOMIC) context, which means it cannot sleep to wait for pages to be freed and coalesced into sufficiently large high-order blocks. The message is triggered when the allocation fails, but the networking stack should be able to cope with that.

I am not very familiar with that area, so I might be wrong here, but it looks like it happened on the completed TCP handshake on the listener, when a full socket is allocated. I am not sure what the state of the connection is in this case, though. Jiri?

--
Michal Hocko
SUSE Labs
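[Editor's sketch, not part of the original thread: /proc/buddyinfo is the standard place to see how fragmented free memory is, i.e. whether order-1 blocks like the one that failed here are actually scarce. The sample line below is made up for illustration.]

```shell
# On a live system:  cat /proc/buddyinfo
# A sample line (illustrative values, not from this server):
sample='Node 0, zone   Normal   4  3  2  0  1  1  1  1  1  1  0'
# The columns after the zone name are free-block counts for order 0..10.
# An order-1 request (two contiguous pages) can be served from any
# column of order >= 1, so sum those:
echo "$sample" | awk '{s = 0; for (i = 6; i <= NF; i++) s += $i;
                       print "free blocks of order >= 1:", s}'
# -> free blocks of order >= 1: 11
# Raising vm.min_free_kbytes keeps a larger reserve for GFP_ATOMIC
# allocations; the value below is only an illustrative starting point:
#   sysctl -w vm.min_free_kbytes=65536    # needs root
```

If the order >= 1 columns of the Normal zone hover near zero while these warnings fire, fragmentation pressure is real rather than transient.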
On Tuesday 27 November 2012 16.19:51 Michal Hocko wrote:
On Tue 27-11-12 13:27:08, Bruno Friedmann wrote:
I get several of these per day
[...]
What would be the best approach to start digging into these crashes?
As the message suggests, this is not a crash. It is just a warning that something (the networking stack, in this case) tried to allocate a high-order page (order-1 here) from interrupt (i.e. ATOMIC) context, which means it cannot sleep to wait for pages to be freed and coalesced into sufficiently large high-order blocks. The message is triggered when the allocation fails, but the networking stack should be able to cope with that.

I am not very familiar with that area, so I might be wrong here, but it looks like it happened on the completed TCP handshake on the listener, when a full socket is allocated. I am not sure what the state of the connection is in this case, though. Jiri?

--
Michal Hocko
SUSE Labs
Thanks Michal. I've rechecked what happened during those time-frame windows.

The server uses a dual-port Intel PCIe NIC (Intel Corporation 82572EI), with the two ports linked with bonding. At the time of the "warning" there were two big backups running, using almost all of the network bandwidth plus intensive CPU (gzip compression). The e1000e driver is loaded with default options.

Here is the driver's output from early boot:

dmesg | grep e1000
[    8.187827] e1000e: Intel(R) PRO/1000 Network Driver - 1.4.4-k
[    8.187830] e1000e: Copyright(c) 1999 - 2011 Intel Corporation.
[    8.192398] e1000e 0000:05:00.0: Disabling ASPM L1
[    8.192414] e1000e 0000:05:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
[    8.192431] e1000e 0000:05:00.0: setting latency timer to 64
[    8.192580] e1000e 0000:05:00.0: irq 46 for MSI/MSI-X
[    8.372373] e1000e 0000:05:00.0: eth1: (PCI Express:2.5GT/s:Width x1) 68:05:ca:04:82:d7
[    8.372376] e1000e 0000:05:00.0: eth1: Intel(R) PRO/1000 Network Connection
[    8.372455] e1000e 0000:05:00.0: eth1: MAC: 1, PHY: 4, PBA No: D50861-008
[    8.372519] e1000e 0000:06:00.0: Disabling ASPM L1
[    8.372532] e1000e 0000:06:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[    8.372547] e1000e 0000:06:00.0: setting latency timer to 64
[    8.372729] e1000e 0000:06:00.0: irq 55 for MSI/MSI-X
[    8.548398] e1000e 0000:06:00.0: eth1: (PCI Express:2.5GT/s:Width x1) 68:05:ca:04:82:84
[    8.548401] e1000e 0000:06:00.0: eth1: Intel(R) PRO/1000 Network Connection
[    8.548482] e1000e 0000:06:00.0: eth1: MAC: 1, PHY: 4, PBA No: D50861-008
[   18.407016] e1000e 0000:06:00.0: irq 55 for MSI/MSI-X
[   18.460232] e1000e 0000:06:00.0: irq 55 for MSI/MSI-X
[   18.620518] e1000e 0000:05:00.0: irq 46 for MSI/MSI-X
[   18.676224] e1000e 0000:05:00.0: irq 46 for MSI/MSI-X
[   20.805097] e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   26.756994] e1000e: eth1 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
[   26.757000] e1000e 0000:06:00.0: eth1: 10/100 speed: disabling TSO
[ 6067.010851] e1000e 0000:06:00.0: irq 55 for MSI/MSI-X
[ 6067.064513] e1000e 0000:06:00.0: irq 55 for MSI/MSI-X
[ 6069.237034] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

At that time we noticed that the second NIC, for whatever reason, had stayed initialized at 100 Mbps. I stopped eth1 and restarted it hot, and that went pretty well.

One thing: this exact hardware was used with 11.4 without any warning of that kind. That's why I'm asking :-)

--
Bruno Friedmann
Ioda-Net Sàrl www.ioda-net.ch
openSUSE Member & Ambassador
GPG KEY : D5C9B751C4653227
irc: tigerfoot
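[Editor's sketch, not part of the original thread: the negotiated speed of each bond slave can be read from sysfs without ethtool. The interface names eth1/eth2 come from the dmesg above and will not exist on other hosts, which the loop handles.]

```shell
# Report the negotiated speed of each bond slave; eth1/eth2 are the
# names from the dmesg above (assumption for illustration).
for dev in eth1 eth2; do
  f="/sys/class/net/$dev/speed"
  if [ -r "$f" ]; then
    # Reading speed fails while the link is down, hence the fallback.
    echo "$dev: $(cat "$f" 2>/dev/null || echo '?') Mb/s"
  else
    echo "$dev: no such interface on this host"
  fi
done
# The bonding driver also reports each slave's speed and link state:
#   cat /proc/net/bonding/bond0
```

A slave stuck at 100 Mb/s would show up both here and in the bonding status file.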
On Tue 27-11-12 16:28:32, Bruno Friedmann wrote: [...]
One thing: this exact hardware was used with 11.4 without any warning of that kind. That's why I'm asking :-)
It might be related to the memory consumption of other parts of the system.

--
Michal Hocko
SUSE Labs
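[Editor's sketch, not part of the original thread: a quick way to compare memory headroom between the 11.4 and 12.1 installs is the top of /proc/meminfo plus the largest slab caches.]

```shell
# First lines of /proc/meminfo: total and free memory (field names
# vary slightly between kernel versions).
f=/proc/meminfo
if [ -r "$f" ]; then
  head -n 4 "$f"
else
  echo "no /proc/meminfo on this host"
fi
# Largest kernel slab caches, biggest consumers first (needs root):
#   slabtop -o | head
```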
On Tue, Nov 27, 2012 at 04:19:51PM +0100, Michal Hocko wrote:
The message is triggered when the allocation fails, but the networking stack should be able to cope with that. I am not very familiar with that area, so I might be wrong here, but it looks like it happened on the completed TCP handshake on the listener, when a full socket is allocated. I am not sure what the state of the connection is in this case, though. Jiri?
Yes, this is the listening side of the connection and it is
currently handling the packet with the final ACK (possibly
containing some payload as well).
At this point, the TCP stack needs to atomically allocate the
space for a TCP socket structure and does not find enough free
memory to do that. The packet will be dropped and the remote
(client) side will retransmit the data later
(unless the net.ipv4.tcp_abort_on_overflow is set, in which case
a RST will be sent to the client).
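[Editor's sketch, not part of the original thread: the sysctl mentioned above can be inspected like this on a Linux host with IPv4 sysctls exposed.]

```shell
f=/proc/sys/net/ipv4/tcp_abort_on_overflow
if [ -r "$f" ]; then
  # 0 (default): drop the packet silently; the client retransmits later.
  # 1: answer with a RST instead of dropping.
  echo "tcp_abort_on_overflow = $(cat "$f")"
else
  echo "IPv4 sysctls not exposed on this host"
fi
```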
--
Jiri Bohac
participants (3)
- Bruno Friedmann
- Jiri Bohac
- Michal Hocko