[opensuse] Re: kthread hung, controller reset...(kernel tracebacks)
Cristian Rodríguez wrote:
On 22/06/13 23:36, Constant Brouerius van Nidek wrote:

BUG: unable to handle kernel paging request at 76beb088


That explains it. There are two possibilities here:

a) In the last round of updates, you got a buggy kernel update with a regression, or

b) if there was no kernel update, your system's RAM is likely faulty. To rule out this possibility, run memtest.
---
I just got a kernel fault today as well, but mine was an unspecified
error in one of the disk subsystems.
First I got a message that one of the worker threads was hung, followed
by the other blocked tasks and the locks each of them held:
Ishtar:law> dmesg|grep 25586 >/tmp/file
Ishtar:law> cat /tmp/file
[25586.642767] INFO: task kworker/9:2:821 blocked for more than 120 seconds.
[25586.649641] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25586.657593] kworker/9:2 D ffff880613e28000 4744 821 2 0x00000000
[25586.657604] ffff88060c393c68 0000000000000046 ffff88060c393fd8 00000000001d3a00
[25586.657609] ffff88060c393fd8 00000000001d3a00 ffff8806124222f0 ffff880617bcfc20
[25586.657626] ffff8806101781a8
[25586.657627] ffff8806124222f0
[25586.657628] 0000000000000000
[25586.657628] 0000000000000009
[25586.657630] Call Trace:
[25586.657639] [<ffffffff815d2834>] schedule+0x24/0x70
[25586.657648] [<ffffffff81292360>] _xfs_log_force_lsn+0x2f0/0x330
[25586.657655] [<ffffffff810742c0>] ? try_to_wake_up+0x350/0x350
[25586.657660] [<ffffffff8128e6db>] xfs_trans_commit+0x24b/0x260
[25586.657666] [<ffffffff8123d9cb>] xfs_fs_log_dummy+0x5b/0x70
[25586.657672] [<ffffffff81292060>] xfs_log_worker+0x40/0x50
[25586.657682] [<ffffffff8105bb48>] process_one_work+0x1f8/0x570
[25586.657686] [<ffffffff8105bae6>] ? process_one_work+0x196/0x570
[25586.657690] [<ffffffff8105d49b>] worker_thread+0x11b/0x3b0
[25586.657693] [<ffffffff8105d380>] ? manage_workers+0x370/0x370
[25586.657698] [<ffffffff81062279>] kthread+0xd9/0xe0
[25586.657702] [<ffffffff810621a0>] ? kthread_stop+0x160/0x160
[25586.657705] [<ffffffff815db46c>] ret_from_fork+0x7c/0xb0
[25586.657728] [<ffffffff810621a0>] ? kthread_stop+0x160/0x160
[25586.657732] 2 locks held by kworker/9:2/821:
[25586.657743] #0: (xfs-log/%s){......}, at: [<ffffffff8105bae6>] process_one_work+0x196/0x570
[25586.657755] #1: ((&(&log->l_work)->work)){......}, at: [<ffffffff8105bae6>] process_one_work+0x196/0x570
[25586.657775] INFO: task smbd:3469 blocked for more than 120 seconds.
[25586.657777] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25586.657785] smbd D ffffffff81a10440 2848 3469 3292 0x00000000
[25586.657799] ffff880bfcd0d730 0000000000000046 ffff880bfcd0dfd8 00000000001d3a00
[25586.657811] ffff880bfcd0dfd8 00000000001d3a00 ffff880c11c3c5e0 7fffffffffffffff
[25586.657818] ffff8805fb171930 0000000000000002 0000000000000000 ffff880c11c3c5e0
[25586.657819] Call Trace:
[25586.657821] [<ffffffff815d2834>] schedule+0x24/0x70
[25586.657824] [<ffffffff815cf539>] schedule_timeout+0x229/0x320
[25586.657827] [<ffffffff81092e3e>] ? put_lock_stats.isra.22+0xe/0x40
[25586.657829] [<ffffffff81092fae>] ? lock_release_holdtime.part.23+0x13e/0x180
[25586.657831] [<ffffffff81071fad>] ? get_parent_ip+0xd/0x50
[25586.657833] [<ffffffff815d1110>] __down_common+0xa5/0xf8
[25586.657836] [<ffffffff81236525>] ? _xfs_buf_find+0x165/0x2e0
[25586.657837] [<ffffffff815d11c2>] __down+0x18/0x1a
[25586.657839] [<ffffffff81067ecc>] down+0x3c/0x50
[25586.657841] [<ffffffff812362a7>] xfs_buf_lock+0x37/0x150
[25586.657842] [<ffffffff81236525>] _xfs_buf_find+0x165/0x2e0
[25586.657844] [<ffffffff812366c5>] xfs_buf_get_map+0x25/0x250
[25586.657847] [<ffffffff812977b1>] xfs_trans_get_buf_map+0x111/0x1d0
[25586.657849] [<ffffffff8126f0fc>] xfs_da_get_buf+0x8c/0xc0
[25586.657852] [<ffffffff81255853>] xfs_attr_leaf_create+0x43/0x180
[25586.657854] [<ffffffff812589a5>] xfs_attr_shortform_to_leaf+0x185/0x310
[25586.657855] [<ffffffff8124d1a7>] ? kmem_zone_alloc+0x67/0xf0
[25586.657857] [<ffffffff8124d256>] ? kmem_zone_zalloc+0x26/0x30
[25586.657859] [<ffffffff81254aaa>] xfs_attr_set_int+0x35a/0x430
[25586.657860] [<ffffffff81254bff>] xfs_attr_set+0x7f/0x90
[25586.657862] [<ffffffff8129b2c4>] xfs_set_acl+0xb4/0x200
[25586.657864] [<ffffffff8129ba38>] xfs_inherit_acl+0x98/0xc0
[25586.657866] [<ffffffff81243e5a>] xfs_vn_mknod+0xda/0x1a0
[25586.657868] [<ffffffff81243f4e>] xfs_vn_create+0xe/0x10
[25586.657871] [<ffffffff81163835>] vfs_create+0x95/0xd0
[25586.657872] [<ffffffff811640a3>] do_last.isra.52+0x833/0xd00
[25586.657874] [<ffffffff81165dba>] path_openat.isra.53+0xaa/0x4d0
[25586.657876] [<ffffffff810943bf>] ? __lock_acquire.isra.30+0x30f/0xa70
[25586.657878] [<ffffffff810943bf>] ? __lock_acquire.isra.30+0x30f/0xa70
[25586.657880] [<ffffffff81076475>] ? sched_clock_cpu+0xb5/0x100
[25586.657882] [<ffffffff81167483>] do_filp_open+0x33/0x80
[25586.657884] [<ffffffff815d3ddc>] ? _raw_spin_unlock+0x2c/0x50
[25586.657887] [<ffffffff811747bf>] ? __alloc_fd+0x10f/0x140
[25586.657889] [<ffffffff811578b0>] do_sys_open+0xe0/0x1c0
[25586.657891] [<ffffffff811579ac>] sys_open+0x1c/0x20
[25586.657892] [<ffffffff815db512>] system_call_fastpath+0x16/0x1b
[25586.657893] 4 locks held by smbd/3469:
[25586.657898] #0: (sb_writers#3){......}, at: [<ffffffff8117690f>] mnt_want_write+0x1f/0x50
[25586.657901] #1: (&type->i_mutex_dir_key){......}, at: [<ffffffff81163b6f>] do_last.isra.52+0x2ff/0xd00
[25586.657904] #2: (sb_internal){......}, at: [<ffffffff8128d69f>] xfs_trans_alloc+0x1f/0x40
[25586.657908] #3: (&(&ip->i_lock)->mr_lock){......}, at: [<ffffffff8127f312>] xfs_ilock+0x112/0x140
[25586.657912] INFO: task squid:5743 blocked for more than 120 seconds.
[25586.657913] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25586.657916] squid D ffff880c1320c5e0 4480 5743 5741 0x00000000
[25586.657918] ffff880bf8fd7b00 0000000000000046 ffff880bf8fd7fd8 00000000001d3a00
[25586.657920] ffff880bf8fd7fd8 00000000001d3a00 ffff880bfbff22f0 7fffffffffffffff
[25586.657922] ffff88060cb02330 0000000000000002 0000000000000000 ffff880bfbff22f0
[25586.657922] Call Trace:
[25586.657924] [<ffffffff815d2834>] schedule+0x24/0x70
[25586.657926] [<ffffffff815cf539>] schedule_timeout+0x229/0x320
[25586.657928] [<ffffffff81092e3e>] ? put_lock_stats.isra.22+0xe/0x40
[25586.657929] [<ffffffff81092fae>] ? lock_release_holdtime.part.23+0x13e/0x180
[25586.657931] [<ffffffff81071fad>] ? get_parent_ip+0xd/0x50
[25586.657932] [<ffffffff815d1110>] __down_common+0xa5/0xf8
[25586.657934] [<ffffffff81236525>] ? _xfs_buf_find+0x165/0x2e0
[25586.657935] [<ffffffff815d11c2>] __down+0x18/0x1a
[25586.657937] [<ffffffff81067ecc>] down+0x3c/0x50
[25586.657938] [<ffffffff812362a7>] xfs_buf_lock+0x37/0x150
[25586.657940] [<ffffffff81236525>] _xfs_buf_find+0x165/0x2e0
[25586.657942] [<ffffffff812366c5>] xfs_buf_get_map+0x25/0x250
[25586.657943] [<ffffffff81237597>] xfs_buf_read_map+0x27/0x1b0
[25586.657945] [<ffffffff81297cb9>] xfs_trans_read_buf_map+0x2b9/0x540
[25586.657946] [<ffffffff8126f550>] xfs_da_read_buf+0xb0/0x200
[25586.657948] [<ffffffff810764ff>] ? local_clock+0x3f/0x50
[25586.657950] [<ffffffff8127287b>] xfs_dir2_block_read+0x2b/0x30
[25586.657952] [<ffffffff8127349e>] xfs_dir2_block_getdents+0x5e/0x1e0
[25586.657953] [<ffffffff81169fa0>] ? filldir64+0xf0/0xf0
[25586.657955] [<ffffffff81169fa0>] ? filldir64+0xf0/0xf0
[25586.657956] [<ffffffff81271ea8>] xfs_readdir+0x118/0x160
[25586.657958] [<ffffffff81169fa0>] ? filldir64+0xf0/0xf0
[25586.657959] [<ffffffff8123abe0>] xfs_file_readdir+0x30/0x40
[25586.657961] [<ffffffff81169da2>] vfs_readdir+0xa2/0xe0
[25586.657963] [<ffffffff8116a181>] sys_getdents+0x81/0x100
[25586.657964] [<ffffffff815db512>] system_call_fastpath+0x16/0x1b
[25586.657965] 1 lock held by squid/5743:
[25586.657968] #0: (&type->i_mutex_dir_key){......}, at: [<ffffffff81169d6a>] vfs_readdir+0x6a/0xe0
[25586.657985] INFO: task find:34976 blocked for more than 120 seconds.
[25586.657986] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25586.657989] find D ffff880c131f22f0 5096 34976 34971 0x00000000
[25586.657991] ffff8806089b78e0 0000000000000046 ffff8806089b7fd8 00000000001d3a00
[25586.657993] ffff8806089b7fd8 00000000001d3a00 ffff8806084545e0 7fffffffffffffff
[25586.657994] ffff880608544188 ffff880608544180 ffff8806084545e0 ffff8806121d2840
[25586.657995] Call Trace:
[25586.657997] [<ffffffff815d2834>] schedule+0x24/0x70
[25586.657998] [<ffffffff815cf539>] schedule_timeout+0x229/0x320
[25586.658000] [<ffffffff81092e3e>] ? put_lock_stats.isra.22+0xe/0x40
[25586.658002] [<ffffffff81092fae>] ? lock_release_holdtime.part.23+0x13e/0x180
[25586.658003] [<ffffffff81071fad>] ? get_parent_ip+0xd/0x50
[25586.658005] [<ffffffff815d1a74>] wait_for_completion+0x94/0x110
[25586.658007] [<ffffffff810742c0>] ? try_to_wake_up+0x350/0x350
[25586.658008] [<ffffffff81237562>] ? _xfs_buf_read+0x32/0x40
[25586.658010] [<ffffffff8123736e>] xfs_buf_iowait+0x2e/0x130
[25586.658012] [<ffffffff81297cb9>] ? xfs_trans_read_buf_map+0x2b9/0x540
[25586.658013] [<ffffffff81237562>] _xfs_buf_read+0x32/0x40
[25586.658015] [<ffffffff81237689>] xfs_buf_read_map+0x119/0x1b0
[25586.658016] [<ffffffff81297cb9>] xfs_trans_read_buf_map+0x2b9/0x540
[25586.658018] [<ffffffff8127fd09>] xfs_imap_to_bp+0x59/0xc0
[25586.658020] [<ffffffff81283675>] xfs_iread+0x65/0x160
[25586.658021] [<ffffffff8123f0ac>] xfs_iget+0x34c/0x990
[25586.658023] [<ffffffff8123ee94>] ? xfs_iget+0x134/0x990
[25586.658025] [<ffffffff8124b226>] xfs_lookup+0x106/0x130
[25586.658026] [<ffffffff81244159>] xfs_vn_lookup+0x49/0x90
[25586.658028] [<ffffffff81161a28>] lookup_real+0x18/0x50
[25586.658030] [<ffffffff811624be>] __lookup_hash+0x2e/0x40
[25586.658032] [<ffffffff815cb36f>] lookup_slow+0x3f/0xa4
[25586.658033] [<ffffffff81166a59>] path_lookupat+0x779/0x7e0
[25586.658036] [<ffffffff8112345d>] ? handle_pte_fault+0x8d/0x850
[25586.658040] [<ffffffff811496e8>] ? kmem_cache_alloc+0x28/0x160
[25586.658041] [<ffffffff8116304d>] ? final_putname+0x1d/0x40
[25586.658043] [<ffffffff81166ae1>] filename_lookup.isra.44+0x21/0x60
[25586.658044] [<ffffffff811673ff>] user_path_at_empty+0x4f/0x90
[25586.658047] [<ffffffff815d772c>] ? __do_page_fault+0x1ec/0x5f0
[25586.658049] [<ffffffff8106a233>] ? lg_local_unlock+0x33/0x60
[25586.658051] [<ffffffff8116744c>] user_path_at+0xc/0x10
[25586.658052] [<ffffffff8115cc9d>] vfs_fstatat+0x4d/0xa0
[25586.658054] [<ffffffff8115d225>] sys_newfstatat+0x15/0x40
[25586.658057] [<ffffffff81300a3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[25586.658059] [<ffffffff815db512>] system_call_fastpath+0x16/0x1b
[25586.658060] 1 lock held by find/34976:
[25586.658062] #0: (&type->i_mutex_dir_key){......}, at: [<ffffffff815cb360>] lookup_slow+0x30/0xa4
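
A trace this long is easier to compare across tasks once it is boiled down to "which task, and where was it sleeping". The following is only a rough sketch of how that could be done, not a tool anyone on this thread used. The function name summarize_hung_tasks and the regular expressions are made up for illustration, and they assume the dmesg layout shown above (other kernel versions may print frames differently). Frames marked with "?" are skipped, since the kernel prints those for stack addresses it could not confirm as part of the call chain.

```python
import re

# Hypothetical patterns, written against the log layout pasted above.
HUNG_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (?P<guess>\? )?(?P<sym>[\w.]+)\+0x")
LOCKS_RE = re.compile(r"\d+ locks? held by")


def summarize_hung_tasks(log_text, max_frames=3):
    """Return one dict per hung-task report with its first reliable frames."""
    reports = []
    current = None
    for line in log_text.splitlines():
        hung = HUNG_RE.search(line)
        if hung:
            # A new "blocked for more than N seconds" report starts here.
            current = {
                "task": hung.group("name"),
                "pid": int(hung.group("pid")),
                "timeout": int(hung.group("secs")),
                "frames": [],
            }
            reports.append(current)
            continue
        if LOCKS_RE.search(line):
            # The lock list also contains addresses; stop collecting frames.
            current = None
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None and not frame.group("guess"):
            if len(current["frames"]) < max_frames:
                current["frames"].append(frame.group("sym"))
    return reports


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 hung_summary.py   (file name is arbitrary)
    for report in summarize_hung_tasks(sys.stdin.read()):
        chain = " <- ".join(report["frames"])
        print(f'{report["task"]} (pid {report["pid"]}): {chain}')
```

The frames are listed innermost first, so the first entry after schedule is usually the wait the task is stuck in.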

===============================
That caused everything on the system to lock up, and was followed by a
reset of the controller behind sda:

[25632.193805] sd 0:2:0:0: [sda] megasas: RESET cmd=8a retries=0
[25632.193812] megasas: [ 0]waiting for 1 commands to complete
[25633.195216] megaraid_sas: no pending cmds after reset
[25633.195222] megasas: reset successful
[25633.204745] smbd (3469) used greatest stack depth: 2848 bytes left
---
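
The dmesg timestamps are seconds since boot, so the gap between the first hung-task report and the controller reset can be read straight off the log. Here is a small sketch of that arithmetic; the helper names first_timestamp and seconds_between are invented for this example, and the search strings are simply substrings taken from the lines quoted above. For the excerpts in this mail it works out to roughly 45.6 seconds. Bear in mind that the hung-task message itself only appears after a task has already been blocked for at least 120 seconds.

```python
import re

# Matches the leading "[seconds.microseconds]" stamp that dmesg prints.
TS_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]")


def first_timestamp(log_text, needle):
    """Timestamp of the first line containing `needle`, or None if absent."""
    for line in log_text.splitlines():
        if needle in line:
            match = TS_RE.match(line)
            if match:
                return float(match.group("ts"))
    return None


def seconds_between(log_text, first_needle, second_needle):
    """Seconds from the first `first_needle` line to the first `second_needle` line."""
    start = first_timestamp(log_text, first_needle)
    end = first_timestamp(log_text, second_needle)
    if start is None or end is None:
        return None
    return end - start


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 gap.py   (file name is arbitrary)
    gap = seconds_between(sys.stdin.read(), "blocked for more than", "megasas: RESET")
    print("no match" if gap is None else f"{gap:.3f} s")
```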
Then the system went back to normal.

I just upgraded the Linux kernel from 3.9.6 to 3.9.7.

I didn't give it much thought until I saw this message.
