[Bug 995260] New: [4.8.0-rc3-1.g0397e6f-vanilla] "mmap_truncate" testcase hung forever
http://bugzilla.suse.com/show_bug.cgi?id=995260

Bug ID: 995260
Summary: [4.8.0-rc3-1.g0397e6f-vanilla] "mmap_truncate" testcase hung forever
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: High Availability
Assignee: lmb@suse.com
Reporter: zren@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Created attachment 689212
  --> http://bugzilla.suse.com/attachment.cgi?id=689212&action=edit
hb_report

This looks like a deadlock in kernel.

1. subcase
"
2016-08-23 20:42:37 mmap_truncate 4096 131072 sparse,unwritten,inline-data data=ordered

2. stack
"
ocfs2dev1:/usr/local/ocfs2-test/log/2016-08-23_18:50 # cat /proc/31789/stack
[<ffffffff811997d7>] __lock_page+0xb7/0xc0
[<ffffffffa041ef4f>] ocfs2_write_begin_nolock+0x163f/0x1790 [ocfs2]
[<ffffffffa0445b47>] ocfs2_page_mkwrite+0x1c7/0x2a0 [ocfs2]
[<ffffffff811cee76>] do_page_mkwrite+0x66/0xc0
[<ffffffff811d3225>] handle_mm_fault+0x685/0x1350
[<ffffffff810694c8>] __do_page_fault+0x1d8/0x4d0
[<ffffffff81069827>] trace_do_page_fault+0x37/0xf0
[<ffffffff81061e69>] do_async_page_fault+0x19/0x70
[<ffffffff8170aad8>] async_page_fault+0x28/0x30
[<ffffffffffffffff>] 0xffffffffffffffff

ocfs2dev1:/usr/local/ocfs2-test/log/2016-08-23_18:50 # cat /proc/31790/stack
[<ffffffff813cc1b7>] call_rwsem_down_write_failed+0x17/0x30
[<ffffffffa04353f9>] ocfs2_truncate_file+0x129/0x740 [ocfs2]
[<ffffffffa0437f9a>] ocfs2_setattr+0x6ba/0xb00 [ocfs2]
[<ffffffff81244dfb>] notify_change+0x2bb/0x420
[<ffffffff8122440d>] do_truncate+0x5d/0x90
[<ffffffff81224737>] do_sys_ftruncate.constprop.10+0x117/0x120
[<ffffffff8122477e>] SyS_ftruncate+0xe/0x10
[<ffffffff81709676>] entry_SYSCALL_64_fastpath+0x1e/0xa8
[<ffffffffffffffff>] 0xffffffffffffffff

3. bt
"
Aug 23 20:59:09 ocfs2dev1 kernel: ffff88013fc0e060 ffff880105da8000 ffff880105da7db8 ffffffff810c4aa0
Aug 23 20:59:09 ocfs2dev1 kernel: ffff880103e92000 ffff880105da7da0 ffff880105da7cd0 ffffffff81704cac
Aug 23 20:59:09 ocfs2dev1 kernel: Call Trace:
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810c4aa0>] ? wake_atomic_t_function+0x60/0x60
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81704cac>] schedule+0x3c/0x90
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff813053f0>] jbd2_journal_commit_transaction+0x250/0x1920
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810ba723>] ? put_prev_entity+0x33/0x3e0
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810c4aa0>] ? wake_atomic_t_function+0x60/0x60
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810f0f90>] ? lock_timer_base.isra.24+0x80/0xa0
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810f105f>] ? try_to_del_timer_sync+0x4f/0x70
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff8130af0b>] kjournald2+0xbb/0x240
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810c4aa0>] ? wake_atomic_t_function+0x60/0x60
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff8130ae50>] ? commit_timeout+0x10/0x10
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810a16c9>] kthread+0xc9/0xe0
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff8170989f>] ret_from_fork+0x1f/0x40
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810a1600>] ? kthread_worker_fn+0x170/0x170
Aug 23 20:59:09 ocfs2dev1 kernel: INFO: task mmap_truncate:31789 blocked for more than 480 seconds.
Aug 23 20:59:09 ocfs2dev1 kernel: Not tainted 4.8.0-rc3-1.g0397e6f-vanilla #1
Aug 23 20:59:09 ocfs2dev1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 23 20:59:09 ocfs2dev1 kernel: mmap_truncate D ffff880135617a28 0 31789 30069 0x00000000
Aug 23 20:59:09 ocfs2dev1 kernel: ffff880135617a28 00ff880135617ad8 ffff88013a444040 ffff880104ac4240
Aug 23 20:59:09 ocfs2dev1 kernel: ffff88013926f408 ffff880135618000 ffff88013fd19680 7fffffffffffffff
Aug 23 20:59:09 ocfs2dev1 kernel: ffffffff81705510 ffff880135617b80 ffff880135617a40 ffffffff81704cac
Aug 23 20:59:09 ocfs2dev1 kernel: Call Trace:
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81705510>] ? bit_wait+0x60/0x60
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81704cac>] schedule+0x3c/0x90
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81708170>] schedule_timeout+0x2e0/0x470
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81062195>] ? kvm_clock_read+0x25/0x40
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810621b9>] ? kvm_clock_get_cycles+0x9/0x10
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810f96d1>] ? ktime_get+0x41/0xb0
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81705510>] ? bit_wait+0x60/0x60
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81704464>] io_schedule_timeout+0xa4/0x110
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff8170552b>] bit_wait_io+0x1b/0x70
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81705310>] __wait_on_bit_lock+0x50/0xa0
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff811997d7>] __lock_page+0xb7/0xc0
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810c4ae0>] ? autoremove_wake_function+0x40/0x40
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffffa041ef4f>] ocfs2_write_begin_nolock+0x163f/0x1790 [ocfs2]
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffffa0440a50>] ? ocfs2_allocate_extend_trans+0x180/0x180 [ocfs2]
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffffa0445b47>] ocfs2_page_mkwrite+0x1c7/0x2a0 [ocfs2]
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff811cee76>] do_page_mkwrite+0x66/0xc0
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff811d3225>] handle_mm_fault+0x685/0x1350
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81039dc0>] ? __fpu__restore_sig+0x70/0x530
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff810694c8>] __do_page_fault+0x1d8/0x4d0
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81069827>] trace_do_page_fault+0x37/0xf0
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81061e69>] do_async_page_fault+0x19/0x70
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff8170aad8>] async_page_fault+0x28/0x30
Aug 23 20:59:09 ocfs2dev1 kernel: INFO: task mmap_truncate:31790 blocked for more than 480 seconds.
Aug 23 20:59:09 ocfs2dev1 kernel: Not tainted 4.8.0-rc3-1.g0397e6f-vanilla #1
Aug 23 20:59:09 ocfs2dev1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 23 20:59:09 ocfs2dev1 kernel: mmap_truncate D ffff880103057c08 0 31790 31789 0x00000000
Aug 23 20:59:09 ocfs2dev1 kernel: ffff880103057c08 00ff88013658fc98 ffff88013a43c000 ffff88010bde8000
Aug 23 20:59:09 ocfs2dev1 kernel: 0000000000000001 ffff880103058000 ffff880103ee8398 ffff880103ee83b0
Aug 23 20:59:09 ocfs2dev1 kernel: ffff880103ee8398 ffff880103ee83a0 ffff880103057c20 ffffffff81704cac
Aug 23 20:59:09 ocfs2dev1 kernel: Call Trace:
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81704cac>] schedule+0x3c/0x90
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81707c63>] rwsem_down_write_failed+0x1e3/0x3a0
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffffa0420e52>] ? ocfs2_read_blocks+0x272/0x620 [ocfs2]
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff813cc1b7>] call_rwsem_down_write_failed+0x17/0x30
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81706f04>] down_write+0x24/0x40
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffffa04353f9>] ocfs2_truncate_file+0x129/0x740 [ocfs2]
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffffa0437f9a>] ocfs2_setattr+0x6ba/0xb00 [ocfs2]
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffffa048bdd6>] ? ocfs2_xattr_get+0x96/0x100 [ocfs2]
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff8124d486>] ? generic_getxattr+0x56/0x70
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81244dfb>] notify_change+0x2bb/0x420
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff8122440d>] do_truncate+0x5d/0x90
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81224737>] do_sys_ftruncate.constprop.10+0x117/0x120
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff8122477e>] SyS_ftruncate+0xe/0x10
Aug 23 20:59:09 ocfs2dev1 kernel: [<ffffffff81709676>] entry_SYSCALL_64_fastpath+0x1e/0xa8

4. hb_report attached

--
You are receiving this mail because:
You are on the CC list for the bug.
zhen ren
http://bugzilla.suse.com/show_bug.cgi?id=995260#c2
--- Comment #2 from zhen ren
zhen ren
http://bugzilla.suse.com/show_bug.cgi?id=995260#c4
--- Comment #4 from zhen ren
http://bugzilla.suse.com/show_bug.cgi?id=995260#c5
--- Comment #5 from Jan Kara
http://bugzilla.suse.com/show_bug.cgi?id=995260#c6
--- Comment #6 from zhen ren
Hum, I don't think you want to return VM_FAULT_LOCKED when the allocation failed due to ENOSPC. VM_FAULT_LOCKED means that the fault has succeeded but that is not the case. You rather want to unlock the page manually and return VM_FAULT_SIGBUS or something like that.
Yes! I don't want to cheat the VM code; otherwise we will run into another big problem. Thanks for the review!
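The return-code rule discussed above can be sketched in plain user-space C. This is only a model, not the ocfs2 patch: `model_page`, `model_page_mkwrite`, and the `MODEL_VM_FAULT_*` values are hypothetical stand-ins for the kernel's page lock and `VM_FAULT_*` codes, and the allocation result is passed in as a parameter instead of being computed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real codes are the kernel's VM_FAULT_* flags. */
enum model_fault {
    MODEL_VM_FAULT_LOCKED = 1, /* fault succeeded, page is returned locked */
    MODEL_VM_FAULT_SIGBUS = 2  /* fault failed, page must not stay locked */
};

/* Hypothetical model of a page: only the lock bit matters here. */
struct model_page {
    bool locked;
};

static void model_lock_page(struct model_page *page)
{
    page->locked = true;
}

static void model_unlock_page(struct model_page *page)
{
    page->locked = false;
}

/*
 * Model of a page_mkwrite-style handler.
 * alloc_err is 0 when the allocation succeeded, or a negative errno
 * such as -ENOSPC when it failed (an assumption of this sketch).
 */
static int model_page_mkwrite(struct model_page *page, int alloc_err)
{
    assert(page);

    model_lock_page(page);

    if (alloc_err != 0) {
        /* Failure: drop the page lock by hand and report the error. */
        model_unlock_page(page);
        return MODEL_VM_FAULT_SIGBUS;
    }

    /* Success: the caller receives the page still locked. */
    return MODEL_VM_FAULT_LOCKED;
}
```

Calling `model_page_mkwrite(&page, -ENOSPC)` in this model yields the SIGBUS stand-in with the page unlocked, while a zero `alloc_err` yields the LOCKED stand-in with the page still locked. The point is that "locked" is reported only when the fault really succeeded.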
http://bugzilla.suse.com/show_bug.cgi?id=995260#c7
--- Comment #7 from zhen ren
http://bugzilla.suse.com/show_bug.cgi?id=995260#c8
zhen ren