Comment # 4 on bug 921494 from
(In reply to Gang He from comment #3)
> Hello Eric,
> 
> I remember that you reproduced this bug before, but that it could be fixed by
> enlarging the disk volume. Do you still encounter this bug? If yes, I will
> look at it.

I think this issue should be resolved by this patch:

```
commit c33f0785bf292cf1d15f4fbe42869c63e205b21c
Author: Eric Ren <zren@suse.com>
Date:   Fri Sep 30 15:11:32 2016 -0700

    ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()

    The testcase "mmaptruncate" of ocfs2-test deadlocks occasionally.

    In this testcase, we create a 2*CLUSTER_SIZE file and mmap() on it;
    there are 2 process repeatedly performing the following operations
    respectively: one is doing memset(mmaped_addr + 2*CLUSTER_SIZE - 1, 'a',
    1), while the another is playing ftruncate(fd, 2*CLUSTER_SIZE) and then
    ftruncate(fd, CLUSTER_SIZE) again and again.

    This is the backtrace when the deadlock happens:

       __wait_on_bit_lock+0x50/0xa0
       __lock_page+0xb7/0xc0
       ocfs2_write_begin_nolock+0x163f/0x1790 [ocfs2]
       ocfs2_page_mkwrite+0x1c7/0x2a0 [ocfs2]
       do_page_mkwrite+0x66/0xc0
       handle_mm_fault+0x685/0x1350
       __do_page_fault+0x1d8/0x4d0
       trace_do_page_fault+0x37/0xf0
       do_async_page_fault+0x19/0x70
       async_page_fault+0x28/0x30

    In ocfs2_write_begin_nolock(), we first grab the pages and then allocate
    disk space for this write; ocfs2_try_to_free_truncate_log() will be
    called if -ENOSPC is returned; if we're lucky to get enough clusters,
    which is usually the case, we start over again.

    But in ocfs2_free_write_ctxt() the target page isn't unlocked, so we
    will deadlock when trying to grab the target page again.

    Also, -ENOMEM might be returned in ocfs2_grab_pages_for_write().
    Another deadlock will happen in __do_page_mkwrite() if
    ocfs2_page_mkwrite() returns non-VM_FAULT_LOCKED, and along with a
    locked target page.

    These two errors fail on the same path, so fix them by unlocking the
    target page manually before ocfs2_free_write_ctxt().

    Jan Kara helps me clear out the JBD2 part, and suggest the hint for root
    cause.

    Changes since v1:
    1. Also put ENOMEM error case into consideration.

    Link: http://lkml.kernel.org/r/1474173902-32075-1-git-send-email-zren@suse.com
    Signed-off-by: Eric Ren <zren@suse.com>
    Reviewed-by: He Gang <ghe@suse.com>
    Acked-by: Joseph Qi <joseph.qi@huawei.com>
    Cc: Mark Fasheh <mfasheh@suse.de>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
```
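
For anyone who wants to try the access pattern described in the commit message, here is a rough userspace sketch. It is not the ocfs2-test source: the function name `run_mmap_truncate`, the `CLUSTER_SIZE` and `ITERATIONS` values, and the SIGBUS handling are assumptions made for illustration (the real cluster size depends on how the volume was formatted). The SIGBUS handler is there because touching a mapped page that lies wholly beyond end-of-file raises SIGBUS, which happens whenever the file is currently truncated to one cluster.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed values for illustration only. */
#define CLUSTER_SIZE (1024 * 1024)
#define ITERATIONS   1000

static sigjmp_buf bus_jmp;

/* The write hits a page past EOF while the file is one cluster long. */
static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(bus_jmp, 1);
}

/* Returns 0 if both processes finished all iterations, -1 otherwise. */
static int run_mmap_truncate(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, 2 * CLUSTER_SIZE) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    char *addr = mmap(NULL, 2 * CLUSTER_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        close(fd);
        unlink(path);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {
        /* Writer: repeatedly dirty the last byte of the second cluster. */
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_sigbus;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGBUS, &sa, NULL);

        for (volatile int i = 0; i < ITERATIONS; i++) {
            /* savemask=1 so SIGBUS is unblocked again after the jump. */
            if (sigsetjmp(bus_jmp, 1) == 0)
                memset(addr + 2 * CLUSTER_SIZE - 1, 'a', 1);
        }
        _exit(0);
    }

    int rc = -1;
    if (pid > 0) {
        /* Truncator: grow to two clusters, shrink to one, again and again. */
        int ok = 1;
        for (int i = 0; i < ITERATIONS; i++) {
            if (ftruncate(fd, 2 * CLUSTER_SIZE) < 0 ||
                ftruncate(fd, CLUSTER_SIZE) < 0) {
                ok = 0;
                break;
            }
        }

        int status = 0;
        if (waitpid(pid, &status, 0) == pid && ok &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            rc = 0;
    }

    munmap(addr, 2 * CLUSTER_SIZE);
    close(fd);
    unlink(path);
    return rc;
}
```

Calling `run_mmap_truncate()` with a path on the filesystem under test runs one bounded round. On an ocfs2 volume without the patch, the writer may hang in the page fault path instead of returning, so run it under a timeout. Per the commit message, the deadlock only occurs when the write allocation first fails with -ENOSPC, so the volume probably has to be close to full for this to trigger.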

Eric
> 
> 
> Thanks
> Gang

