[Bug 831920] New: CephFS deadlocks under multi-node POSIX locking workload
https://bugzilla.novell.com/show_bug.cgi?id=831920
https://bugzilla.novell.com/show_bug.cgi?id=831920#c0

           Summary: CephFS deadlocks under multi-node POSIX locking workload
    Classification: openSUSE
           Product: openSUSE Factory
           Version: 13.1 Milestone 3
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: High Availability
        AssignedTo: lmb@suse.com
        ReportedBy: ddiss@suse.com
         QAContact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

CTDB's ping_pong workload can be used to test POSIX locking coherency on
clustered file systems. When run across multiple CephFS cluster nodes, the
test deadlocks.

Server setup
------------
3 Ceph nodes (Hyper-V VMs)
2 OSDs per node
  - Each OSD working with a local 25G XFS mount-point
1 Monitor per node
1 MDS for the entire cluster

Client setup
------------
2 CephFS clients
  - 3.11.0-rc2-1.g00cdcf9-default kernel
1 CephFS mount per node (at /media)
1 ping_pong invocation per node
  - ctdb version 2.3-3.1
  - # ping_pong /media/lockme 4
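For reference, ping_pong exercises locking coherency by having each node
repeatedly take and release fcntl() byte-range write locks on neighbouring
single-byte offsets of the shared file, so every acquisition depends on a
release from another node. Below is a minimal sketch of that kind of loop in
C++; it is an approximation for illustration only, not the actual ctdb
ping_pong.c, and the default path and lock count are placeholders matching
the invocation above.

// Rough sketch of a ping_pong-style byte-range lock cycle (assumption:
// this only approximates the real ctdb test; see ctdb's ping_pong.c).
#include <cstdio>
#include <cstdlib>
#include <fcntl.h>

static void set_lock(int fd, short type, off_t start)
{
    struct flock fl{};
    fl.l_type = type;        // F_WRLCK to acquire, F_UNLCK to release
    fl.l_whence = SEEK_SET;
    fl.l_start = start;      // single-byte ranges, like the MDS log entries
    fl.l_len = 1;
    if (fcntl(fd, F_SETLKW, &fl) < 0) {  // blocking wait
        perror("fcntl");
        exit(1);
    }
}

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/media/lockme";  // placeholder
    const int num_locks = argc > 2 ? atoi(argv[2]) : 4;       // nodes + 1
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) { perror("open"); return 1; }

    for (unsigned long n = 0; ; n++) {
        set_lock(fd, F_WRLCK, (n + 1) % num_locks);  // grab the next byte
        set_lock(fd, F_UNLCK, n % num_locks);        // drop the previous one
    }                                                // (first unlock is a no-op)
}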
https://bugzilla.novell.com/show_bug.cgi?id=831920
https://bugzilla.novell.com/show_bug.cgi?id=831920#c1

David Disseldorp <ddiss@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |samba-maintainers@SuSE.de
         AssignedTo|lmb@suse.com                |ddiss@suse.com

--- Comment #1 from David Disseldorp <ddiss@suse.com> 2013-07-29 10:03:58 UTC ---
Should have mentioned, all nodes are running ceph version 0.66.
https://bugzilla.novell.com/show_bug.cgi?id=831920
https://bugzilla.novell.com/show_bug.cgi?id=831920#c2

--- Comment #2 from David Disseldorp <ddiss@suse.com> 2013-07-29 10:04:48 UTC ---
I expect http://tracker.ceph.com/issues/2825 is the corresponding upstream bug.
https://bugzilla.novell.com/show_bug.cgi?id=831920
https://bugzilla.novell.com/show_bug.cgi?id=831920#c3

--- Comment #3 from David Disseldorp <ddiss@suse.com> 2013-07-29 10:08:40 UTC ---
Last week's analysis of Ceph MDS state leading up to the deadlock:

bongo gets 0:1 lock:
2013-07-25 13:43:48.357036 7fb2479dc700 0 mds.0.server handle_client_file_setlock: start: 0, length: 1, client: 4110, pid: 40767, type: 2
2013-07-25 13:43:48.357068 7fb2479dc700 10 mds.0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=1, waiting_locks.size()=1, client_held_lock_counts -- {4110=1} client_waiting_lock_counts -- {4110=1}
held_locks -- start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2

bongo releases 2:1 lock:
2013-07-25 13:43:48.359847 7fb2479dc700 0 mds.0.server handle_client_file_setlock: start: 2, length: 1, client: 4110, pid: 40767, type: 4
2013-07-25 13:43:48.359878 7fb2479dc700 10 mds.0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=2, waiting_locks.size()=1, client_held_lock_counts -- {4110=2} client_waiting_lock_counts -- {4110=1}
held_locks -- start: 0, length: 1, client: 4110, pid: 40767, type: 2
              start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2

bongo gets 1:1 lock:
2013-07-25 13:43:48.368759 7fb2479dc700 0 mds.0.server handle_client_file_setlock: start: 1, length: 1, client: 4110, pid: 40767, type: 2
2013-07-25 13:43:48.368792 7fb2479dc700 10 mds.0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=1, waiting_locks.size()=1, client_held_lock_counts -- {4110=1} client_waiting_lock_counts -- {4110=1}
held_locks -- start: 0, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2

congo attempts to get 0:1 lock:
2013-07-25 13:43:48.369818 7fb2479dc700 4 mds.0.server handle_client_request client_request(client.4116:229807 setfilelockrule 1, type 2, pid 7899, pid_ns 18446612133027018560, start 0, length 1, wait 1 #10000000002)
...
2013-07-25 13:43:48.371236 7fb2479dc700 0 mds.0.server handle_client_file_setlock: start: 0, length: 1, client: 4116, pid: 7899, type: 2
2013-07-25 13:43:48.371269 7fb2479dc700 10 mds.0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=1, waiting_locks.size()=1, client_held_lock_counts -- {4110=1} client_waiting_lock_counts -- {4110=1}
held_locks -- start: 0, length: 2, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2
2013-07-25 13:43:48.371310 7fb2479dc700 10 mds.0.server lock attempt on start: 0, length: 1, client: 4116, pid: 7899, type: 2
2013-07-25 13:43:48.371355 7fb2479dc700 10 mds.0.server it failed on this attempt
2013-07-25 13:43:48.371386 7fb2479dc700 10 mds.0.server added to waiting list

bongo releases 0:1 lock:
2013-07-25 13:43:48.373597 7fb2479dc700 0 mds.0.server handle_client_file_setlock: start: 0, length: 1, client: 4110, pid: 40767, type: 4
2013-07-25 13:43:48.373630 7fb2479dc700 10 mds.0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=1, waiting_locks.size()=2, client_held_lock_counts -- {4110=1} client_waiting_lock_counts -- {4110=1,4116=1}
held_locks -- start: 0, length: 2, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
                 start: 1, length: 1, client: 4110, pid: 40704, type: 2

mds recognises waiting lock:
2013-07-25 13:43:48.374757 7fb2479dc700 10 mds.0.server state after lock change: ceph_lock_state_t. held_locks.size()=1, waiting_locks.size()=2, client_held_lock_counts -- {4110=1} client_waiting_lock_counts -- {4110=1,4116=1}
held_locks -- start: 1, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
                 start: 1, length: 1, client: 4110, pid: 40704, type: 2
2013-07-25 13:43:48.374816 7fb2479dc700 7 mds.0.1 mds has 1 queued contexts
2013-07-25 13:43:48.374848 7fb2479dc700 10 mds.0.1 0x2e16880
2013-07-25 13:43:48.374878 7fb2479dc700 10 mds.0.1 finish 0x2e16880
2013-07-25 13:43:48.374908 7fb2479dc700 7 mds.0.server dispatch_client_request client_request(client.4116:229807 setfilelockrule 1, type 2, pid 7899, pid_ns 18446612133027018560, start 0, length 1, wait 1 #10000000002)
2013-07-25 13:43:48.374944 7fb2479dc700 10 mds.0.server rdlock_path_pin_ref request(client.4116:229807 cr=0x2e3db80) #10000000002

mds receives queued 0:1 lock req:
2013-07-25 13:43:48.376211 7fb2479dc700 0 mds.0.server handle_client_file_setlock: start: 0, length: 1, client: 4116, pid: 7899, type: 2
2013-07-25 13:43:48.376243 7fb2479dc700 10 mds.0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=1, waiting_locks.size()=2, client_held_lock_counts -- {4110=1} client_waiting_lock_counts -- {4110=1,4116=1}
held_locks -- start: 1, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
                 start: 1, length: 1, client: 4110, pid: 40704, type: 2
2013-07-25 13:43:48.376280 7fb2479dc700 10 mds.0.server lock attempt on start: 0, length: 1, client: 4116, pid: 7899, type: 2

congo should be notified of acquisition:
2013-07-25 13:43:48.376317 7fb2479dc700 10 mds.0.server reply_request 0 (Success) client_request(client.4116:229807 setfilelockrule 1, type 2, pid 7899, pid_ns 18446612133027018560, start 0, length 1, wait 1 #10000000002)

congo gets 0:1 lock!:
2013-07-25 13:43:48.377761 7fb2479dc700 10 mds.0.server state after lock change: ceph_lock_state_t. held_locks.size()=2, waiting_locks.size()=1, client_held_lock_counts -- {4110=1,4116=1} client_waiting_lock_counts -- {4110=1}
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
              start: 1, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2

bongo gets 2:1 lock:
2013-07-25 13:43:48.379945 7fb2479dc700 0 mds.0.server handle_client_file_setlock: start: 2, length: 1, client: 4110, pid: 40767, type: 2
2013-07-25 13:43:48.379985 7fb2479dc700 10 mds.0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=2, waiting_locks.size()=1, client_held_lock_counts -- {4110=1,4116=1} client_waiting_lock_counts -- {4110=1}
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
              start: 1, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2

congo attempts to get 1:1 lock:
2013-07-25 13:43:48.381024 7fb2479dc700 4 mds.0.server handle_client_request client_request(client.4116:229808 setfilelockrule 1, type 2, pid 7899, pid_ns 18446612133027018560, start 1, length 1, wait 1 #10000000002)
...
2013-07-25 13:43:48.382344 7fb2479dc700 0 mds.0.server handle_client_file_setlock: start: 1, length: 1, client: 4116, pid: 7899, type: 2
2013-07-25 13:43:48.382374 7fb2479dc700 10 mds.0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=2, waiting_locks.size()=1, client_held_lock_counts -- {4110=1,4116=1} client_waiting_lock_counts -- {4110=1}
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
              start: 1, length: 2, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2
2013-07-25 13:43:48.382408 7fb2479dc700 10 mds.0.server lock attempt on start: 1, length: 1, client: 4116, pid: 7899, type: 2
2013-07-25 13:43:48.382439 7fb2479dc700 10 mds.0.server it failed on this attempt
2013-07-25 13:43:48.382467 7fb2479dc700 10 mds.0.server added to waiting list

congo 1:1 attempt added to wait list:
2013-07-25 13:43:48.383182 7fb2479dc700 10 mds.0.server state after lock change: ceph_lock_state_t. held_locks.size()=2, waiting_locks.size()=2, client_held_lock_counts -- {4110=1,4116=1} client_waiting_lock_counts -- {4110=1,4116=1}
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
              start: 1, length: 2, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2
                 start: 1, length: 1, client: 4116, pid: 7899, type: 2

bongo releases 1:1 lock:
2013-07-25 13:43:48.384523 7fb2479dc700 0 mds.0.server handle_client_file_setlock: start: 1, length: 1, client: 4110, pid: 40767, type: 4
2013-07-25 13:43:48.384553 7fb2479dc700 10 mds.0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=2, waiting_locks.size()=2, client_held_lock_counts -- {4110=1,4116=1} client_waiting_lock_counts -- {4110=1,4116=1}
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
              start: 1, length: 2, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2
                 start: 1, length: 1, client: 4116, pid: 7899, type: 2

mds recognises waiting 1:1 lock:
2013-07-25 13:43:48.385506 7fb2479dc700 10 mds.0.server state after lock change: ceph_lock_state_t. held_locks.size()=2, waiting_locks.size()=2, client_held_lock_counts -- {4110=1,4116=1} client_waiting_lock_counts -- {4110=1,4116=1}
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
              start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2
                 start: 1, length: 1, client: 4116, pid: 7899, type: 2
2013-07-25 13:43:48.385543 7fb2479dc700 7 mds.0.1 mds has 1 queued contexts
2013-07-25 13:43:48.385572 7fb2479dc700 10 mds.0.1 0x2e16f80
2013-07-25 13:43:48.385600 7fb2479dc700 10 mds.0.1 finish 0x2e16f80
2013-07-25 13:43:48.385628 7fb2479dc700 7 mds.0.server dispatch_client_request client_request(client.4116:229808 setfilelockrule 1, type 2, pid 7899, pid_ns 18446612133027018560, start 1, length 1, wait 1 #10000000002)

mds receives queued 1:1 lock:
2013-07-25 13:43:48.386944 7fb2479dc700 0 mds.0.server handle_client_file_setlock: start: 1, length: 1, client: 4116, pid: 7899, type: 2
2013-07-25 13:43:48.386975 7fb2479dc700 10 mds.0.server state prior to lock change: ceph_lock_state_t. held_locks.size()=2, waiting_locks.size()=2, client_held_lock_counts -- {4110=1,4116=1} client_waiting_lock_counts -- {4110=1,4116=1}
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
              start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2
                 start: 1, length: 1, client: 4116, pid: 7899, type: 2
2013-07-25 13:43:48.387010 7fb2479dc700 10 mds.0.server lock attempt on start: 1, length: 1, client: 4116, pid: 7899, type: 2

congo should be notified of acquisition:
2013-07-25 13:43:48.387044 7fb2479dc700 10 mds.0.server reply_request 0 (Success) client_request(client.4116:229808 setfilelockrule 1, type 2, pid 7899, pid_ns 18446612133027018560, start 1, length 1, wait 1 #10000000002)

congo gets lock!:
2013-07-25 13:43:48.388835 7fb2479dc700 10 mds.0.server state after lock change: ceph_lock_state_t. held_locks.size()=2, waiting_locks.size()=2, client_held_lock_counts -- {4110=1,4116=1} client_waiting_lock_counts -- {4110=1,4116=1}
held_locks -- start: 0, length: 2, client: 4116, pid: 7899, type: 2
              start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4110, pid: 40704, type: 2
                 start: 1, length: 1, client: 4116, pid: 7899, type: 2

!!!!! - the granted 4116@1:1 lock stays in the waiting_locks list
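The final transition above is where things go wrong: the granted 4116@1:1
request is merged into the neighbouring 4116@0:1 held lock to form 4116@0:2,
but its waiting_locks entry is never erased, so the remaining 4110 waiter is
never woken and both ping_pong instances block. The following toy model
(invented types and containers, for illustration only, not Ceph's flock code)
shows how doing the neighbour merge before the erase makes the lookup miss,
because the erase is keyed on start/length values that the merge has already
rewritten:

// Simplified illustration only -- invented types, not Ceph's ceph_lock_state_t.
#include <cstdio>
#include <map>

struct lock_t { long start; long length; long client; };

std::multimap<long, lock_t> held_locks;     // keyed by start offset
std::multimap<long, lock_t> waiting_locks;  // keyed by start offset

// Erase the waiter whose start/length/client match the given lock.
static bool remove_waiting(const lock_t &l)
{
    auto range = waiting_locks.equal_range(l.start);
    for (auto it = range.first; it != range.second; ++it) {
        if (it->second.length == l.length && it->second.client == l.client) {
            waiting_locks.erase(it);
            return true;
        }
    }
    return false;  // not found -- the stale entry stays behind
}

// Merge the new lock with an adjacent held lock, mutating it in place.
// This loosely mimics the neighbour merge described in the fix below;
// adjacency checking is deliberately minimal for length-1 locks.
static void adjust_locks(lock_t &l)
{
    auto it = held_locks.find(l.start - 1);      // left neighbour, e.g. 0:1
    if (it != held_locks.end() && it->second.client == l.client) {
        l.start = it->second.start;              // 1:1 becomes 0:2
        l.length += it->second.length;
        held_locks.erase(it);
    }
    held_locks.insert({l.start, l});
}

int main()
{
    // State from the end of the trace: client 4116 holds 0:1 and waits on 1:1.
    held_locks.insert({0, {0, 1, 4116}});
    waiting_locks.insert({1, {1, 1, 4116}});

    lock_t granted{1, 1, 4116};            // the queued 1:1 request is grantable
    adjust_locks(granted);                 // buggy order: merge first...
    bool found = remove_waiting(granted);  // ...then look for 1:1 under 0:2

    printf("waiter removed: %s, waiting_locks left: %zu\n",
           found ? "yes" : "no", waiting_locks.size());
    // Prints "waiter removed: no, waiting_locks left: 1" -- the stale entry
    // the trace flags with "lock stays in the waiting_locks list".
}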
https://bugzilla.novell.com/show_bug.cgi?id=831920
https://bugzilla.novell.com/show_bug.cgi?id=831920#c4

--- Comment #4 from David Disseldorp <ddiss@suse.com> 2013-07-29 15:01:17 UTC ---
Created an attachment (id=549917)
 --> (http://bugzilla.novell.com/attachment.cgi?id=549917)
Proposed fix

I've tested this patch locally and found it to prevent deadlock during
multi-node ping_pong:

mds: remove waiting lock before merging with neighbours

CephFS currently deadlocks under CTDB's ping_pong POSIX locking test when
run concurrently on multiple nodes.
The deadlock is caused by failed removal of a waiting_locks entry when the
waiting lock is merged with an existing lock, e.g:

Initial MDS state (two clients, same file):
held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
              start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2

Waiting lock entry 4116@1:1 fires:
handle_client_file_setlock: start: 1, length: 1, client: 4116, pid: 7899,
type: 2

MDS state after lock is obtained:
held_locks -- start: 0, length: 2, client: 4116, pid: 7899, type: 2
              start: 2, length: 1, client: 4110, pid: 40767, type: 2
waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2

Note that the waiting 4116@1:1 lock entry is merged with the existing
4116@0:1 held lock to become a 4116@0:2 held lock. However, the now handled
4116@1:1 waiting_locks entry remains.

When handling a lock request, the MDS calls adjust_locks() to merge the new
lock with available neighbours. If the new lock is merged, then the
waiting_locks entry is not located in the subsequent remove_waiting() call.

This fix ensures that the waiting_locks entry is removed prior to
modification during merge.
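In terms of the toy model shown after the trace above, the shape of the fix
is simply to reorder the two steps: erase the waiting_locks entry while the
granted lock still carries its original start and length, and only then let
the neighbour merge mutate it. This is a hedged sketch of that ordering
using the toy model's invented helpers, not the upstream flock.cc diff
itself:

// Continues the simplified model above (assumes its lock_t, remove_waiting()
// and adjust_locks()); illustration of the corrected ordering only.
static void grant_queued_lock(lock_t granted)
{
    remove_waiting(granted);   // erase the 1:1 waiter while start/length
                               // still match the original request
    adjust_locks(granted);     // now the merge into 0:2 cannot orphan it
}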
https://bugzilla.novell.com/show_bug.cgi?id=831920
https://bugzilla.novell.com/show_bug.cgi?id=831920#c5

David Disseldorp <ddiss@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #5 from David Disseldorp <ddiss@suse.com> 2013-08-27 11:01:49 UTC ---
The fix has been reviewed and merged upstream with:

commit 476e4902907dfadb3709ba820453299ececf990b
Author: David Disseldorp <ddiss@suse.de>
Date:   Mon Jul 29 17:05:44 2013 +0200

    mds: remove waiting lock before merging with neighbours

Gregory Farnum also back-ported the change to the Dumpling (v0.67.x) branch.