[Bug 462269] New: NBD: deadlock in kernel 2.6.27.7 not present in 2.6.25.18
https://bugzilla.novell.com/show_bug.cgi?id=462269

           Summary: NBD: deadlock in kernel 2.6.27.7 not present in 2.6.25.18
           Product: openSUSE 11.1
           Version: Final
          Platform: x86-64
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: jnelson-suse@jamponi.net
         QAContact: qa@suse.de
          Found By: ---

The behavior of the network block device (nbd) on the client side appears to have changed. With 2.6.25.18, a network block device that disappeared would eventually time out on the client. That does not appear to work with 2.6.27.7.

Steps to reproduce:

Starting with 2.6.25.18 (the default up-to-date kernel for openSUSE 11.0), I served a local hard disk via nbd-server. Using the same kernel on another machine, I used nbd-client to make use of that hard disk:

    nbd-client bs=4096 timeout=30 frank 12345 /dev/nbd0

If you start a 'dd' process on the client like this:

    strace -t -TT -f dd if=/dev/nbd0 of=/dev/null

and then "reboot -fn" the server (or pull the plug or whatever, but *don't* reboot or shut down in any sort of graceful fashion), then the client gets stuck in a read, *and the read never returns an error, even if the nbd device is disconnected*. I let the last one sit there for over 2 hours.

Using

    nbd-client -d /dev/nbd0

does not kill the read(2), although processes that try to use /dev/nbd0 after "nbd-client -d /dev/nbd0" correctly return errors.

This is where it gets yucky: if the nbd device is part of a raid, all further operations on /proc/mdstat and on the raid device itself (/dev/mdNN) fail.
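As a rough aid for confirming the hang described above from a second shell, the following sketch reads the per-process "State:" field from /proc/<pid>/status (see proc(5)). It assumes the stuck process is in uninterruptible sleep (state D); the helper names proc_state and list_blocked are made up for illustration and are not part of nbd or mdadm.

```shell
#!/bin/sh
# Hypothetical helpers (not part of nbd-client or mdadm).

# Print the one-letter scheduler state (R, S, D, ...) of the given PID,
# taken from the "State:" line of /proc/<pid>/status.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

# Print the PID and name of every process currently in state D
# (uninterruptible sleep). Processes that exit mid-scan are skipped.
list_blocked() {
    for status in /proc/[0-9]*/status; do
        [ -r "$status" ] || continue
        awk '/^Name:/  { name = $2 }
             /^Pid:/   { pid = $2 }
             /^State:/ { state = $2 }
             END { if (state == "D") print pid, name }' "$status" 2>/dev/null
    done
    return 0
}
```

For example, `proc_state "$(pidof dd)"` could be run on the client while the server is down (assuming a single dd process), and `list_blocked` gives a quick view of whether blocked processes are piling up. A process that shows D only briefly is doing ordinary I/O; it is a state that persists for minutes that matches this report.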
What's worse, mdadm gets stuck /in the kernel/ and cannot be interrupted. Using magic-sysrq 't', this is what mdadm is doing:

Dec 22 15:45:08 turnip kernel: mdadm D 0000000000000000 0 14624 14623
Dec 22 15:45:08 turnip kernel: ffff8800022c5bd8 0000000000000086 ffff8800022c5b18 ffffffff80a59980
Dec 22 15:45:08 turnip kernel: ffffffff80a59980 ffffffff80a567f0 ffffffff80a59980 ffffffff80a59980
Dec 22 15:45:08 turnip kernel: ffffffff80a59980 ffffffff80a59980 ffffffff80a59980 ffffffff80a59980
Dec 22 15:45:08 turnip kernel: Call Trace:
Dec 22 15:45:08 turnip kernel: [<ffffffff80408bb6>] md_super_wait+0xc6/0xe0
Dec 22 15:45:08 turnip kernel: [<ffffffff8040d5ac>] write_sb_page+0x1a3/0x1d4
Dec 22 15:45:08 turnip kernel: [<ffffffff8040d8d1>] write_page+0x1c/0x103
Dec 22 15:45:08 turnip kernel: [<ffffffff804096a8>] md_update_sb+0x20b/0x2fa
Dec 22 15:45:08 turnip kernel: [<ffffffff8040be10>] md_allow_write+0x9f/0xe4
Dec 22 15:45:08 turnip kernel: [<ffffffff8040be68>] get_bitmap_file+0x13/0xe1
Dec 22 15:45:08 turnip kernel: [<ffffffff8040c42e>] md_ioctl+0x4f8/0x897
Dec 22 15:45:08 turnip kernel: [<ffffffff8034bac1>] blkdev_driver_ioctl+0x5d/0x72
Dec 22 15:45:08 turnip kernel: [<ffffffff8034c32e>] blkdev_ioctl+0x1f5/0x217
Dec 22 15:45:08 turnip kernel: [<ffffffff802e20e7>] block_ioctl+0x1b/0x20
Dec 22 15:45:08 turnip kernel: [<ffffffff802c70b9>] vfs_ioctl+0x21/0x6c
Dec 22 15:45:08 turnip kernel: [<ffffffff802c7343>] do_vfs_ioctl+0x23f/0x255
Dec 22 15:45:08 turnip kernel: [<ffffffff802c73aa>] sys_ioctl+0x51/0x73
Dec 22 15:45:08 turnip kernel: [<ffffffff8020c37a>] system_call_fastpath+0x16/0x1b
Dec 22 15:45:08 turnip kernel: [<00007f6a54b61b57>] 0x7f6a54b61b57

and the files it has open:

turnip:~ # ls -la /proc/14624/fd
total 0
dr-x------ 2 root root  0 Dec 22 13:13 .
dr-xr-xr-x 7 root root  0 Dec 22 13:13 ..
lr-x------ 1 root root 64 Dec 22 13:13 0 -> /dev/null
l-wx------ 1 root root 64 Dec 22 13:13 1 -> pipe:[446282]
l-wx------ 1 root root 64 Dec 22 13:13 2 -> pipe:[7246]
lr-x------ 1 root root 64 Dec 22 15:47 3 -> /dev/md11
turnip:~ #

If one issues a "sync" command, the "sync" command also goes off into the kernel and never returns, and it cannot be interrupted. Processes begin piling up.

I worked with Neil Brown in private and he determined that this is not an md bug.

This is changed behavior (REGRESSION) from 2.6.25.18 (openSUSE 11.0).

I also filed a very similar kernel bug here:

http://bugzilla.kernel.org/show_bug.cgi?id=12277

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
Marcus Meissner: https://bugzilla.novell.com/show_bug.cgi?id=462269
Comment #1 from Jon Nelson (jnelson-suse@jamponi.net): https://bugzilla.novell.com/show_bug.cgi?id=462269#c1
Comment #2 from Pavel Machek (pavel@novell.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c2
Jiri Kosina: https://bugzilla.novell.com/show_bug.cgi?id=462269
Comment #3 from Pavel Machek (pavel@novell.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c3
Comment #4 from Jon Nelson (jnelson-suse@jamponi.net): https://bugzilla.novell.com/show_bug.cgi?id=462269#c4
Comment #5 from Pavel Machek (pavel@novell.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c5
Comment #6 from roland kletzing (devzero@web.de): https://bugzilla.novell.com/show_bug.cgi?id=462269#c6
Comment #7 from Paul Clements (paul.clements@steeleye.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c7
Comment #8 from Paul Clements (paul.clements@steeleye.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c8
Comment #9 from Jon Nelson (jnelson-suse@jamponi.net): https://bugzilla.novell.com/show_bug.cgi?id=462269#c9
Comment #10 from Jon Nelson (jnelson-suse@jamponi.net): https://bugzilla.novell.com/show_bug.cgi?id=462269#c10
Comment #11 from Paul Clements (paul.clements@steeleye.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c11
Comment #12 from Jon Nelson (jnelson-suse@jamponi.net): https://bugzilla.novell.com/show_bug.cgi?id=462269#c12
Comment #13 from Paul Clements (paul.clements@steeleye.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c13
Comment #14 from Jon Nelson (jnelson-suse@jamponi.net): https://bugzilla.novell.com/show_bug.cgi?id=462269#c14
Comment #15 from Paul Clements (paul.clements@steeleye.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c15
Comment #16 from Pavel Machek (pavel@novell.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c16
Paul Clements: https://bugzilla.novell.com/show_bug.cgi?id=462269
Comment #17 from Greg Kroah-Hartman (gregkh@novell.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c17
Comment #18 from Pavel Machek (pavel@novell.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c18
Comment #19 from Paul Clements (paul.clements@steeleye.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c19
Comment #20 from Greg Kroah-Hartman (gregkh@novell.com): https://bugzilla.novell.com/show_bug.cgi?id=462269#c20