[Bug 1007716] New: scsi scan work was stuck forever when reboot storage controller
http://bugzilla.novell.com/show_bug.cgi?id=1007716

            Bug ID: 1007716
           Summary: scsi scan work was stuck forever when reboot storage
                    controller
    Classification: openSUSE
           Product: openSUSE.org
           Version: unspecified
          Hardware: x86-64
                OS: SLES 11
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: 3rd party software
          Assignee: opensuse-communityscreening@forge.provo.novell.com
          Reporter: liuteng.liu@huawei.com
        QA Contact: opensuse-communityscreening@forge.provo.novell.com
          Found By: ---
           Blocker: ---

My server runs SLES 11 SP2 and has an Emulex HBA card connected to SAN
storage. I mapped 100 LUNs to the host server. Rebooting storage
controller A causes the SCSI scan work to hang with the following call
trace:

kbox: Hung task kworker/u:5:1023 is in D state,more than 120 seconds!
kworker/u:5     D  1023  1023     2
Call Trace:
 [<ffffffff803f7fad>] schedule_timeout+0x21d/0x2c0
 [<ffffffff803f6e95>] wait_for_common+0xe5/0x210
 [<ffffffff801faf58>] blk_execute_rq+0xb8/0xf0
 [<ffffffffa0087ab5>] alua_vpd_inquiry+0xb5/0x3a0 [scsi_dh_alua]
 [<ffffffffa0087e4e>] alua_initialize+0xae/0x130 [scsi_dh_alua]
 [<ffffffffa008833e>] alua_bus_attach+0x6e/0x19c [scsi_dh_alua]
 [<ffffffffa007223a>] scsi_dh_handler_attach+0x2a/0x80 [scsi_dh]
 [<ffffffff803fde47>] notifier_call_chain+0x37/0x70
 [<ffffffff8006e68b>] __blocking_notifier_call_chain+0x5b/0x90
 [<ffffffff802cdee2>] device_add+0x2b2/0x4e0
 [<ffffffffa000f8a1>] scsi_sysfs_add_sdev+0xb1/0x310 [scsi_mod]
 [<ffffffffa000c888>] scsi_add_lun+0x518/0x530 [scsi_mod]
 [<ffffffffa000cd89>] scsi_probe_and_add_lun+0x1b9/0x480 [scsi_mod]
 [<ffffffffa000d326>] scsi_report_lun_scan+0x2d6/0x440 [scsi_mod]
 [<ffffffffa000dab6>] __scsi_scan_target+0xf6/0x1f0 [scsi_mod]
 [<ffffffffa000e0b1>] scsi_scan_target+0xd1/0xf0 [scsi_mod]
 [<ffffffffa05a712a>] fc_scsi_scan_rport+0xaa/0xb0 [scsi_transport_fc]
 [<ffffffff80060b78>] process_one_work+0x168/0x350
 [<ffffffff8006452a>] worker_thread+0x17a/0x480
 [<ffffffff80068126>] kthread+0x96/0xa0
 [<ffffffff80402894>] kernel_thread_helper+0x4/0x10
scan_work is stuck in submit_vpd_inquiry() because it is waiting for the
lower layer to complete the EVPD INQUIRY request. The I/O never completes
because the request queue has QUEUE_FLAG_STOPPED set, so __blk_run_queue()
simply returns without dispatching or ending the request. The stopped flag
is set by fc_remote_port_delete(), which the lpfc driver triggers when it
receives an RSCN event.

CPU1                                CPU2
--------------------------------------------------------------------
fc_remote_port_add <-- queue scan work
 fc_scsi_scan_rport
  scsi_scan_target
   __scsi_scan_target
    scsi_add_lun
     scsi_dh_handler_attach
      alua_initialize //alua mode
       alua_vpd_inquiry
        submit_vpd_inquiry
                                    fc_remote_port_delete
                                     scsi_target_block
                                      device_block
                                       scsi_internal_device_block
                                        blk_stop_queue
                                         queue_flag_set(QUEUE_FLAG_STOPPED, q);
         blk_execute_rq
          __blk_run_queue
           if(unlikely(blk_queue_stopped(q)))
               return; <<< never complete, wait forever
          wait_for_completion

We hope SUSE can fix this problem.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
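The interleaving described in the report can be sketched as a small user-space model. This is only an illustration under stated assumptions: every `model_*` name below is hypothetical and stands in for the kernel function named in its comment, there is no real locking or sleeping, and "hang" is represented by the request's `done` flag staying false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real structures. */
struct model_request {
    bool done;                      /* set when the "I/O" completes */
};

struct model_queue {
    bool stopped;                   /* models QUEUE_FLAG_STOPPED */
    struct model_request *pending;  /* at most one queued request */
};

/* Models __blk_run_queue: a stopped queue dispatches nothing. */
void model_run_queue(struct model_queue *q)
{
    if (q->stopped)
        return;                     /* request stays pending, never completed */
    if (q->pending != NULL) {
        q->pending->done = true;
        q->pending = NULL;
    }
}

/* Models blk_stop_queue as reached from fc_remote_port_delete (CPU2). */
void model_stop_queue(struct model_queue *q)
{
    q->stopped = true;
}

/* Models restarting the queue later; assumption: this reruns the queue. */
void model_start_queue(struct model_queue *q)
{
    q->stopped = false;
    model_run_queue(q);
}

/* Models blk_execute_rq up to the point where it would start waiting. */
void model_execute_rq(struct model_queue *q, struct model_request *rq)
{
    q->pending = rq;
    model_run_queue(q);
}

/*
 * Runs the scan path (CPU1). If stop_before_run is true, the queue is
 * stopped (CPU2) after the request is built but before it is executed.
 * Returns true if the request completed, false if the waiter would hang.
 */
bool model_scan(bool stop_before_run)
{
    struct model_queue q = { false, NULL };
    struct model_request rq = { false };

    if (stop_before_run)
        model_stop_queue(&q);
    model_execute_rq(&q, &rq);
    return rq.done;
}
```

In this model, `model_scan(false)` completes the request, while `model_scan(true)` leaves it pending. The second case corresponds to the reported state in which wait_for_completion has no timeout and nothing completes the request while the queue remains stopped.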
http://bugzilla.novell.com/show_bug.cgi?id=1007716

华为 华为 <liuteng.liu@huawei.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5 - None                   |P1 - Urgent
http://bugzilla.novell.com/show_bug.cgi?id=1007716

http://bugzilla.novell.com/show_bug.cgi?id=1007716#c1

--- Comment #1 from 华为 华为 <liuteng.liu@huawei.com> ---
Please reply to this report as soon as possible, thanks.