[Bug 757434] New: [NetApp 275071]SAS HBA hangs in the DMD phase for unexpected long period of time on SLES 11.2, which causes IO to terminate eventually
https://bugzilla.novell.com/show_bug.cgi?id=757434

https://bugzilla.novell.com/show_bug.cgi?id=757434#c0

Summary: [NetApp 275071]SAS HBA hangs in the DMD phase for unexpected long period of time on SLES 11.2, which causes IO to terminate eventually
Classification: openSUSE
Product: openSUSE 11.4
Version: RC 1
Platform: x86-64
OS/Version: Linux
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: yuan@netapp.com
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Created an attachment (id=486353)
 --> (http://bugzilla.novell.com/attachment.cgi?id=486353)
Host_Homer_LogMsg

User-Agent: Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.152 Safari/535.19

Description of problem:

During automated Controller Fail Drive Fail (CFDF) testing, after several iterations, the SAS 6G HBA hangs in the Device Missing Delay (DMD) phase for an unexpectedly long period of time. This eventually causes the IO to terminate. It only happens on the hosts that connect to the switch.

Note: The following are the steps for each CFDF iteration
(1) Place one controller offline
(2) Let IO run for 10 min
(3) Fail one drive
(4) Let IO run for 10 min
(5) Place the controller online
(6) Let IO run for 10 min
(7) Reconstruct the failed drive
(8) Let IO run for 10 min

The logs from the two fabric-connected hosts will be attached to this bug.
The problem on host "marge" starts around "Mar 20 23:07:05".
The problem on host "homer" starts around "Mar 21 00:32:49".
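The eight CFDF steps listed above can be laid out as a timeline, which helps when matching an iteration to a timestamp in the attached logs. The following is only a sketch and not the actual CFDF script: the names `cfdf_schedule`, `Step`, `ACTIONS` and `SOAK_MINUTES` are made up for illustration, and each action is assumed to take no time itself, so one iteration spans 40 minutes.

```python
from dataclasses import dataclass
from typing import List

# "Let IO run for 10 min" follows every action, per the report.
SOAK_MINUTES = 10

# The four actions of one CFDF iteration, in the order given in the report.
ACTIONS = [
    "place one controller offline",
    "fail one drive",
    "place the controller online",
    "reconstruct the failed drive",
]

# Assumption: actions are instantaneous, so only the soak periods add time.
ITERATION_MINUTES = len(ACTIONS) * SOAK_MINUTES


@dataclass
class Step:
    start_minute: int   # minutes since the start of the test run
    description: str


def cfdf_schedule(iterations: int) -> List[Step]:
    """Return the action and IO-soak steps for the given number of iterations."""
    steps: List[Step] = []
    minute = 0
    for i in range(1, iterations + 1):
        for action in ACTIONS:
            steps.append(Step(minute, f"iteration {i}: {action}"))
            steps.append(Step(minute, f"iteration {i}: let IO run for {SOAK_MINUTES} min"))
            minute += SOAK_MINUTES
    return steps


if __name__ == "__main__":
    for step in cfdf_schedule(1):
        print(f"t+{step.start_minute:3d} min  {step.description}")
```

This only prints a schedule; it does not touch a controller, drive, or HBA.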
Config Info:

Configuration (hosts x arrays): 4 x 2

Host Information:

Host Name: ictm-homer (fabric-connect)
  Operating System:
    OS Name: Linux (SuSE ES)
    OS Version: 11.2-EM64T
    OS Patch/Release: 3.0.13-0.27-default
    DMMP: 0.4.9-0.60.1
  HBA: SAS LSI 9200-8e
    BIOS: 07.23.01.00
    FIRMWARE: 12.00.00.00
    DRIVER: 09.100.00.00 (Inbox)

Host Name: ictm-marge (fabric-connect)
  Operating System:
    OS Name: Linux (SuSE ES)
    OS Version: 11.2-EM64T
    OS Patch/Release: 3.0.13-0.27-default
    DMMP: 0.4.9-0.60.1
  HBA: SAS LSI 9200-8e
    BIOS: 07.23.01.00
    FIRMWARE: 12.00.00.00
    DRIVER: 09.100.00.00 (Inbox)

Host Name: ictm-bart (direct-connect)
  Operating System:
    OS Name: Linux (SuSE ES)
    OS Version: 11.2-EM64T
    OS Patch/Release: 3.0.13-0.27-default
    DMMP: 0.4.9-0.60.1
  HBA: SAS LSI 9200-8e
    BIOS: 07.23.01.00
    FIRMWARE: 12.00.00.00
    DRIVER: 09.100.00.00 (Inbox)

Host Name: ictm-lisa (direct-connect)
  Operating System:
    OS Name: Linux (SuSE ES)
    OS Version: 11.2-EM64T
    OS Patch/Release: 3.0.13-0.27-default
    DMMP: 0.4.9-0.60.1
  HBA: SAS LSI 3801X (direct-connect)
    BIOS: 06.36.00.00
    FIRMWARE: 1.33.00.00
    DRIVER: 4.28.00.00suse (Inbox)

Array Information:

Array Name: 10.113.128.125-10.113.128.126
  Model: 2660
  Firmware: 07.83.02.00

Array Name: 10.113.128.127-10.113.128.128
  Model: 2660
  Firmware: 07.77.34.00

Switch:
  Module: LSI SAS 6160
  Firmware: 13.00.00.00

Reproducible: Always

Steps to Reproduce:
1. Build a 4 x 2 config with 4 LSI SAS 9200-8e HBAs and 2 LSI SAS 3801X HBAs. Two 6G SAS HBAs go through the switch; one 6G and one 3G HBA connect directly to storage array 1, and another 6G and another 3G HBA connect directly to storage array 2.
2. Install LifeKeeper cluster.
3. Start IO on all four hosts.
4. Start the CFDF script (the steps of each iteration are described above).

Actual Results:
The HBA hangs in the Device Missing Delay phase for an unexpectedly long period of time.

Expected Results:
The HBA should not hang. Once the failover timer expires, the IO should fail over to the remaining path.
-- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=757434

https://bugzilla.novell.com/show_bug.cgi?id=757434#c1

--- Comment #1 from Yuan Lanier <yuan@netapp.com> 2012-04-16 23:13:53 UTC ---
Created an attachment (id=486354)
 --> (http://bugzilla.novell.com/attachment.cgi?id=486354)
Host_Marge_LogMsg
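For anyone triaging the attached logs, a small filter can cut them down to the window around the reported onset times ("Mar 20 23:07:05" on marge, "Mar 21 00:32:49" on homer). This is a rough sketch under stated assumptions: the attachments are assumed to be plain syslog text whose lines begin with a 15-character "Mon DD HH:MM:SS" stamp, the year 2012 is assumed from the filing date because syslog stamps carry none, and `parse_stamp` and `lines_near` are illustrative names, not part of any existing tool.

```python
from datetime import datetime, timedelta

# Assumption: syslog stamps have no year; 2012 is taken from the bug's filing date.
ASSUMED_YEAR = 2012


def parse_stamp(line):
    """Parse a leading 'Mon DD HH:MM:SS' stamp; return None if the line has none."""
    try:
        return datetime.strptime(f"{ASSUMED_YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def lines_near(lines, onset, before_s=60, after_s=300):
    """Keep the lines stamped from before_s seconds before onset to after_s seconds after it."""
    center = parse_stamp(onset)
    if center is None:
        raise ValueError(f"cannot parse onset stamp: {onset!r}")
    lo = center - timedelta(seconds=before_s)
    hi = center + timedelta(seconds=after_s)
    kept = []
    for line in lines:
        ts = parse_stamp(line)
        # Lines without a parseable stamp are skipped rather than guessed at.
        if ts is not None and lo <= ts <= hi:
            kept.append(line)
    return kept
```

Assuming the attachment has been saved locally as `Host_Marge_LogMsg`, a call such as `lines_near(open("Host_Marge_LogMsg"), "Mar 20 23:07:05")` would return the lines from one minute before to five minutes after the reported onset. The window sizes are arbitrary defaults and can be widened as needed.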