https://bugzilla.novell.com/show_bug.cgi?id=818983

https://bugzilla.novell.com/show_bug.cgi?id=818983#c0

Summary: (NetApp CQ238573 & 273502) IPR device bus error during error inject testing
Classification: openSUSE
Product: openSUSE 11.4
Version: Factory
Platform: PowerPC-64
OS/Version: SLES 11
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: jaci@netapp.com
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Created an attachment (id=538400)
 --> (http://bugzilla.novell.com/attachment.cgi?id=538400)
contents of the original bugzilla entered by IBM

User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31

At the request of Hanns-Joachim Uhl, I am recreating bugzilla 88434 under NetApp. I have attached the logs and contents of the previous bugzilla. I will provide a brief summary of the problem and analysis from our failover team below. The issue occurs in SLES 10.4 and 11.X on the Power-PC64 platform. Let me know if there is anything I can provide to assist in the issue.

- Ian Jackson

============================================
From Babu Moger>

Here is the problem. The SCSI layer is trying to detect the devices, but the HBA is returning an error (DID_NO_CONNECT).

Jun 15 14:54:27 kswm-iop-bc8-s9 kernel: mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 4, phy 4, sas_addr 0x50080e51b5168000
Jun 15 14:54:27 kswm-iop-bc8-s9 kernel: 2:0:5:0: scsi scan: INQUIRY pass 1 length 36
Jun 15 14:54:27 kswm-iop-bc8-s9 kernel: 2:0:5:0: done SUCCESS 10000 2:0:5:0:
Jun 15 14:54:27 kswm-iop-bc8-s9 kernel: command: Inquiry: 12 00 00 00 24 00
Jun 15 14:54:27 kswm-iop-bc8-s9 kernel: scsi scan: INQUIRY failed with code 0x10000
Jun 15 14:54:27 kswm-iop-bc8-s9 kernel: target2:0:5: mptsas: ioc0: delete device: fw_channel 0, fw_id 4, phy 4, sas_addr 0x50080e51b5168000
Jun 15 14:54:27 kswm-iop-bc8-s9 kernel: target2:0:5: mptsas: ioc0: delete device: fw_channel 0, fw_id 4, phy 5, sas_addr 0x50080e51b5168000
Jun 15 14:54:27 kswm-iop-bc8-s9 kernel: target2:0:5: mptsas: ioc0: delete device: fw_channel 0, fw_id 4, phy 6, sas_addr 0x50080e51b5168000
Jun 15 14:54:27 kswm-iop-bc8-s9 kernel: target2:0:5: mptsas: ioc0: delete device: fw_channel 0, fw_id 4, phy 7, sas_addr 0x50080e51b5168000

This looks like an HBA problem. The command probably never reached the target.

Reproducible: Sometimes

Steps to Reproduce:
1. Start IO to multiple mapped LUNs in a multipath environment (MPP or DMMP using the RDAC handler).
2. Fail 1 of 2 paths to the mapped LUNs receiving IO.
3. Unfail the path, wait 10 minutes for the path to the LUNs to be rediscovered, then fail the alternate path.

Actual Results:
The first failed path is not rediscovered by the HBA, so both paths appear to be down, causing an IO error. This occurs about 90% of the time.

Expected Results:
The path should be discovered when it comes back online. When the alternate path is failed, the host should still have access to the mapped LUNs along the restored primary path.

It seems I am only allowed to attach one document at a time.
I will post follow-up comments with the additional logs/documents from the previous bugzilla.
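
For reference, the 0x10000 value reported by "scsi scan: INQUIRY failed with code 0x10000" in the log excerpt can be decoded by hand. The following is a minimal sketch, assuming the usual Linux SCSI midlayer layout of the 32-bit result word (status byte in bits 0-7, message byte in bits 8-15, host byte in bits 16-23, driver byte in bits 24-31). The function name decode_scsi_result and the partial host-byte table are illustrative only and are not kernel interfaces.

```python
# Illustrative decoder for a Linux SCSI midlayer result word.
# Assumed layout: bits 0-7 status byte, 8-15 message byte,
# 16-23 host byte, 24-31 driver byte.

# Partial table of host byte codes (hypothetical helper table,
# values taken from the kernel's DID_* definitions).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result word into its four bytes."""
    host = (result >> 16) & 0xFF
    return {
        "status_byte": result & 0xFF,
        "msg_byte": (result >> 8) & 0xFF,
        "host_byte": host,
        "driver_byte": (result >> 24) & 0xFF,
        "host_name": HOST_BYTE_NAMES.get(host, "unknown"),
    }


if __name__ == "__main__":
    # Value taken from the "INQUIRY failed with code 0x10000" log line.
    print(decode_scsi_result(0x10000))
```

Under these assumptions, only the host byte is non-zero for 0x10000 and it maps to DID_NO_CONNECT, which matches the analysis that the failure is reported by the HBA driver rather than by the target.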