https://bugzilla.novell.com/show_bug.cgi?id=890296

https://bugzilla.novell.com/show_bug.cgi?id=890296#c0

Summary: System loses access to storage after "iser_cma_handler:Unexpected RDMA CM event (15)" message
Classification: openSUSE
Product: openSUSE 13.1
Version: RC 1
Platform: x86-64
OS/Version: SLES 12
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Basesystem
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: garrett.marks@netapp.com
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0

A system using InfiniBand with iSER connections to storage will lose access to the storage after the following event is received on the host. This message will be logged in syslog:

2014-08-02T01:56:05.244447-05:00 ictb-yellowstone kernel: [23996.927553] iser: iser_cma_handler:Unexpected RDMA CM event (15)

After this I usually see I/Os time out, and the storage connections appear to be hung.

This issue was found and reported to Mellanox using the Mellanox OFED. Mellanox has fixed the iSER connection management logic to prevent this issue in their product. However, this issue is present in the code currently used by the SLES 12 inbox OFED/RDMA implementation.

Reproducible: Sometimes

Steps to Reproduce:
1. Connect a host to NetApp E-Series storage using InfiniBand with iSER.
2. Start I/O to LUNs mapped to the host.
3. Repeatedly reset controllers on the array, allowing time for each controller to come back online before resetting a controller again.

This issue occurs regularly, but not every time there are path changes. It seems to be dependent on timing.

Actual Results:
When the "Unexpected RDMA CM event (15)" message is logged, the host loses access to the storage.

Expected Results:
The host should not receive the unexpected RDMA CM event (15), or it should be able to handle this event properly.
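
While repeating the controller resets, it may help to scan syslog for the message quoted above so that a hit can be lined up with the I/O timeouts. The following is only a sketch: the function name and the regular expression are illustrative, and the line layout is assumed from the single sample line in this report.

```python
import re

# Assumed layout: "<timestamp> <host> kernel: ... iser_cma_handler:Unexpected RDMA CM event (<n>)".
# This is inferred from the one sample line in the report and may differ on other systems.
ISER_EVENT_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+kernel:.*"
    r"iser_cma_handler:Unexpected RDMA CM event \((?P<event>\d+)\)"
)


def find_unexpected_cm_events(lines):
    """Return a (timestamp, host, event_number) tuple for each matching line."""
    hits = []
    for line in lines:
        match = ISER_EVENT_RE.search(line)
        if match:
            hits.append(
                (match.group("timestamp"), match.group("host"), int(match.group("event")))
            )
    return hits


if __name__ == "__main__":
    sample = [
        "2014-08-02T01:56:05.244447-05:00 ictb-yellowstone kernel: "
        "[23996.927553] iser: iser_cma_handler:Unexpected RDMA CM event (15)",
    ]
    for hit in find_unexpected_cm_events(sample):
        print(hit)
```

The function takes any iterable of lines, so an open syslog file object can be passed in directly; the path to that file depends on the host's logging configuration.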