https://bugzilla.novell.com/show_bug.cgi?id=651152

https://bugzilla.novell.com/show_bug.cgi?id=651152#c0

Summary: [LSI CR184859] SLES11.1 with a IBM QMI8142 FCoE HBA crashes when injecting lips using the Fiber channel Storage array.
Classification: openSUSE
Product: openSUSE 11.1
Version: Final
Platform: PowerPC-64
OS/Version: openSUSE 11.1
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: Kendal.Schwerdtfeger@lsi.com
QAContact: qa@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12 ( .NET CLR 3.5.30729)

We have an IBM Blade server running SLES11.1 with an IBM QMI8142 (42C1830) FCoE HBA installed and Device Mapper as the failover driver. The server crashes during overnight runs that inject fiber channel LIPs. A dump file is available. The crash printout is below:

Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in: iptable_filter ip_tables x_tables nfs fscache lockd nfs_acl auth_rpcgss sunrpc dm_round_robin dm_multipath af_packet ipv6 fuse loop dm_mod qla2xxx qlge ses scsi_transport_fc enclosure ehea(X) sg scsi_tgt ext3 jbd mbcache usbhid hid ohci_hcd ehci_hcd usbcore sd_mod crc_t10dif ipr(X) libata scsi_dh_rdac scsi_dh scsi_mod
Supported: Yes
NIP: d000000000b11e00 LR: d000000001ba3f34 CTR: d000000000b11df8
REGS: c00000006ded76f0 TRAP: 0300 Tainted: G X (2.6.32.12-0.7-ppc64)
MSR: 8000000000009032 <EE,ME,IR,DR> CR: 28008084 XER: 00000020
DAR: 0000000100000058, DSISR: 0000000040000000
TASK = c00000006ace70e0[1126] 'qla2xxx_1_dpc' THREAD: c00000006ded4000 CPU: 3
GPR00: d000000001edc08c c00000006ded7970 d000000000b43c18 0000000100000000
GPR04: c00000006da13d00 000000000000ffff 0000000028000084 c0000000000130c4
GPR08: c00000006ded7a40 c0000000e697bdd0 0000000000000000 d000000000b11df8
GPR12: d000000001ba84c8 c000000000f62c00 0000000000002000 0000000000000002
GPR16: 0000000000000020 0000000000010000 0000000000000000 0000000000000800
GPR20: 0000000000000010 0000000000000100 0000000000000004 0000000000000000
GPR24: c00000006ded7d40 c00000006ded7d50 c0000000e672eb80 0000000000000000
GPR28: c00000006ded7d30 c00000006a5ab580 d000000000b41338 0000000100000000
NIP [d000000000b11e00] .scsi_is_host_device+0x8/0x28 [scsi_mod]
LR [d000000001ba3f34] .fc_remote_port_delete+0x3c/0x148 [scsi_transport_fc]
Call Trace:
[c00000006ded7970] [c00000006ded7a00] 0xc00000006ded7a00 (unreliable)
[c00000006ded7a00] [d000000001edc08c] .qla2x00_rport_del+0x8c/0xc0 [qla2xxx]
[c00000006ded7a90] [d000000001edc270] .qla2x00_reg_remote_port+0x40/0x268 [qla2xxx]
[c00000006ded7b50] [d000000001edc540] .qla2x00_update_fcport+0xa8/0x200 [qla2xxx]
[c00000006ded7c30] [d000000001ee3434] .qla2x00_async_login_done+0x17c/0x1e0 [qla2xxx]
[c00000006ded7cc0] [d000000001ed5928] .qla2x00_do_work+0x2a8/0x2c8 [qla2xxx]
[c00000006ded7dd0] [d000000001ed5b04] .qla2x00_do_dpc+0x1bc/0x5b8 [qla2xxx]
[c00000006ded7ed0] [c0000000000ccf4c] .kthread+0xb4/0xc0
[c00000006ded7f90] [c0000000000309fc] .kernel_thread+0x54/0x70
Instruction dump:
4bffffcc 60000000 8063fd6c a0040000 7c630278 7c630034 5463d97e 7c6307b4
4e800020 60000000 fbc1fff0 ebc28008 <e8630058> e81e8000 ebc1fff0 7c630278
scsi(1:3): Async-login complete - iop0=12.
ehea: eth2: Logical port down
ehea: eth2: Physical port up
ehea: External switch port is backup port
scsi(1:4): Async-login complete - iop0=12.
Sending IPI to other cpus...
Interrupt 552 (real) is invalid, disabling it.
Interrupt 552 (real) is invalid, disabling it.
radeonfb 0002:00:01.0: Invalid ROM contents
radeonfb 0002:00:01.0: Invalid ROM contents
doing fast boot
SysRq : Changing Loglevel
Loglevel set to 1
Creating device nodes with udev
mount: devpts already mounted or /dev/pts busy
mount: according to mtab, devpts is already mounted on /dev/pts
Boot logging started on /dev/hvc0(/dev/console) at Thu Oct 21 19:28:17 2010
blogd: can not write to fd 4: Input/output error
[NETWORK] using interface eth2
[NETWORK] using static config based on ip=135.15.91.79::135.15.88.1:255.255.252.0:kswa-z13r3c1-bc2s7:eth2:none
Waiting for device /dev/disk/by-id/scsi-3500000e0177b1190-part3 to appear: ok
Mounting root /dev/disk/by-id/scsi-3500000e0177b1190-part3
mount -o rw,acl,user_xattr -t ext3 /dev/disk/by-id/scsi-3500000e0177b1190-part3 /root
Saving dump using makedumpfile
-------------------------------------------------------------------------------
Copying data : [100 %]
The dumpfile is saved to /root/var/crash/2010-10-21-19:28/vmcore.
makedumpfile Completed.
-------------------------------------------------------------------------------
Generating README Finished.
Copying System.map Finished.
Copying kernel Finished.
INFO: Cannot find debug information: Unable to find debuginfo file.
Restarting system.

Reproducible: Always

Steps to Reproduce:
1. Connect IBM DS3950 and DS3500 fiber channel arrays to an IBM Blade Center FCoE switch 69Y1909.
2. Map 32 LUNs from each array to the host and start IOs.
3. Inject LIPs on the fiber channel target connections using the storage array.

Actual Results:
SLES host crashes with: Oops: Kernel access of bad area, sig: 11 [#1]

Expected Results:
IOs run without any errors.
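
As a cross-check of the crash printout above, the faulting instruction word marked in the instruction dump (<e8630058>) can be decoded by hand. The sketch below is not part of the original report. It assumes the standard PowerPC DS-form encoding (primary opcode 58, where the low two bits select ld, ldu or lwa) and takes the GPR03 and DAR values straight from the oops. The helper names are made up for illustration. If the decode is right, the instruction is a 64-bit load from offset 0x58 of the pointer in r3, and the computed address equals the reported DAR, which would be consistent with scsi_is_host_device being handed an invalid pointer.

```python
# Sketch: decode the faulting PowerPC instruction word from the oops and
# compare the computed effective address with the reported DAR.
# Values are copied from the crash printout; helper names are hypothetical.
FAULT_WORD = 0xE8630058        # <e8630058> in "Instruction dump"
GPR3 = 0x0000000100000000      # fourth value on the GPR00 line (r3)
DAR = 0x0000000100000058       # DAR in the oops header

# Extended opcodes for primary opcode 58 (DS-form loads).
DS_FORM_58 = {0: "ld", 1: "ldu", 2: "lwa"}


def decode_ds_form(word):
    """Split a 32-bit DS-form instruction word into its fields."""
    opcode = (word >> 26) & 0x3F   # primary opcode, top 6 bits
    rt = (word >> 21) & 0x1F       # target register
    ra = (word >> 16) & 0x1F       # base register
    disp = word & 0xFFFC           # displacement, low 2 bits masked off
    if disp & 0x8000:              # sign-extend the 16-bit displacement
        disp -= 0x10000
    xo = word & 0x3                # extended opcode
    return opcode, rt, ra, disp, xo


def effective_address(base, disp):
    """Base register value plus displacement, wrapped to 64 bits."""
    return (base + disp) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    opcode, rt, ra, disp, xo = decode_ds_form(FAULT_WORD)
    name = DS_FORM_58.get(xo, "?") if opcode == 58 else "?"
    print(f"{name} r{rt},{disp}(r{ra})")
    print(f"effective address: {effective_address(GPR3, disp):#018x}")
    print(f"reported DAR:      {DAR:#018x}")
```

Which structure field lives at offset 0x58 depends on the kernel build, so the debuginfo package for 2.6.32.12-0.7-ppc64 would be needed to confirm that part.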

HBA information:
model_name QMI8142
fw_version 5.03.05 (8d4)
driver_version 8.03.01.06.11.1-k8
optrom_bios_version 2.09
optrom_fw_version 5.03.05 2260
optrom_fcode_version 3.09
optrom_fw_version 5.03.05 2260
optrom_efi_version 3.33
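
The HBA attribute names listed above look like the per-host sysfs attributes exported by the qla2xxx driver. The sketch below is not part of the original report. It shows one way such a listing could be collected for a follow-up comment. The directory /sys/class/scsi_host/host1 is an assumption (the host number on the affected system may differ), and the function name read_hba_attrs is hypothetical.

```python
# Sketch: collect HBA version attributes from a scsi_host sysfs directory.
# The attribute names come from the report; the directory is an assumption.
from pathlib import Path

QLA_ATTRS = [
    "model_name",
    "fw_version",
    "driver_version",
    "optrom_bios_version",
    "optrom_fw_version",
    "optrom_fcode_version",
    "optrom_efi_version",
]


def read_hba_attrs(host_dir, names=QLA_ATTRS):
    """Return {attribute: value}, with None for attributes that cannot be read."""
    result = {}
    for name in names:
        try:
            result[name] = (Path(host_dir) / name).read_text().strip()
        except OSError:
            # Attribute missing or unreadable on this host.
            result[name] = None
    return result


if __name__ == "__main__":
    # Assumed path; adjust the host number for the system under test.
    for name, value in read_hba_attrs("/sys/class/scsi_host/host1").items():
        print(name, value if value is not None else "(not available)")
```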