https://bugzilla.novell.com/show_bug.cgi?id=651152
https://bugzilla.novell.com/show_bug.cgi?id=651152#c0
Summary: [LSI CR184859] SLES11.1 with an IBM QMI8142 FCoE HBA
crashes when injecting LIPs using the Fibre Channel
storage array.
Classification: openSUSE
Product: openSUSE 11.1
Version: Final
Platform: PowerPC-64
OS/Version: openSUSE 11.1
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: Kendal.Schwerdtfeger@lsi.com
QAContact: qa@suse.de
Found By: ---
Blocker: ---
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12 ( .NET CLR 3.5.30729)
We have an IBM blade server running SLES11.1 with an IBM QMI8142 (42C1830) FCoE
HBA installed and Device Mapper multipath (dm_multipath) as the failover driver.
The server crashes during overnight runs that inject Fibre Channel LIPs. A dump
file is available. The crash printout is below:
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in: iptable_filter ip_tables x_tables nfs fscache lockd nfs_acl auth_rpcgss sunrpc dm_round_robin dm_multipath af_packet ipv6 fuse loop dm_mod qla2xxx qlge ses scsi_transport_fc enclosure ehea(X) sg scsi_tgt ext3 jbd mbcache usbhid hid ohci_hcd ehci_hcd usbcore sd_mod crc_t10dif ipr(X) libata scsi_dh_rdac scsi_dh scsi_mod
Supported: Yes
NIP: d000000000b11e00 LR: d000000001ba3f34 CTR: d000000000b11df8
REGS: c00000006ded76f0 TRAP: 0300 Tainted: G X (2.6.32.12-0.7-ppc64)
MSR: 8000000000009032 CR: 28008084 XER: 00000020
DAR: 0000000100000058, DSISR: 0000000040000000
TASK = c00000006ace70e0[1126] 'qla2xxx_1_dpc' THREAD: c00000006ded4000 CPU: 3
GPR00: d000000001edc08c c00000006ded7970 d000000000b43c18 0000000100000000
GPR04: c00000006da13d00 000000000000ffff 0000000028000084 c0000000000130c4
GPR08: c00000006ded7a40 c0000000e697bdd0 0000000000000000 d000000000b11df8
GPR12: d000000001ba84c8 c000000000f62c00 0000000000002000 0000000000000002
GPR16: 0000000000000020 0000000000010000 0000000000000000 0000000000000800
GPR20: 0000000000000010 0000000000000100 0000000000000004 0000000000000000
GPR24: c00000006ded7d40 c00000006ded7d50 c0000000e672eb80 0000000000000000
GPR28: c00000006ded7d30 c00000006a5ab580 d000000000b41338 0000000100000000
NIP [d000000000b11e00] .scsi_is_host_device+0x8/0x28 [scsi_mod]
LR [d000000001ba3f34] .fc_remote_port_delete+0x3c/0x148 [scsi_transport_fc]
Call Trace:
[c00000006ded7970] [c00000006ded7a00] 0xc00000006ded7a00 (unreliable)
[c00000006ded7a00] [d000000001edc08c] .qla2x00_rport_del+0x8c/0xc0 [qla2xxx]
[c00000006ded7a90] [d000000001edc270] .qla2x00_reg_remote_port+0x40/0x268 [qla2xxx]
[c00000006ded7b50] [d000000001edc540] .qla2x00_update_fcport+0xa8/0x200 [qla2xxx]
[c00000006ded7c30] [d000000001ee3434] .qla2x00_async_login_done+0x17c/0x1e0 [qla2xxx]
[c00000006ded7cc0] [d000000001ed5928] .qla2x00_do_work+0x2a8/0x2c8 [qla2xxx]
[c00000006ded7dd0] [d000000001ed5b04] .qla2x00_do_dpc+0x1bc/0x5b8 [qla2xxx]
[c00000006ded7ed0] [c0000000000ccf4c] .kthread+0xb4/0xc0
[c00000006ded7f90] [c0000000000309fc] .kernel_thread+0x54/0x70
Instruction dump:
4bffffcc 60000000 8063fd6c a0040000 7c630278 7c630034 5463d97e 7c6307b4
4e800020 60000000 fbc1fff0 ebc28008 <e8630058> e81e8000 ebc1fff0 7c630278
scsi(1:3): Async-login complete - iop0=12.
ehea: eth2: Logical port down
ehea: eth2: Physical port up
ehea: External switch port is backup port
scsi(1:4): Async-login complete - iop0=12.
Sending IPI to other cpus...
Interrupt 552 (real) is invalid, disabling it.
Interrupt 552 (real) is invalid, disabling it.
radeonfb 0002:00:01.0: Invalid ROM contents
radeonfb 0002:00:01.0: Invalid ROM contents
doing fast boot
SysRq : Changing Loglevel
Loglevel set to 1
Creating device nodes with udev
mount: devpts already mounted or /dev/pts busy
mount: according to mtab, devpts is already mounted on /dev/pts
Boot logging started on /dev/hvc0(/dev/console) at Thu Oct 21 19:28:17 2010
blogd: can not write to fd 4: Input/output error
[NETWORK] using interface eth2
[NETWORK] using static config based on ip=135.15.91.79::135.15.88.1:255.255.252.0:kswa-z13r3c1-bc2s7:eth2:none
Waiting for device /dev/disk/by-id/scsi-3500000e0177b1190-part3 to appear: ok
Mounting root /dev/disk/by-id/scsi-3500000e0177b1190-part3
mount -o rw,acl,user_xattr -t ext3 /dev/disk/by-id/scsi-3500000e0177b1190-part3 /root
Saving dump using makedumpfile
-------------------------------------------------------------------------------
Copying data : [100 %]
The dumpfile is saved to /root/var/crash/2010-10-21-19:28/vmcore.
makedumpfile Completed.
-------------------------------------------------------------------------------
Generating README Finished.
Copying System.map Finished.
Copying kernel Finished.
INFO: Cannot find debug information: Unable to find debuginfo file.
Restarting system.
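The faulting access can be cross-checked against the register dump above. The
sketch below is not part of the original report. It assumes that the bracketed
word <e8630058> in the instruction dump is the faulting instruction and that the
fourth value on the GPR00 line is r3. It decodes the word as a PowerPC DS-form
load and recomputes the effective address, which should equal DAR.

```python
def decode_ds_form(word):
    """Split a 32-bit PowerPC DS-form instruction word into its fields."""
    opcode = (word >> 26) & 0x3F   # primary opcode (58 covers ld/ldu/lwa)
    rt = (word >> 21) & 0x1F       # target register
    ra = (word >> 16) & 0x1F       # base register
    disp = word & 0xFFFC           # DS field already shifted left by 2
    if disp & 0x8000:              # sign-extend the 16-bit displacement
        disp -= 0x10000
    xo = word & 0x3                # extended opcode (0 selects ld)
    return opcode, rt, ra, disp, xo


# Values copied from the oops text above.
FAULT_WORD = 0xE8630058            # <e8630058> in the instruction dump
GPR03 = 0x0000000100000000         # assumed to be r3 (fourth value on the GPR00 line)
DAR = 0x0000000100000058           # faulting data address reported by the kernel

opcode, rt, ra, disp, xo = decode_ds_form(FAULT_WORD)
is_ld = (opcode == 58 and xo == 0)
effective_address = (GPR03 + disp) & 0xFFFFFFFFFFFFFFFF

print(f"ld r{rt}, {disp}(r{ra}) -> EA {effective_address:#x}, "
      f"matches DAR: {effective_address == DAR}")
```

If the match holds, .scsi_is_host_device was reading offset 0x58 from a device
pointer of 0x100000000, which is not a valid kernel address. That would suggest
that .fc_remote_port_delete was passed an rport whose parent device pointer was
stale or corrupted, although the dump file is needed to confirm this.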
Reproducible: Always
Steps to Reproduce:
1. Connect IBM DS3950 and DS3500 Fibre Channel arrays to an IBM BladeCenter
FCoE switch (69Y1909).
2. Map 32 LUNs from each array to the host and start I/O.
3. Inject LIPs on the Fibre Channel target connections using the storage
array.
Actual Results:
SLES host crashes with: Oops: Kernel access of bad area, sig: 11 [#1]
Expected Results:
I/O runs without any errors.
model_name QMI8142
fw_version 5.03.05 (8d4)
driver_version 8.03.01.06.11.1-k8
optrom_bios_version 2.09
optrom_fw_version 5.03.05 2260
optrom_fcode_version 3.09
optrom_fw_version 5.03.05 2260
optrom_efi_version 3.33
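The attribute names above look like the per-host attributes that the qla2xxx
driver exposes in sysfs. A minimal sketch for collecting them on a comparison
system follows. It is an assumption that they live under
/sys/class/scsi_host/hostN/ on the system in question, and the helper names are
hypothetical.

```python
import os

# Attribute names taken from the listing above.
QLA_ATTRS = (
    "model_name",
    "fw_version",
    "driver_version",
    "optrom_bios_version",
    "optrom_fw_version",
    "optrom_fcode_version",
    "optrom_efi_version",
)


def read_host_attrs(host_dir, names=QLA_ATTRS):
    """Read each named attribute file in host_dir; missing ones map to None."""
    attrs = {}
    for name in names:
        try:
            with open(os.path.join(host_dir, name)) as f:
                attrs[name] = f.read().strip()
        except OSError:
            attrs[name] = None
    return attrs


def collect(base="/sys/class/scsi_host"):
    """Return {host: attrs} for every host directory that has a model_name."""
    result = {}
    if not os.path.isdir(base):
        return result
    for host in sorted(os.listdir(base)):
        attrs = read_host_attrs(os.path.join(base, host))
        if attrs.get("model_name"):
            result[host] = attrs
    return result


if __name__ == "__main__":
    for host, attrs in collect().items():
        print(host)
        for name, value in attrs.items():
            print(f"  {name} {value}")
```

Hosts without a model_name attribute (for example, non-QLogic adapters) are
skipped, so the output only lists adapters that report a model.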