[opensuse-kernel] aacraid driver crash when it shouldn't
Hi there, I'm facing an issue with kernel 3.11 from our kernel-stable repos.
The machine is an Intel motherboard with an i7 3770, 32 GB of RAM and an Adaptec 6805 adapter
(8 Intel SSD 510 series 120 GB drives attached in a RAID6 array).
The Adaptec + SSD RAID6 bundle worked for months (with openSUSE 12.1)
without any glitches.
Last night we got this kind of error:
Sep 15 02:06:52 clochette.disney.interne kernel: aacraid: Host adapter abort request (6,0,0,0)
Sep 15 02:06:52 clochette.disney.interne kernel: aacraid: Host adapter abort request (6,0,0,0)
Sep 15 02:06:52 clochette.disney.interne kernel: aacraid: Host adapter reset request. SCSI hang ?
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: [sde] Medium access timeout failure. Offlining disk!
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: Device offlined - not ready after error recovery
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: Device offlined - not ready after error recovery
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: [sde] Unhandled error code
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: [sde]
Sep 15 02:07:07 clochette.disney.interne kernel: Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: [sde] CDB:
Sep 15 02:07:07 clochette.disney.interne kernel: Write(10): 2a 00 02 b7 f4 00 00 00 20 00
Sep 15 02:07:07 clochette.disney.interne kernel: end_request: I/O error, dev sde, sector 45609984
Sep 15 02:07:07 clochette.disney.interne kernel: Buffer I/O error on device dm-4, logical block 5252224
Sep 15 02:07:07 clochette.disney.interne kernel: lost page write due to I/O error on dm-4
Sep 15 02:07:07 clochette.disney.interne kernel: Buffer I/O error on device dm-4, logical block 5252225
Sep 15 02:07:07 clochette.disney.interne kernel: lost page write due to I/O error on dm-4
Sep 15 02:07:07 clochette.disney.interne kernel: Buffer I/O error on device dm-4, logical block 5252226
Sep 15 02:07:07 clochette.disney.interne kernel: lost page write due to I/O error on dm-4
Sep 15 02:07:07 clochette.disney.interne kernel: Buffer I/O error on device dm-4, logical block 5252227
Sep 15 02:07:07 clochette.disney.interne kernel: lost page write due to I/O error on dm-4
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: [sde] Unhandled error code
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: [sde]
Sep 15 02:07:07 clochette.disney.interne kernel: Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: [sde] CDB:
Sep 15 02:07:07 clochette.disney.interne kernel: Write(10): 2a 00 31 3e f6 20 00 00 08 00
Sep 15 02:07:07 clochette.disney.interne kernel: end_request: I/O error, dev sde, sector 826209824
Sep 15 02:07:07 clochette.disney.interne kernel: Buffer I/O error on device dm-3, logical block 853188
Sep 15 02:07:07 clochette.disney.interne kernel: lost page write due to I/O error on dm-3
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: rejecting I/O to offline device
Sep 15 02:07:07 clochette.disney.interne kernel: Buffer I/O error on device dm-3, logical block 2981929
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: rejecting I/O to offline device
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: rejecting I/O to offline device
Sep 15 02:07:07 clochette.disney.interne kernel: Buffer I/O error on device dm-3, logical block 2981929
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: rejecting I/O to offline device
Sep 15 02:07:07 clochette.disney.interne kernel: Buffer I/O error on device dm-3, logical block 45451
Sep 15 02:07:07 clochette.disney.interne kernel: lost page write due to I/O error on dm-3
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: rejecting I/O to offline device
Sep 15 02:07:07 clochette.disney.interne kernel: Buffer I/O error on device dm-3, logical block 785908
Sep 15 02:07:07 clochette.disney.interne kernel: lost page write due to I/O error on dm-3
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: rejecting I/O to offline device
Sep 15 02:07:07 clochette.disney.interne kernel: Buffer I/O error on device dm-3, logical block 810442
Sep 15 02:07:07 clochette.disney.interne kernel: lost page write due to I/O error on dm-3
Sep 15 02:07:07 clochette.disney.interne kernel: sd 6:0:0:0: rejecting I/O to offline device
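For anyone cross-checking the log above: the failing sector printed by end_request can be recovered from the CDB bytes on the preceding line. The sketch below assumes the standard 10-byte READ(10)/WRITE(10) layout (opcode in byte 0, big-endian LBA in bytes 2-5, big-endian transfer length in bytes 7-8). The helper name decode_rw10 is hypothetical and only for illustration, not part of any kernel or Adaptec tool.

```python
def decode_rw10(cdb_hex: str) -> dict:
    """Decode a READ(10)/WRITE(10) CDB given as space-separated hex bytes.

    Hypothetical helper: it only handles the two opcodes seen in this log.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcodes = {0x28: "Read(10)", 0x2A: "Write(10)"}
    if cdb[0] not in opcodes:
        raise ValueError(f"unsupported opcode 0x{cdb[0]:02x}")
    return {
        "op": opcodes[cdb[0]],
        # Bytes 2..5: logical block address, big-endian
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7..8: transfer length in logical blocks, big-endian
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


if __name__ == "__main__":
    # The two Write(10) CDBs from the log above
    for cdb in ("2a 00 02 b7 f4 00 00 00 20 00",
                "2a 00 31 3e f6 20 00 00 08 00"):
        print(decode_rw10(cdb))
```

The decoded LBAs are 45609984 and 826209824, which match the two "end_request: I/O error, dev sde, sector ..." lines, so the reported sectors are simply the start addresses of the writes that timed out.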
But after checking the kernel changelog, I found this commit in the 3.10 series,
which should have fixed it?
Author: Mahesh Rajashekhara
Follow-up information:

After downgrading the Adaptec 6805 from 19109 to 19076 (the build used previously with the 3.1x kernel from 12.1),
we relaunched an I/O load on the adapter. After a while it crashed again, with another kind of error:

[ 6060.449557] kvm: zapping shadow pages for mmio generation wraparound
[ 6418.612835] kvm: zapping shadow pages for mmio generation wraparound
[ 7118.156866] aacraid: Host adapter abort request (6,0,0,0)
[ 7118.156893] aacraid: Host adapter abort request (6,0,0,0)
[ 7118.156913] aacraid: Host adapter abort request (6,0,0,0)
[ 7118.156933] aacraid: Host adapter abort request (6,0,0,0)
[ 7118.156954] aacraid: Host adapter abort request (6,0,0,0)
[ 7118.156989] aacraid: Host adapter reset request. SCSI hang ?
[ 7118.157013] AAC: Host adapter BLINK LED 0x5
[ 7118.157090] AAC0: adapter kernel panic'd 5.
[ 7417.922695] sd 6:0:0:0: Device offlined - not ready after error recovery
[ 7417.922699] sd 6:0:0:0: Device offlined - not ready after error recovery
[ 7417.922701] sd 6:0:0:0: Device offlined - not ready after error recovery
[ 7417.922703] sd 6:0:0:0: Device offlined - not ready after error recovery
[ 7417.922705] sd 6:0:0:0: Device offlined - not ready after error recovery
[ 7417.922710] sd 6:0:0:0: [sde] Unhandled error code
[ 7417.922712] sd 6:0:0:0: [sde]
[ 7417.922713] Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
[ 7417.922715] sd 6:0:0:0: [sde] CDB:
[ 7417.922716] Write(10): 2a 00 04 b6 9b c8 00 00 08 00
[ 7417.922723] end_request: I/O error, dev sde, sector 79076296
[ 7417.922760] Buffer I/O error on device dm-4, logical block 9435513
[ 7417.922794] lost page write due to I/O error on dm-4
[ 7417.922806] sd 6:0:0:0: rejecting I/O to offline device
[ 7417.922836] sd 6:0:0:0: [sde] killing request
[ 7417.922843] sd 6:0:0:0: [sde] Unhandled error code
[ 7417.922845] sd 6:0:0:0: [sde]
[ 7417.922846] Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK

Signature during boot:

[ 2.884308] Adaptec aacraid driver 1.2-0[30200]-ms
[ 2.884329] aacraid 0000:01:00.0: can't disable ASPM; OS doesn't have ASPM control
[ 3.338857] AAC0: kernel 5.2-0[19076] Apr 2 2012
[ 3.338870] AAC0: monitor 5.2-0[19076]
[ 3.338872] AAC0: bios 5.2-0[19076]
[ 3.338874] AAC0: serial 1B3011B4283
[ 3.338875] AAC0: Non-DASD support enabled.
[ 3.338876] AAC0: 64bit support enabled.
[ 3.338878] AAC0: 64 Bit DAC enabled
[ 3.353116] scsi6 : aacraid
[ 3.353380] scsi 6:0:0:0: Direct-Access Adaptec ARRAYR6 V1.0 PQ: 0 ANSI: 2
[ 3.353650] sd 6:0:0:0: [sde] 1404497920 512-byte logical blocks: (719 GB/669 GiB)
[ 3.353695] sd 6:0:0:0: [sde] Write Protect is off
[ 3.353709] sd 6:0:0:0: [sde] Mode Sense: 06 00 10 00
[ 3.353796] sd 6:0:0:0: [sde] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 3.361933] sde: sde1
[ 3.362263] sd 6:0:0:0: [sde] Attached SCSI removable disk
[ 3.369969] scsi 6:1:0:0: Direct-Access INTEL SSDSC2MH120A2 PPG4 PQ: 1 ANSI: 5
[ 3.370438] scsi 6:1:1:0: Direct-Access INTEL SSDSC2MH120A2 PPG4 PQ: 1 ANSI: 5
[ 3.370866] scsi 6:1:2:0: Direct-Access INTEL SSDSC2MH120A2 PPG4 PQ: 1 ANSI: 5
[ 3.371289] scsi 6:1:3:0: Direct-Access INTEL SSDSC2MH120A2 PPG4 PQ: 1 ANSI: 5
[ 3.371746] scsi 6:1:4:0: Direct-Access INTEL SSDSC2MH120A2 PPG4 PQ: 1 ANSI: 5
[ 3.372193] scsi 6:1:5:0: Direct-Access INTEL SSDSC2MH120A2 PPG4 PQ: 1 ANSI: 5
[ 3.372653] scsi 6:1:6:0: Direct-Access INTEL SSDSC2MH120A2 PPG4 PQ: 1 ANSI: 5
[ 3.373107] scsi 6:1:7:0: Direct-Access INTEL SSDSC2MH120A2 PPG4 PQ: 1 ANSI: 5
....
[ 7417.923203] lost page write due to I/O error on dm-4
[ 7417.923213] sd 6:0:0:0: [sde] Unhandled error code
[ 7417.923215] sd 6:0:0:0: [sde]
[ 7417.923216] Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
[ 7417.923217] sd 6:0:0:0: [sde] CDB:
[ 7417.923218] Write(10): 2a 00 04 ac e4 38 00 00 70 00
[ 7417.923223] end_request: I/O error, dev sde, sector 78439480
[ 7417.923281] sd 6:0:0:0: [sde] Unhandled error code
[ 7417.923283] sd 6:0:0:0: [sde]
[ 7417.923284] Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
[ 7417.923285] sd 6:0:0:0: [sde] CDB:
[ 7417.923286] Read(10): 28 00 40 a5 16 e8 00 00 08 00
[ 7417.923292] end_request: I/O error, dev sde, sector 1084561128
[ 7417.923345] sd 6:0:0:0: [sde] Unhandled error code
[ 7417.923346] sd 6:0:0:0: [sde]
[ 7417.923347] Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
[ 7417.923349] sd 6:0:0:0: [sde] CDB:
[ 7417.923349] Read(10): 28 00 32 ad 41 80 00 00 20 00
[ 7417.923354] end_request: I/O error, dev sde, sector 850215296
[ 7417.923404] sd 6:0:0:0: [sde] Unhandled error code
[ 7417.923405] sd 6:0:0:0: [sde]
[ 7417.923406] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 7417.923408] sd 6:0:0:0: [sde] CDB:
[ 7417.923408] Read(10): 28 00 00 36 cf 80 00 00 08 00
[ 7417.923413] end_request: I/O error, dev sde, sector 3592064

--
Bruno Friedmann
Ioda-Net Sàrl www.ioda-net.ch
openSUSE Member
GPG KEY : D5C9B751C4653227
irc: tigerfoot