Bug ID | 970249 |
---|---|
Summary | Kernel (evergreen) 3.12.53 + adaptec 6805 sas hba adapter = backtrace |
Classification | openSUSE |
Product | openSUSE 13.1 |
Version | Final |
Hardware | Other |
OS | Other |
Status | NEW |
Severity | Normal |
Priority | P5 - None |
Component | Kernel |
Assignee | kernel-maintainers@forge.provo.novell.com |
Reporter | bruno@ioda-net.ch |
QA Contact | qa-bugs@suse.de |
Found By | --- |
Blocker | --- |
As discussed on the mailing list during the Update Staging process, I'm opening this bug for reference.

For months I have been using the kernel prepared at http://download.opensuse.org/home:/mkubecek:/evergreen-13.1/openSUSE_13.1/ on the following machines:

- AMD FX8350 + nvidia blob as desktop (heavy 3D usage / KDE 4.x)
- AMD FX8350 as server with Adaptec RAID 8805
- AMD Opteron(tm) Processor 2431 + Adaptec RAID 5805, 8x 1 TB nearline, RAID 6
- Intel(R) Xeon(R) CPU E5-1620 v2 @ 3.70GHz + LSI MegaRaid 1 (2x 400G Intel SSD + 2x 2 TB HSGD)
- Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz + 8x 2 TB Seagate, software RAID with mdadm (web/mail/postgresql/kvm)
- AMD Athlon(tm) II X2 250 Processor (high-throughput firewall)
- AMD A10-5800K APU with Radeon(tm) HD Graphics + Adaptec 5800 with 8x 2 TB WDC (RAID 6) + 8x 4 TB Seagate (software RAID 6) + 10 Gbps Intel network card (heavy I/O: backup)

But on two machines that have an Adaptec 6805 RAID controller I'm getting a kernel backtrace; see the attachment for a captured screen (sorry, no remote serial log).

On one of them, this is the history of kernels used (for a while I was using kernel:standard before switching back to evergreen):
2014-04-17 22:24:57|kernel-default|3.11.6-4.1|x86_64|root@clochette.disney.interne|openSUSE-13.1-1.10
2014-04-17 23:13:36|kernel-default|3.11.10-7.1|x86_64||updates
2014-04-18 00:32:55|kernel-default|3.14.1-1.1.geafcebd|x86_64|root@clochette|kernel-stable
2014-05-06 18:17:35|kernel-default|3.14.2-1.1.g1474ea5|x86_64||kernel-stable
2014-05-20 18:30:49|kernel-default|3.11.10-11.1|x86_64||updates
2014-05-20 18:35:06|kernel-default|3.14.4-1.1.gbebeb6f|x86_64||kernel-stable
2014-06-06 17:15:46|kernel-default|3.14.4-2.1.g0de0f93|x86_64||kernel-stable
2014-07-01 17:38:11|kernel-default|3.15.2-1.1.gfb7c781|x86_64||kernel-stable
2014-07-01 17:41:17|kernel-default|3.11.10-17.2|x86_64||updates
2014-07-10 18:47:49|kernel-default|3.15.4-1.1.g2b59ae6|x86_64||kernel-stable
2014-07-29 18:28:29|kernel-default|3.15.6-2.1.gedc5ddf|x86_64||kernel-stable
2014-08-01 17:21:02|kernel-default|3.15.7-1.1.g972d9a6|x86_64||kernel-stable
2014-08-13 19:27:38|kernel-default|3.11.10-21.1|x86_64||updates
2014-08-13 19:28:49|kernel-default|3.15.8-2.1.g258e3b0|x86_64||kernel-stable
2014-09-12 17:56:36|kernel-default|3.16.2-1.1.gdcee397|x86_64||kernel-stable
2014-09-30 11:15:49|kernel-default|3.16.3-1.1.gd2bbe7f|x86_64||kernel-stable
2014-10-17 18:07:46|kernel-default|3.17.0-1.1.gc467423|x86_64||kernel-stable
2014-12-03 17:53:49|kernel-default|3.17.4-2.1.g2d23787|x86_64||kernel-stable
2015-01-06 17:54:51|kernel-default|3.18.1-1.1.g5f2f35e|x86_64|root@clochette|kernel-stable
2015-01-20 17:59:23|kernel-default|3.18.2-2.1.g88366a3|x86_64||kernel-stable
2015-02-03 17:44:26|kernel-default|3.18.5-1.1.gf378da4|x86_64||kernel-stable
2015-03-03 17:57:05|kernel-default|3.19.0-4.1.g7f0e735|x86_64||kernel-stable
2015-03-11 18:07:01|kernel-default|3.19.1-2.1.gc0946e9|x86_64||kernel-stable
2015-03-21 09:05:51|kernel-default|3.19.2-1.1.gf2f9797|x86_64||kernel-stable
2015-04-02 17:00:19|kernel-default|3.19.3-1.1.gf10e7fc|x86_64||kernel-stable
2015-04-17 19:04:27|kernel-default|3.19.4-1.1.g74c332b|x86_64||kernel-stable
2015-05-13 18:15:53|kernel-default|4.0.2-1.1.ga425d38|x86_64||kernel-stable
2015-06-02 18:18:37|kernel-default|4.0.4-4.1.gad54361|x86_64||kernel-stable
2015-06-16 18:13:00|kernel-default|4.0.5-2.1.g0e899eb|x86_64||kernel-stable
2015-07-14 17:59:47|kernel-default|4.1.1-2.1.gcac28b3|x86_64||kernel-stable
2015-07-29 13:56:16|kernel-default|4.1.3-5.1.ga0f869c|x86_64||kernel-stable
2015-08-11 11:26:26|kernel-default|4.1.4-1.1.ga37e14f|x86_64||kernel-stable
2015-08-15 10:53:25|kernel-default|4.1.5-2.1.g83fbd4e|x86_64||kernel-stable
2016-02-02 18:25:21|kernel-default|4.4.0-8.1.g9f68b90|x86_64|root@clochette|kernel-stable
2016-02-17 18:45:43|kernel-default|3.12.51-2.1|x86_64|root@clochette|kernel-evergreen
2016-02-17 19:37:15|kernel-default|3.11.10-34.2|x86_64|root@sysresccd|updates
2016-03-01 17:52:16|kernel-default|3.12.53-1.1|x86_64||kernel-evergreen

The highest version used was 4.4.0, and the first working version above 3.11 was 3.14.1.

This is how the arcconf tool sees the controller and system on a pure 3.11.10-34-default:

--------------------------------------------------------
Controller Version Information 6805
--------------------------------------------------------
BIOS                                     : 5.2-0 (19147)
Firmware                                 : 5.2-0 (19147)
Driver                                   : 1.2-0 (30200)
Boot Flash                               : 5.2-0 (19147)

On another machine, which has a different controller but works with 3.12.53:

--------------------------------------------------------
Controller Version Information 5805
--------------------------------------------------------
BIOS                                     : 5.2-0 (18948)
Firmware                                 : 5.2-0 (18948)
Driver                                   : 1.2-1 (40709)
Boot Flash                               : 5.2-0 (18948)

We saw the driver get an update from 1.2-0 to 1.2-1.

> I'm not really an expert in this area but it looks like an IRQ is
> received and handled before all the device data structures are set up
> properly (a pointer which is still null is dereferenced).
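The two version claims drawn from the install history above (highest version 4.4.0, first version above 3.11 being 3.14.1) can be checked mechanically. What follows is only a minimal sketch: it assumes the six-field `timestamp|package|version|arch|requester|repo` layout exactly as pasted in this report, uses a handful of the entries as sample data, and the helper names `upstream_version` and `parse_history` are hypothetical. It can only tell which kernels were installed and when, not whether a given kernel actually booted correctly.

```python
# Sample entries copied from the history in this report, using the report's
# own "timestamp|package|version|arch|requester|repo" layout (an assumption
# about the layout; the real log may carry more fields).
HISTORY = """\
2014-04-17 22:24:57|kernel-default|3.11.6-4.1|x86_64|root@clochette.disney.interne|openSUSE-13.1-1.10
2014-04-18 00:32:55|kernel-default|3.14.1-1.1.geafcebd|x86_64|root@clochette|kernel-stable
2016-02-02 18:25:21|kernel-default|4.4.0-8.1.g9f68b90|x86_64|root@clochette|kernel-stable
2016-02-17 18:45:43|kernel-default|3.12.51-2.1|x86_64|root@clochette|kernel-evergreen
2016-03-01 17:52:16|kernel-default|3.12.53-1.1|x86_64||kernel-evergreen
"""


def upstream_version(version):
    """Turn '3.14.1-1.1.geafcebd' into the comparable tuple (3, 14, 1)."""
    return tuple(int(part) for part in version.split("-")[0].split("."))


def parse_history(text):
    """Parse the pasted history into a list of dicts, skipping odd lines."""
    entries = []
    for line in text.splitlines():
        fields = line.split("|")
        if len(fields) != 6:
            continue  # not in the expected six-field layout
        timestamp, package, version, arch, requester, repo = fields
        entries.append({
            "timestamp": timestamp,
            "package": package,
            "version": version,
            "upstream": upstream_version(version),
            "repo": repo,
        })
    return entries


entries = parse_history(HISTORY)

# Highest upstream version ever installed.
highest = max(entries, key=lambda e: e["upstream"])

# Earliest installed kernel whose major.minor is above 3.11.
# The timestamps are ISO-like strings, so string comparison orders them.
newer_than_3_11 = [e for e in entries if e["upstream"][:2] > (3, 11)]
first_after_3_11 = min(newer_than_3_11, key=lambda e: e["timestamp"])

print("highest installed:", highest["version"], "from", highest["repo"])
print("first above 3.11: ", first_after_3_11["version"])
```

Feeding the full history in place of the sample only requires pasting it with one entry per line.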
>

The strangest part is that three of the systems on the list share almost every piece of hardware:

- same motherboard: Asus CROSSHAIR V FORMULA-Z, BIOS 2101 04/17/2014
- same RAM: TridentX - F3-2400C10D-8GTX - G.SKILL DDR3 Memory x4
- same CPU: AMD FX(tm)-8350 Eight-Core Processor

The main differences are:

- one has an 8805 and Intel PT1000 + Nvidia GeForce GTX 560 (with nvidia blob) (working)
- the two failing ones have a 6805 + Intel 10-Gigabit X540-AT2 + Nvidia GT218 (PCIe 1x) with nouveau

As the crash message really involves aacraid, that is how I deduced that the 6800 is the culprit in the stack.

We updated the firmware on the controller to the latest available at pmc/adaptec, and this is what is currently running:

Controllers Found: 2
----------------------------------------------------------------------
Controller Information
----------------------------------------------------------------------
   Controller Status                        : OK
   Channel Description                      : SAS/SATA
   Controller Model                         : Adaptec 6805
   Controller World Wide Name               : 50000D1104872180
   Controller Alarm                         : Enabled
   Temperature                              : 52 C/ 125 F (Normal)
   Installed Memory                         : 512 MB
   Global task priority                     : Low
   Performance Mode                         : Default/Dynamic
   Host Bus Type                            : PCIe
   Host Bus Speed                           : 5000 MHz
   Host Bus Link Width                      : 8 bit(s)/link(s)
   Stayawake Period                         : Disabled
   Spinup limit internal drives             : 0
   Spinup limit external drives             : 0
   Defunct Disk Drive Count                 : 0
   Logical Devices/Failed/Degraded          : 1/0/0
   NCQ Status                               : Enabled
   Statistics Data Collection Mode          : Enabled
   --------------------------------------------------------
   RAID Properties
   --------------------------------------------------------
   Copyback                                 : Disabled
   Automatic Failover                       : Enabled
   Background consistency check             : Enabled
   Background Consistency Check Period      : 30
   --------------------------------------------------------
   Controller Version Information
   --------------------------------------------------------
   BIOS                                     : 5.2-0 (19176)
   Firmware                                 : 5.2-0 (19176)
   Driver                                   : 1.2-0 (30200)
   Boot Flash                               : 5.2-0 (19176)
   SEEPROM (Load version/ Flash version)    : 2/ 8
   --------------------------------------------------------
   Controller ZMM Information
   --------------------------------------------------------
   Status                                   : ZMM Optimal
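
The driver difference noted earlier (1.2-0 reported on the 6805 machine, 1.2-1 on the 5805 machine that works with 3.12.53) can be pulled out of the arcconf output with a few lines of script. This is a rough sketch only: it assumes the `name : version (build)` layout seen in the version sections of this report, the two sample blocks are copied from the outputs quoted here, and `parse_versions` is a hypothetical helper, not something arcconf provides.

```python
import re

# Version sections copied from this report (6805 after the firmware update,
# 5805 from the machine that works with 3.12.53).
REPORT_6805 = """\
BIOS                                     : 5.2-0 (19176)
Firmware                                 : 5.2-0 (19176)
Driver                                   : 1.2-0 (30200)
Boot Flash                               : 5.2-0 (19176)
"""

REPORT_5805 = """\
BIOS                                     : 5.2-0 (18948)
Firmware                                 : 5.2-0 (18948)
Driver                                   : 1.2-1 (40709)
Boot Flash                               : 5.2-0 (18948)
"""

# Matches lines such as "Driver : 1.2-0 (30200)"; other lines are ignored.
VERSION_LINE = re.compile(
    r"^\s*(?P<key>[^:]+?)\s*:\s*(?P<version>\S+)\s*\((?P<build>\d+)\)\s*$"
)


def parse_versions(text):
    """Return {component: (version, build)} for every matching line."""
    versions = {}
    for line in text.splitlines():
        match = VERSION_LINE.match(line)
        if match:
            versions[match.group("key")] = (
                match.group("version"),
                int(match.group("build")),
            )
    return versions


v6805 = parse_versions(REPORT_6805)
v5805 = parse_versions(REPORT_5805)

# Print each component side by side so the driver gap stands out.
for component in v6805:
    if component in v5805:
        print(f"{component:12} 6805={v6805[component]} 5805={v5805[component]}")
```

Lines that do not end in a parenthesised build number, such as the SEEPROM or ZMM status lines, are skipped by design.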