[Bug 598520] New: ahci driver fails to initialize controller on Dell Optiplex 960
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c0

           Summary: ahci driver fails to initialize controller on Dell Optiplex 960
    Classification: openSUSE
           Product: openSUSE 11.3
           Version: Milestone 5
          Platform: x86-64
        OS/Version: Other
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Kernel
        AssignedTo: teheo@novell.com
        ReportedBy: max@novell.com
         QAContact: qa@suse.de
          Found By: Development
           Blocker: ---

Reinhard Max <max@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
               Flag|                            |SHIP_STOPPER?

# dmesg
[   68.614136] ahci 0000:00:1f.2: version 3.0
[   68.614146] ahci 0000:00:1f.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[   68.614172] ahci 0000:00:1f.2: irq 33 for MSI/MSI-X
[   68.614179] ahci 0000:00:1f.2: forcing PORTS_IMPL to 0x1
[   69.620014] ahci 0000:00:1f.2: controller reset failed (0x80000001)
[   69.620027] ahci 0000:00:1f.2: PCI INT C disabled
[   69.620031] ahci: probe of 0000:00:1f.2 failed with error -5

# lspci -n -vv -s 00:1f.2
00:1f.2 0104: 8086:2822 (rev 02)
    Subsystem: 1028:0276
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin C routed to IRQ 32
    Region 0: I/O ports at fe00 [size=8]
    Region 1: I/O ports at fe10 [size=4]
    Region 2: I/O ports at fe20 [size=8]
    Region 3: I/O ports at fe30 [size=4]
    Region 4: I/O ports at fec0 [size=32]
    Region 5: Memory at ff970000 (32-bit, non-prefetchable) [size=2K]
    Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit-
        Address: fee0300c  Data: 41a9
    Capabilities: [70] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [a8] SATA HBA <?>
    Capabilities: [b0] PCI Advanced Features
        AFCap: TP+ FLR+
        AFCtrl: FLR-
        AFStatus: TP-
    Kernel driver in use: ahci

This was still working on 11.2, where the dmesg output looked like this:

[    4.047701] ahci 0000:00:1f.2: version 3.0
[    4.047711] alloc irq_desc for 18 on node 0
[    4.047712] alloc kstat_irqs on node 0
[    4.047716] ahci 0000:00:1f.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[    4.047737] alloc irq_desc for 32 on node 0
[    4.047738] alloc kstat_irqs on node 0
[    4.047744] ahci 0000:00:1f.2: irq 32 for MSI/MSI-X
[    4.047811] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x27 impl RAID mode
[    4.047814] ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio ems
[    4.047817] ahci 0000:00:1f.2: setting latency timer to 64
[    4.053053] scsi0 : ahci
[    4.053143] scsi1 : ahci
[    4.053197] scsi2 : ahci
[    4.053242] scsi3 : ahci
[    4.053282] scsi4 : ahci
[    4.053321] scsi5 : ahci

--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c1

Tejun Heo <teheo@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
       InfoProvider|                            |max@novell.com

--- Comment #1 from Tejun Heo <teheo@novell.com> 2010-04-22 10:20:37 UTC ---
Hmm... 11.2 still works, right? This is very weird. It looks like the driver is having a problem talking to the controller. The "forcing PORTS_IMPL to 0x1" message means that PORTS_IMPL doesn't agree with HOST_CAP. No recent Intel chips show such problems, and on 11.2 the two registers agree. Then the controller fails to clear reset after 1 sec. It looks like a problem in a lower layer, most likely PCI. The driver doesn't seem to be talking with the ahci controller at all.

Can you please post the full kernel log? Also, please attach the full output of "lspci -nnv".
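The two symptoms named in comment #1 can be read off the register values in the log. The following is only a minimal sketch, assuming the register layout from the AHCI specification (CAP bits 4:0 hold the number of ports minus one; GHC bit 31 is AHCI enable and bit 0 is the self-clearing HBA reset bit) and assuming the driver fabricates a port map from CAP when PORTS_IMPL reads back as zero. The helper names are hypothetical and are not the driver's own.

```python
# Sketch only: bit positions follow the AHCI specification as I understand it;
# function names are hypothetical and do not come from the kernel driver.
CAP_NP_MASK = 0x1F   # CAP bits 4:0: number of ports, minus one
GHC_HR = 1 << 0      # GHC bit 0: HBA reset, cleared by hardware when reset completes
GHC_IE = 1 << 1      # GHC bit 1: interrupt enable
GHC_AE = 1 << 31     # GHC bit 31: AHCI enable


def nr_ports(cap: int) -> int:
    """Number of ports advertised by the CAP register."""
    return (cap & CAP_NP_MASK) + 1


def fallback_port_map(cap: int) -> int:
    """Port bitmap derived from CAP alone (assumed fallback when PORTS_IMPL is 0)."""
    return (1 << nr_ports(cap)) - 1


def decode_ghc(ghc: int) -> dict:
    """Split a GHC value into the three flags relevant to a reset."""
    return {
        "AE": bool(ghc & GHC_AE),
        "IE": bool(ghc & GHC_IE),
        "HR": bool(ghc & GHC_HR),
    }


# A CAP that reads as all zeros yields a one-port map, i.e. 0x1.
print(hex(fallback_port_map(0)))
# The value from "controller reset failed (0x80000001)": reset bit still set.
print(decode_ghc(0x80000001))
```

Under these assumptions, a forced map of 0x1 is what an all-zero CAP would produce, and 0x80000001 has the reset bit still set, which fits the reading that the driver is not really reaching the controller.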
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c2

Reinhard Max <max@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
       InfoProvider|max@novell.com              |

--- Comment #2 from Reinhard Max <max@novell.com> 2010-04-22 13:09:13 CEST ---
Created an attachment (id=356197)
 --> (http://bugzilla.novell.com/attachment.cgi?id=356197)
/var/log/boot.msg after booting into the Milestone5 install system

BTW, hare wondered about the fact that the controller uses PCI INT C rather than A.

(In reply to comment #1)
> Hmm... 11.2 still works, right?

Yes, even 11.3 Milestone4 still works, but there the output of the driver already looks a bit suspicious:

[    2.738945] ahci 0000:00:1f.2: version 3.0
[    2.738957] ahci 0000:00:1f.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[    2.738991] alloc irq_desc for 32 on node -1
[    2.738993] alloc kstat_irqs on node -1
[    2.739005] ahci 0000:00:1f.2: irq 32 for MSI/MSI-X
[    2.739055] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x27 impl SATA mode
[    2.739057] ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio ems sxs
[    2.739060] ahci 0000:00:1f.2: setting latency timer to 64
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c3

--- Comment #3 from Reinhard Max <max@novell.com> 2010-04-22 13:19:18 CEST ---
Created an attachment (id=356199)
 --> (http://bugzilla.novell.com/attachment.cgi?id=356199)
lspci output taken from the Milestone5 install system
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c4

Tejun Heo <teheo@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
       InfoProvider|                            |max@novell.com

--- Comment #4 from Tejun Heo <teheo@novell.com> 2010-04-22 11:39:44 UTC ---
Hmm... the log from m4 seems fine as far as ahci is concerned, and I don't think there have been any relevant ahci changes which could have caused something like this. As for using INT C, that's probably because the PCI device hosts several functions and each function is assigned to a different INTx. Can you please also attach boot.msg from m4?

Thanks.
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c5

Reinhard Max <max@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
       InfoProvider|max@novell.com              |

--- Comment #5 from Reinhard Max <max@novell.com> 2010-04-22 15:11:13 CEST ---
Created an attachment (id=356228)
 --> (http://bugzilla.novell.com/attachment.cgi?id=356228)
/var/log/boot.msg from an installed Milestone4

(In reply to comment #4)
> Hmm... the log from m4 seems fine as far as ahci is concerned

So, the "node -1" appearing twice in there is OK?
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c6

--- Comment #6 from Tejun Heo <teheo@novell.com> 2010-04-22 14:29:09 UTC ---
I have no idea about the node -1 from the irq subsys (it probably indicates that the irq doesn't have a preferred NUMA node associated). At any rate, I don't think it's likely to be related to the failure seen on m5.

Hmm... there have been a lot of differences in the PCI subsystem. I think it's very likely that the problem is coming from PCI. The bridges don't seem to be being set up correctly.

<3>pci 0000:00:1f.2: no compatible bridge window for [mem 0xff970000-0xff9707ff]
<4>Expanded resource reserved due to conflict with PCI Bus 0000:00
<7>reserve RAM buffer: 000000000009e400 - 000000000009ffff
<7>reserve RAM buffer: 00000000cfdffc00 - 00000000cfffffff

I think it probably is something which should be forwarded upstream. Who should I be bugging for PCI issues?
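To illustrate what the "no compatible bridge window" line is complaining about: a device's memory resource has to fall entirely inside one of the address windows its upstream bridge forwards. The sketch below is a hedged illustration of that containment test only, not the kernel's actual logic. The resource range is taken from the log line quoted in comment #6; the window list is hypothetical, since the real windows are only in the attached boot.msg.

```python
from typing import Iterable, NamedTuple, Optional


class Range(NamedTuple):
    start: int
    end: int  # inclusive, the way the kernel prints resources


def contains(window: Range, res: Range) -> bool:
    """True if the resource lies entirely inside the window."""
    return window.start <= res.start and res.end <= window.end


def find_window(windows: Iterable[Range], res: Range) -> Optional[Range]:
    """Return the first window that fully contains the resource, else None."""
    return next((w for w in windows if contains(w, res)), None)


# Resource from the log: [mem 0xff970000-0xff9707ff]
ahci_bar5 = Range(0xFF970000, 0xFF9707FF)
size = ahci_bar5.end - ahci_bar5.start + 1  # 0x800 bytes, the 2K seen in lspci

# HYPOTHETICAL windows, chosen only so that none of them covers the resource.
windows = [Range(0x000A0000, 0x000BFFFF), Range(0xD0000000, 0xDFFFFFFF)]

print(size, find_window(windows, ahci_bar5))
```

With windows like these, the lookup finds nothing, which is the situation the log message describes.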
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c7

Reinhard Max <max@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
       InfoProvider|                            |gregkh@novell.com

--- Comment #7 from Reinhard Max <max@novell.com> 2010-04-22 16:47:37 CEST ---
(In reply to comment #6)
> Who should I be bugging for PCI issues?

Maybe Greg can tell us.
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c8

Greg Kroah-Hartman <gregkh@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
                 CC|                            |gregkh@novell.com
       InfoProvider|gregkh@novell.com           |

--- Comment #8 from Greg Kroah-Hartman <gregkh@novell.com> 2010-04-22 15:09:48 UTC ---
Send the report to the linux-pci@vger.kernel.org mailing list; I have not been the upstream PCI maintainer for over a year now.
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c11

Reinhard Max <max@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
       InfoProvider|max@novell.com              |

--- Comment #11 from Reinhard Max <max@novell.com> 2010-04-28 13:01:07 CEST ---
Created an attachment (id=357357)
 --> (http://bugzilla.novell.com/attachment.cgi?id=357357)
Screenshot from booting 2.6.34-rc5-6.99.5.57551bd-desktop

KOTD installed into Milestone4 still fails, but with slightly different numbers and one additional line in the ahci messages.
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c12

--- Comment #12 from Tejun Heo <teheo@novell.com> 2010-04-28 14:05:48 UTC ---
Hmm... yeah, slightly different, but it looks basically like the same failure. I'll report upstream. Thanks for testing.
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c13

--- Comment #13 from Bjorn Helgaas <bjorn.helgaas@hp.com> 2010-04-28 15:53:19 UTC ---
Created an attachment (id=357467)
 --> (http://bugzilla.novell.com/attachment.cgi?id=357467)
proposed fix - avoid PCI allocations below 1MB

This is the same problem as https://bugzilla.kernel.org/show_bug.cgi?id=15744

The patch I'm attaching is in Jesse's for-linus tree and is planned for inclusion in 2.6.34. Please let me know if this patch doesn't fix the problem.
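The attachment title in comment #13 describes the idea of the fix: never place a PCI resource below 1MB. The following is only a rough sketch of that idea, not the attached patch; the constant name, the function names, and the example windows are all assumptions made for illustration.

```python
from typing import Optional

PCI_MIN_START = 0x100000  # 1MB floor; assumed name, not taken from the patch


def align_up(addr: int, align: int) -> int:
    """Round addr up to a multiple of align (align must be a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def pick_start(window_start: int, window_end: int,
               size: int, align: int) -> Optional[int]:
    """Lowest aligned start inside [window_start, window_end] that is >= 1MB.

    Returns None when the resource cannot fit above the floor.
    """
    start = align_up(max(window_start, PCI_MIN_START), align)
    if start + size - 1 > window_end:
        return None
    return start


# Hypothetical windows: one that straddles 1MB and one that ends below it.
print(pick_start(0x000A0000, 0x001FFFFF, 0x800, 0x800))
print(pick_start(0x000A0000, 0x000BFFFF, 0x800, 0x800))
```

In this sketch a window that ends below 1MB yields no allocation at all, so the allocator would have to look for space in another window.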
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c14

--- Comment #14 from Reinhard Max <max@novell.com> 2010-05-10 17:04:35 CEST ---
The latest kernel from Factory (2.6.34-rc6-7-desktop) seems to include this patch already and initializes the controller just fine when installed into Milestone 4.
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c15

Stephan Kulow <coolo@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
                 CC|                            |coolo@novell.com
       InfoProvider|                            |max@novell.com

--- Comment #15 from Stephan Kulow <coolo@novell.com> 2010-06-17 11:16:31 CEST ---
so fixed?
http://bugzilla.novell.com/show_bug.cgi?id=598520
http://bugzilla.novell.com/show_bug.cgi?id=598520#c16

Reinhard Max <max@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
       InfoProvider|max@novell.com              |
         Resolution|                            |FIXED

--- Comment #16 from Reinhard Max <max@novell.com> 2010-06-18 17:42:38 CEST ---
Yes, Milestone7 works fine on the Optiplex 960.