[Bug 393675] New: 2.6.25.4-2-default kernel -> ata errors, system freeze
https://bugzilla.novell.com/show_bug.cgi?id=393675

Summary: 2.6.25.4-2-default kernel -> ata errors, system freeze
Product: openSUSE 11.0
Version: Factory
Platform: i686
OS/Version: openSUSE 11.0
Status: NEW
Severity: Major
Priority: P5 - None
Component: Kernel
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: roffermanns@sysgo.com
QAContact: qa@suse.de
Found By: ---

My laptop (Thinkpad T43p) freezes for 10-20 sec. whenever I switch from ac power to battery or vice versa. I see the following message in the kernel log, whenever this happens:

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: cmd ca/00:08:77:14:79/00:00:00:00:00/e2 tag 0 dma 4096 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: port is slow to respond, please be patient (Status 0xd0)
ata1: device not ready (errno=-16), forcing hardreset
ata1: soft resetting link
ata1.00: configured for UDMA/100
ata1: EH complete
sd 0:0:0:0: [sda] 117210240 512-byte hardware sectors (60012 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Bug 384150 has a similar message.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=393675
Andreas Jaeger
https://bugzilla.novell.com/show_bug.cgi?id=393675
User teheo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c1
Tejun Heo
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c2
Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c3
--- Comment #3 from Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c4
--- Comment #4 from Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c5
--- Comment #5 from Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c6
--- Comment #6 from Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User teheo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c7
Tejun Heo
https://bugzilla.novell.com/show_bug.cgi?id=393675
User trenn@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c8
--- Comment #8 from Thomas Renninger
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c9
--- Comment #9 from Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c10
--- Comment #10 from Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c11
--- Comment #11 from Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c12
--- Comment #12 from Dmitri Chubarov
Hmmm... Strange, so it's not a hardware problem. Does running "hdparm -B 255 /dev/sda" make any difference?
How do I run hdparm on a disk holding the root filesystem? When I do

hdparm -v -v -B 255 /dev/sda

I get the following Input/output error

/dev/sda:
 setting Advanced Power Management level to disabled
 HDIO_DRIVE_CMD failed: Input/output error
 IO_support = 0 (default) 16-bit)
 HDIO_GET_UNMASKINTR failed: Inappropriate ioctl for device
 HDIO_GET_DMA failed: Inappropriate ioctl for device
 HDIO_GET_KEEPSETTINGS failed: Inappropriate ioctl for device
 readonly = 0 (off)
 readahead = 256 (on)
 geometry = 14593/255/63, sectors = 234441648, start = 0
https://bugzilla.novell.com/show_bug.cgi?id=393675
User teheo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c13
--- Comment #13 from Tejun Heo
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c14
--- Comment #14 from Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c15
--- Comment #15 from Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User teheo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c16
--- Comment #16 from Tejun Heo
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c17
--- Comment #17 from Dmitri Chubarov
IRQs can be shared, so usually it should be okay. The overlapping region is SMBus controller, which should be okay too as long as only either one of the two interfaces is used (right Thomas?).
I have tried changing the "PnP OS" switch in the BIOS between "Yes" and "No" and booting with pci=routeirq, disabling the ieee1394 port but the IRQ allocation remains almost the same.
Does unloading all usb related modules make any difference?
It seems, there was no difference. In case I might be missing something, I paste lots of messages from syslog below.

# modprobe -r ehci_hcd uhci_hcd usbhid usbcore

Jul 31 19:15:21 clust08 kernel: ehci_hcd 0000:00:1d.7: remove, state 4
Jul 31 19:15:21 clust08 kernel: usb usb2: USB disconnect, address 1
Jul 31 19:15:21 clust08 kernel: ehci_hcd 0000:00:1d.7: USB bus 2 deregistered
Jul 31 19:15:21 clust08 kernel: uhci_hcd 0000:00:1d.3: remove, state 4
Jul 31 19:15:21 clust08 kernel: usb usb5: USB disconnect, address 1
Jul 31 19:15:21 clust08 kernel: uhci_hcd 0000:00:1d.3: USB bus 5 deregistered
Jul 31 19:15:21 clust08 kernel: uhci_hcd 0000:00:1d.2: remove, state 4
Jul 31 19:15:21 clust08 kernel: usb usb4: USB disconnect, address 1
Jul 31 19:15:21 clust08 kernel: uhci_hcd 0000:00:1d.2: USB bus 4 deregistered
Jul 31 19:15:21 clust08 kernel: uhci_hcd 0000:00:1d.1: remove, state 4
Jul 31 19:15:21 clust08 kernel: usb usb3: USB disconnect, address 1
Jul 31 19:15:21 clust08 kernel: uhci_hcd 0000:00:1d.1: USB bus 3 deregistered
Jul 31 19:15:21 clust08 kernel: uhci_hcd 0000:00:1d.0: remove, state 1
Jul 31 19:15:21 clust08 kernel: usb usb1: USB disconnect, address 1
Jul 31 19:15:21 clust08 kernel: usb 1-1: USB disconnect, address 2
Jul 31 19:15:21 clust08 kernel: usb 1-2: USB disconnect, address 3
Jul 31 19:15:21 clust08 kernel: uhci_hcd 0000:00:1d.0: USB bus 1 deregistered
Jul 31 19:15:21 clust08 kernel: usbcore: deregistering interface driver usbhid
Jul 31 19:15:21 clust08 kernel: usbcore: deregistering interface driver hiddev
Jul 31 19:15:21 clust08 kernel: usbcore: deregistering device driver usb
Jul 31 19:15:21 clust08 kernel: usbcore: deregistering interface driver usbfs
Jul 31 19:15:21 clust08 kernel: usbcore: deregistering interface driver hub
..
Jul 31 19:43:10 clust08 kernel: ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Jul 31 19:43:10 clust08 kernel: ata2.00: cmd 61/09:00:8f:3d:54/00:00:02:00:00/40 tag 0 ncq 4608 out
Jul 31 19:43:10 clust08 kernel:          res 40/00:00:00:4f:c2/00:00:00:4f:c2/00 Emask 0x4 (timeout)
Jul 31 19:43:10 clust08 kernel: ata2.00: status: { DRDY }
Jul 31 19:43:13 clust08 kernel: ata2: soft resetting link
Jul 31 19:43:19 clust08 kernel: ata2: port is slow to respond, please be patient (Status 0xd0)
Jul 31 19:43:23 clust08 kernel: ata2: softreset failed (device not ready)
Jul 31 19:43:23 clust08 kernel: ata2: hard resetting link
Jul 31 19:43:29 clust08 kernel: ata2: port is slow to respond, please be patient (Status 0x80)
Jul 31 19:43:33 clust08 kernel: ata2: COMRESET failed (errno=-16)
Jul 31 19:43:33 clust08 kernel: ata2: hard resetting link
Jul 31 19:43:34 clust08 kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jul 31 19:43:34 clust08 kernel: ata2.00: configured for UDMA/133
Jul 31 19:43:34 clust08 kernel: ata2: EH complete
Jul 31 19:43:34 clust08 kernel: sd 1:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
Jul 31 19:43:34 clust08 kernel: sd 1:0:0:0: [sda] Write Protect is off
Jul 31 19:43:34 clust08 kernel: sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
Jul 31 19:43:34 clust08 kernel: sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
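To put a number on how long each freeze lasts, the time between the "exception" line and the matching "EH complete" line can be measured from syslog. The following is only a rough sketch, not part of the original report: the function name `stall_durations`, the regular expression, and the `year` default (syslog stamps carry no year, 2008 is assumed here) are all illustrative, and it assumes English month names in the timestamps.

```python
import re
from datetime import datetime

# Matches syslog lines such as
# "Jul 31 19:43:10 clust08 kernel: ata2.00: exception Emask 0x0 ... frozen"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"(?P<port>ata\d+)(?:\.\d+)?:\s+(?P<msg>.*)$"
)


def stall_durations(lines, year=2008):
    """Yield (port, seconds) for each exception -> 'EH complete' pair.

    `year` is an assumption: classic syslog timestamps do not include one.
    """
    started = {}  # port -> time of the first unresolved exception
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        when = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
        port, msg = m["port"], m["msg"]
        if msg.startswith("exception "):
            started.setdefault(port, when)
        elif msg.startswith("EH complete") and port in started:
            yield port, (when - started.pop(port)).total_seconds()


# Three lines taken from the log above.
sample = [
    "Jul 31 19:43:10 clust08 kernel: ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen",
    "Jul 31 19:43:13 clust08 kernel: ata2: soft resetting link",
    "Jul 31 19:43:34 clust08 kernel: ata2: EH complete",
]
print(list(stall_durations(sample)))  # [('ata2', 24.0)]
```

For the event pasted above this gives a stall of 24 seconds between the exception and the end of error handling.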
https://bugzilla.novell.com/show_bug.cgi?id=393675
User trenn@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c18
--- Comment #18 from Thomas Renninger
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c19
--- Comment #19 from Dmitri Chubarov
https://bugzilla.novell.com/show_bug.cgi?id=393675
User trenn@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c20
--- Comment #20 from Thomas Renninger
SMBus now is on IRQ 0 ? You could try to move the driver to be sure it's not touched: mv /lib/modules/`uname -r`/kernel/drivers/i2c/busses/i2c-i801.ko /tmp and reboot. Just to be sure...
I will report again when I manage to update the firmware on the machine in question
This is a good idea
https://bugzilla.novell.com/show_bug.cgi?id=393675
User dchubarov@ict.nsc.ru added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c21
--- Comment #21 from Dmitri Chubarov
SMBus now is on IRQ 0 ? You could try to move the driver to be sure it's not touched: mv /lib/modules/`uname -r`/kernel/drivers/i2c/busses/i2c-i801.ko /tmp and reboot. Just to be sure...
That comes from lspci:
00:1f.3 SMBus [0c05]: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family)
SMBus Controller [8086:266a] (rev 03)
Subsystem: ASUSTeK Computer Inc. P5GD1-VW Mainboard [1043:80a6]
Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
I will report again when I manage to update the firmware on the machine in question
This is a good idea
It keeps coming. I have updated the BIOS from 1005 to 1012, yet it is still there. About 10 events in 15 hours, probably correlated with user activity.
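A quick way to check the event rate and whether it follows user activity would be to bucket the "exception" lines by hour. This is only a sketch under assumptions: the function name `exceptions_per_hour` and the regular expression are illustrative, and the sample reuses two lines from the syslog excerpt earlier in this bug rather than a full log.

```python
import re
from collections import Counter

# Captures "Mon DD HH" from a syslog line reporting a libata exception.
EXC_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d):\d\d:\d\d\s+\S+\s+kernel:\s+"
    r"ata\d+(?:\.\d+)?: exception "
)


def exceptions_per_hour(lines):
    """Count libata exception lines per 'Mon DD HH' bucket."""
    counts = Counter()
    for line in lines:
        m = EXC_RE.match(line.strip())
        if m:
            counts[m.group(1)] += 1
    return counts


# Two lines from the earlier syslog excerpt; only the first is an exception.
sample = [
    "Jul 31 19:43:10 clust08 kernel: ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen",
    "Jul 31 19:43:34 clust08 kernel: ata2: EH complete",
]
for hour, n in sorted(exceptions_per_hour(sample).items()):
    print(hour, n)
```

Feeding it the whole log (for example the lines of /var/log/messages, if that is where the kernel messages end up on the machine) would show whether the events cluster in the hours when someone is working on it.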
https://bugzilla.novell.com/show_bug.cgi?id=393675
User trenn@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c22
--- Comment #22 from Thomas Renninger
https://bugzilla.novell.com/show_bug.cgi?id=393675
User aj@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c23
Andreas Jaeger
https://bugzilla.novell.com/show_bug.cgi?id=393675
User rolf.offermanns@gmx.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=393675#c24
Rolf Offermanns