Mailinglist Archive: opensuse-bugs (9018 mails)

[Bug 427264] New: ata problems (exceptions?), I suspect that it sometime even corrupts data on disks
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Thu, 18 Sep 2008 02:25:13 -0600 (MDT)
  • Message-id: <bug-427264-21960@xxxxxxxxxxxxxxxxxxxxxxxxx/>
https://bugzilla.novell.com/show_bug.cgi?id=427264

User rodo@xxxxxxxxxx added comment
https://bugzilla.novell.com/show_bug.cgi?id=427264#c1

Summary: ata problems (exceptions?), I suspect that it sometime
even corrupts data on disks
Product: openSUSE 11.0
Version: Final
Platform: x86-64
OS/Version: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: bnc-team-screening@xxxxxxxxxxxxxxxxxxxxxx
ReportedBy: rodo@xxxxxxxxxx
QAContact: qa@xxxxxxx
Found By: ---


I see these in my dmesg output:

ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata3.00: cmd b0/d8:00:01:4f:c2/00:00:00:00:00/00 tag 0
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3: soft resetting link
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: configured for UDMA/133
ata3: EH complete
sd 2:0:0:0: [sde] 976773168 512-byte hardware sectors (500108 MB)
sd 2:0:0:0: [sde] Write Protect is off
sd 2:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata4.00: cmd b0/d8:00:01:4f:c2/00:00:00:00:00/00 tag 0
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4.00: status: { DRDY }
ata4: soft resetting link
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: configured for UDMA/133
ata4: EH complete
sd 3:0:0:0: [sdf] 976773168 512-byte hardware sectors (500108 MB)
sd 3:0:0:0: [sdf] Write Protect is off
sd 3:0:0:0: [sdf] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

/var/log/messages shows this:

Sep 18 10:03:22 rychlik smartd[9902]: Device: /dev/sde, opened
Sep 18 10:03:22 rychlik smartd[9902]: Device /dev/sde: using '-d sat' for ATA disk behind SAT layer.
Sep 18 10:03:22 rychlik smartd[9902]: Device: /dev/sde, opened
Sep 18 10:03:22 rychlik smartd[9902]: Device: /dev/sde, found in smartd database.
Sep 18 10:03:28 rychlik kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 18 10:03:28 rychlik kernel: ata3.00: cmd b0/d8:00:01:4f:c2/00:00:00:00:00/00 tag 0
Sep 18 10:03:28 rychlik kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 18 10:03:28 rychlik kernel: ata3.00: status: { DRDY }
Sep 18 10:03:29 rychlik kernel: ata3: soft resetting link
Sep 18 10:03:30 rychlik kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 18 10:03:30 rychlik smartd[9902]: Device: /dev/sde, could not enable SMART capability
Sep 18 10:03:30 rychlik smartd[9902]: Device: /dev/sdf, opened
Sep 18 10:03:30 rychlik kernel: ata3.00: configured for UDMA/133
Sep 18 10:03:30 rychlik kernel: ata3: EH complete
Sep 18 10:03:30 rychlik kernel: sd 2:0:0:0: [sde] 976773168 512-byte hardware sectors (500108 MB)
Sep 18 10:03:30 rychlik kernel: sd 2:0:0:0: [sde] Write Protect is off
Sep 18 10:03:30 rychlik kernel: sd 2:0:0:0: [sde] Mode Sense: 00 3a 00 00
Sep 18 10:03:30 rychlik kernel: sd 2:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Sep 18 10:03:30 rychlik smartd[9902]: Device /dev/sdf: using '-d sat' for ATA disk behind SAT layer.
Sep 18 10:03:30 rychlik smartd[9902]: Device: /dev/sdf, opened
Sep 18 10:03:30 rychlik smartd[9902]: Device: /dev/sdf, found in smartd database.
Sep 18 10:03:36 rychlik kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 18 10:03:36 rychlik kernel: ata4.00: cmd b0/d8:00:01:4f:c2/00:00:00:00:00/00 tag 0
Sep 18 10:03:36 rychlik kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 18 10:03:36 rychlik kernel: ata4.00: status: { DRDY }
Sep 18 10:03:38 rychlik kernel: ata4: soft resetting link
Sep 18 10:03:38 rychlik kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 18 10:03:39 rychlik smartd[9902]: Device: /dev/sdf, could not enable SMART capability
Sep 18 10:03:39 rychlik smartd[9902]: Monitoring 0 ATA and 4 SCSI devices
Sep 18 10:03:39 rychlik kernel: ata4.00: configured for UDMA/133
Sep 18 10:03:39 rychlik kernel: ata4: EH complete
Sep 18 10:03:39 rychlik kernel: sd 3:0:0:0: [sdf] 976773168 512-byte hardware sectors (500108 MB)
Sep 18 10:03:39 rychlik kernel: sd 3:0:0:0: [sdf] Write Protect is off
Sep 18 10:03:39 rychlik kernel: sd 3:0:0:0: [sdf] Mode Sense: 00 3a 00 00
Sep 18 10:03:39 rychlik kernel: sd 3:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

This one seems to be triggered by starting smartd when sde and sdf were in
standby mode.
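
The "cmd" line in the exception can be decoded to check the smartd connection. The sketch below assumes the field order libata prints on that line (command/feature:nsect:lbal:lbam:lbah, then the high-order bytes and the device register). Under that assumption, b0/d8 with 4f:c2 in the LBA mid/high bytes would be the ATA SMART ENABLE OPERATIONS command, which fits the "could not enable SMART capability" message from smartd. The function and table names are made up for this illustration.

```python
import re

# Assumed layout of the libata "cmd" line:
#   command/feature:nsect:lbal:lbam:lbah/<high-order bytes>/device
CMD_RE = re.compile(
    r"cmd (?P<command>[0-9a-f]{2})/(?P<feature>[0-9a-f]{2}):"
    r"(?P<nsect>[0-9a-f]{2}):(?P<lbal>[0-9a-f]{2}):"
    r"(?P<lbam>[0-9a-f]{2}):(?P<lbah>[0-9a-f]{2})/"
)

SMART_COMMAND = 0xB0  # ATA SMART command opcode
SMART_FEATURES = {    # a few SMART subcommands, selected by the feature register
    0xD0: "SMART READ DATA",
    0xD8: "SMART ENABLE OPERATIONS",
    0xD9: "SMART DISABLE OPERATIONS",
    0xDA: "SMART RETURN STATUS",
}


def decode_cmd_line(line):
    """Decode the taskfile bytes of a libata 'cmd' log line, or return None."""
    match = CMD_RE.search(line)
    if match is None:
        return None
    fields = {key: int(value, 16) for key, value in match.groupdict().items()}
    if fields["command"] == SMART_COMMAND:
        # SMART commands carry the signature 0x4f/0xc2 in LBA mid/high.
        fields["smart_signature_ok"] = (fields["lbam"], fields["lbah"]) == (0x4F, 0xC2)
        fields["name"] = SMART_FEATURES.get(fields["feature"], "SMART (other subcommand)")
    else:
        fields["name"] = "not a SMART command"
    return fields


if __name__ == "__main__":
    info = decode_cmd_line("ata3.00: cmd b0/d8:00:01:4f:c2/00:00:00:00:00/00 tag 0")
    print(info["name"])
```

This only identifies which command timed out; it does not explain why the drives failed to answer it while in standby.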

It also happens without smartd, for example when unpacking a few gigabytes of
OpenOffice sources, after which I find some C source files containing random
binary data. (I was overclocking my CPU at that time, but the logs in this bug
report are from a system that is not overclocked. I will continue to run
without overclocking to see whether the data corruption still occurs.)
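
A rough way to look for this kind of corruption after unpacking is to scan the tree for source files that contain NUL bytes, which plain C source should not. This is only a sketch: the extension list is an assumption, and corruption that happens to consist of printable bytes would go unnoticed. Comparing checksums against a known good copy would be more reliable.

```python
import os


def looks_binary(path, chunk_size=65536):
    """Return True if the file contains a NUL byte anywhere."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return False
            if b"\x00" in chunk:
                return True


def find_suspect_sources(root, extensions=(".c", ".h", ".cxx", ".hxx")):
    """Walk root and list source files that look like they hold binary data."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                if looks_binary(path):
                    suspects.append(path)
    return sorted(suspects)


if __name__ == "__main__":
    # "." is a placeholder; point this at the unpacked source tree.
    for suspect in find_suspect_sources("."):
        print(suspect)
```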

I searched for the problem and found other people reporting similar errors,
but no solution. Let me know if you need more info.

rychlik:/home/rodo/svn/ooo-build-reference # uname -a
Linux rychlik 2.6.25.16-0.1-default #1 SMP 2008-08-21 00:34:25 +0200 x86_64 x86_64 x86_64 GNU/Linux

This bug might also be a duplicate of #409484 or #359333, but I am not sure.

One more thing that might be related is the weird behavior of eth0, which shows
a lot of dropped packets, with the count increasing rapidly over time:

eth0 Link encap:Ethernet HWaddr 00:E0:4D:96:DD:A0
inet addr:192.168.2.36 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: fe80::2e0:4dff:fe96:dda0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:88101 errors:0 dropped:734521700125 overruns:0 frame:0
TX packets:54394 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:116761891 (111.3 Mb) TX bytes:5470246 (5.2 Mb)
Interrupt:251

~30 secs later:

eth0 Link encap:Ethernet HWaddr 00:E0:4D:96:DD:A0
inet addr:192.168.2.36 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: fe80::2e0:4dff:fe96:dda0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:88196 errors:0 dropped:738364737052 overruns:0 frame:0
TX packets:54502 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:116803324 (111.3 Mb) TX bytes:5479408 (5.2 Mb)
Interrupt:251
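
To put a number on the two samples, the RX counters can be pulled out and subtracted. The sketch below does that for the two RX lines above, taking the interval as the roughly 30 seconds stated in the report. The dropped counter grows by about 3.8 billion while only 95 packets are received, which may point to a miscounting driver rather than real packet loss, but that reading is an assumption.

```python
import re

RX_RE = re.compile(r"RX packets:(\d+) errors:(\d+) dropped:(\d+)")


def rx_counters(ifconfig_text):
    """Extract the RX packet, error and dropped counters from ifconfig output."""
    match = RX_RE.search(ifconfig_text)
    if match is None:
        raise ValueError("no RX counter line found")
    packets, errors, dropped = (int(group) for group in match.groups())
    return {"packets": packets, "errors": errors, "dropped": dropped}


def rx_delta(before, after, seconds):
    """Return how much the RX counters grew between two ifconfig samples."""
    first, second = rx_counters(before), rx_counters(after)
    dropped = second["dropped"] - first["dropped"]
    return {
        "packets": second["packets"] - first["packets"],
        "dropped": dropped,
        "dropped_per_sec": dropped / seconds,
    }


if __name__ == "__main__":
    sample_1 = "RX packets:88101 errors:0 dropped:734521700125 overruns:0 frame:0"
    sample_2 = "RX packets:88196 errors:0 dropped:738364737052 overruns:0 frame:0"
    delta = rx_delta(sample_1, sample_2, seconds=30)  # interval is approximate
    print(delta["packets"], delta["dropped"])  # 95 3843036927
```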


