[Bug 913152] New: Copying a big amount of data on a SATA HDD causing UDMA CRC errors (no hardware failure - works fine on older OpenSuse installation)
http://bugzilla.opensuse.org/show_bug.cgi?id=913152

Bug ID: 913152
Summary: Copying a big amount of data on a SATA HDD causing UDMA CRC errors (no hardware failure - works fine on older OpenSuse installation)
Classification: openSUSE
Product: openSUSE 13.1
Version: Final
Hardware: x86-64
OS: openSUSE 13.1
Status: NEW
Severity: Major
Priority: P5 - None
Component: Kernel
Assignee: kernel-maintainers@forge.provo.novell.com
Reporter: athu@arcor.de
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Created attachment 619598
  --> http://bugzilla.opensuse.org/attachment.cgi?id=619598&action=edit
excerpt from /var/log/messages and smartctl -a

Summary:
While copying a large amount of data (e.g. ISO images, partition copies) from another hard disk or a USB stick to my SATA hard disk, i.e. under constant high I/O, communication errors show up in /var/log/messages and the UDMA CRC error count of the SATA hard disk rises (checked via smartctl -a /dev/sdb). During normal workload (desktop PC) this does not seem to happen. The test result is reproducible.

- The P-ATA HDD (320 GB) is not affected by this.
- Copying a large amount of data within the same SATA disk does not seem to be affected.
- On the same hardware, an installation of openSUSE 11.4 (with kernel 3.0.x) NEVER produces any errors or UDMA CRC errors in this test case. openSUSE 11.4 Evergreen uses exactly the same config as 13.1 (ACPI, NCQ enabled).

So this is obviously not a hardware issue. It must be a regression in newer kernels (> 3.0) or in drivers/modules, because openSUSE 11.4 Evergreen works fine: there is no such entry in /var/log/messages, nor have I seen a rising UDMA CRC error count via smartctl. I used it until November 2014.

As a workaround I found that disabling NCQ, either with "echo 1 > /sys/block/sdb/device/queue_depth" or with the kernel option "libata.force=noncq", makes it work again for me, with a certain performance loss. So this is more a hint at what may be broken than a long-term solution.

Additional information:
* There are several reports of similar issues on the internet. Some of them relate to hardware (e.g. a bad S-ATA cable), but some still fit the pattern above. Most have "failed command: WRITE FPDMA QUEUED", some have "READ FPDMA QUEUED" as the failed command (the latter is not the case for me).
* As mentioned above, I have openSUSE 13.1 (with this issue) and openSUSE 11.4 Evergreen (same test = all OK) on the same system, so I can do comparisons or tests.

att1: excerpt from /var/log/messages
att2: smartctl -a
(both in one file)

HW config:
Intel Core 2 Duo E6850, MSI P6N SLI with nForce 650 SLI chipset, 8 GB RAM, WDC WD3200JB 320 GB IDE (driver: pata_amd), WDC WD2000FYYZ 2 TB S-ATA (driver: sata_nv), 2 DVD drives on IDE2, GeForce 660 Ti
This board does not seem to use AHCI mode for SATA.

Installed systems: openSUSE 13.1 (32 and 64 bit) and openSUSE 11.4 Evergreen (32 bit)

-- 
You are receiving this mail because:
You are on the CC list for the bug.
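
For anyone who wants to repeat the comparison from the report (CRC counter before and after a large copy, with NCQ on and off), the check could be scripted roughly as below. This is only a sketch under assumptions, not something taken from the attachment: the function names are made up for illustration, it assumes the usual ten-column attribute table printed by "smartctl -A" with attribute ID 199 as the UDMA CRC error counter, and it uses sdb as the device name as in the report. Reading SMART data normally needs root.

```python
import re
import subprocess
from pathlib import Path


def parse_udma_crc_count(smartctl_output):
    """Return the raw value of SMART attribute 199, or None if it is absent.

    Assumes the standard `smartctl -A` table layout:
    ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; the raw value is column 10.
        if len(fields) >= 10 and fields[0] == "199":
            match = re.match(r"\d+", fields[9])
            return int(match.group()) if match else None
    return None


def ncq_enabled(queue_depth_text):
    """A queue depth of 1 means NCQ is effectively off (the workaround)."""
    return int(queue_depth_text.strip()) > 1


def read_queue_depth(device="sdb", sysfs_root="/sys/block"):
    """Read the current queue depth from sysfs (device name is an assumption)."""
    return (Path(sysfs_root) / device / "device" / "queue_depth").read_text()


def read_crc_count(device="/dev/sdb"):
    """Run `smartctl -A` on the device and extract the CRC error counter."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_udma_crc_count(result.stdout)


def crc_errors_during(copy_action, device="/dev/sdb"):
    """Return how much the CRC counter grew while copy_action() ran."""
    before = read_crc_count(device)
    copy_action()
    after = read_crc_count(device)
    if before is None or after is None:
        return None
    return after - before
```

Calling crc_errors_during() once with the default queue depth and once after the "echo 1 > .../queue_depth" workaround, with the same copy job passed as copy_action, would give the two numbers to compare. A result of 0 in the second run and a positive value in the first would match what is described above.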
Alexander Thürmer