Mailinglist Archive: opensuse (888 mails)

Re: [opensuse] ticking hard disk problem
On 17 January 2011 at 11:16, Dave Howorth <dhoworth@xxxxxxxxxxxxxxxxx> wrote:

Peter Nikolic wrote:
On Saturday 15 January 2011 00:23:47 John Andersen wrote:
On 1/14/2011 3:25 PM, Istvan Gabor wrote:
I have opensuse 11.2 with kernel 2.6.31.14-0.4-desktop.
The system has two 160 GB Maxtor hard disks which are linked to a
Silicon Image 3114 PCI SATA (soft) raid controller. They are configured
as RAID1 (mirror) devices and dmraid is set up and works well except
for the symptom below.

The problem is that occasionally one of the disks makes a ticking sound,
and this sound becomes more frequent when the activity of the disks increases
(e.g. when copying from CD to disk).

Has the system ever worked perfectly or has it always shown these
symptoms in this configuration?

In /var/log/messages file there are several lines like these:

Jan 14 23:43:17 linux kernel: [ 4468.814798] ata5.00: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
Jan 14 23:43:17 linux kernel: [ 4468.814840] ata5: SError: { PHYRdyChg }
Jan 14 23:43:17 linux kernel: [ 4468.814858] ata5.00: cmd c8/00:08:67:05:f4/00:00:00:00:00/e0 tag 0 dma 4096 in
Jan 14 23:43:17 linux kernel: [ 4468.814860] res d0/d0:d0:d0:d0:d0/ff:ff:ff:ff:ff/c0 Emask 0x12 (ATA bus error)
Jan 14 23:43:17 linux kernel: [ 4468.814878] ata5.00: status: { Busy }
Jan 14 23:43:17 linux kernel: [ 4468.814887] ata5.00: error: { ICRC UNC IDNF }
Jan 14 23:43:17 linux kernel: [ 4468.814904] ata5: hard resetting link

There's an explanation of the error messages at
https://ata.wiki.kernel.org/index.php/Libata_error_messages that might help.

But I have to say that I'm pretty much still as mystified as I was even
after reading that page. Does anybody know of a better ( == more idiot
proof) explanation?
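For anyone who would rather tally these messages than read them one by one, here is a rough Python sketch. It is not from the thread: the function name `summarize` and the regular expression are assumptions based only on the log lines quoted above. It pulls out the flag names that libata prints inside the braces and counts them per port. It does not interpret the flags; the wiki page is still needed for that.

```python
import re
from collections import Counter

# libata prints decoded flags as "<port>: <field>: { FLAG FLAG ... }".
# This pattern is an assumption based only on the lines quoted in this thread.
FLAG_LINE = re.compile(r"(ata\d+(?:\.\d+)?): (SError|status|error): \{([^}]*)\}")


def summarize(lines):
    """Count occurrences of each (port, field, flag) triple."""
    counts = Counter()
    for line in lines:
        match = FLAG_LINE.search(line)
        if match is None:
            continue
        port, field, flags = match.groups()
        for flag in flags.split():
            counts[(port, field, flag)] += 1
    return counts


# Three lines copied from the log excerpt in this thread.
sample = [
    "Jan 14 23:43:17 linux kernel: [ 4468.814840] ata5: SError: { PHYRdyChg }",
    "Jan 14 23:43:17 linux kernel: [ 4468.814878] ata5.00: status: { Busy }",
    "Jan 14 23:43:17 linux kernel: [ 4468.814887] ata5.00: error: { ICRC UNC IDNF }",
]

for (port, field, flag), n in sorted(summarize(sample).items()):
    print(port, field, flag, n)
```

To run it over the real log, something like `summarize(open("/var/log/messages"))` should work, given read permission on the file. A count that keeps pointing at the same port would at least show which link is resetting.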

These messages occur several times in the file.
I guess they have something to do with the ticks.

Almost certainly.

I don't know whether it is a problem with the disk, the controller or
your system setup (e.g. power supply). I have a system where the disks
are shown as healthy by smart and pass all manufacturers' tests but show
similar symptoms when attached to that particular system.

What does smart say about your disks?

Run smartctl -t long <device>
And then after it has finished run smartctl -a <device> to see what the
result was.
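The tool meant here is `smartctl` from the smartmontools package. As a minimal sketch of the two steps (the helper names `smart_commands` and `run` are made up for illustration, and `/dev/sda` below is only a placeholder device), the commands could be wrapped in Python like this:

```python
import subprocess


def smart_commands(device):
    """Return the argv lists for starting a long self-test and reading the report."""
    return [
        ["smartctl", "-t", "long", device],  # start the extended self-test
        ["smartctl", "-a", device],          # print all SMART information
    ]


def run(argv):
    """Run one smartctl command and return its text output (usually needs root)."""
    # check=False: smartctl uses its exit status to report conditions,
    # so a non-zero status should not raise an exception here.
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout
```

Usage would be along the lines of `start, report = smart_commands("/dev/sda")`, then `print(run(start))`, and `print(run(report))` once the test is done. Note that the long test runs inside the drive in the background: the first command returns straight away with an estimated completion time, so the report is only meaningful after that time has passed.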

I had one of these a year or two ago on software raid.
Not good. Make sure you have a hot spare in your raid definition.

That sounds very much like a dying drive to me. Back it up.

These comments sound like good advice!

Cheers, Dave


First, thank you Dave, Pete and John for your help; second, I apologize for the
late response.

In the meantime I removed the hard disk in question from the system and
inserted it into another computer (not as a RAID device, just as a normal SATA
disk). In that other system the drive operates with no problem: there are no
ticks, copying to the drive runs at up to 40-60 Mbit/sec without hanging, and
there are no kernel error messages.

So it is/was either a controller problem or a driver problem, I guess.

Should I try another controller with a different chipset?

I will run the SMART test later; it's too late now.

Thanks again,

Istvan

