IDE problem - system locks and requires hard reset
Hi all,

I am (still) having problems with my system locking up randomly and requiring a hard reset (or even a power-off for several seconds). Sometimes this does not happen for a few days, but sometimes it happens every few minutes for a while. The system stops responding - even ctrl-alt-F2/del does not do anything. ctrl-alt-esc returns a corrupt graphics screen and nothing else happens. A hard reset usually works, but sometimes the BIOS cannot recognize the IDE0 boot record and I have to power off the box for 20 seconds or so.

I am using SuSE 8.1 with the latest athlon kernel (SuSE patched rpm version - k_athlon-2.4.19-167 I think). I have the KT333 VIA chipset with the 8235 southbridge, which has some issues, but that should have been addressed by the patched kernel.

Examining /var/log/messages, the kernel sometimes logs errors (I guess only if it has time before it cannot write to the disk at all). Below is a typical set:

Mar 8 10:36:59 system001 kernel: hda: status error: status=0x51 { DriveReady SeekComplete Error }
Mar 8 10:36:59 system001 kernel: hda: status error: error=0x04 { DriveStatusError }
Mar 8 10:36:59 system001 kernel: hda: no DRQ after issuing MULTWRITE
Mar 8 10:36:59 system001 kernel: hda: status error: status=0x51 { DriveReady SeekComplete Error }
Mar 8 10:36:59 system001 kernel: hda: status error: error=0x04 { DriveStatusError }
Mar 8 10:36:59 system001 kernel: hda: no DRQ after issuing MULTWRITE
Mar 8 10:36:59 system001 kernel: hda: status error: status=0x51 { DriveReady SeekComplete Error }
Mar 8 10:36:59 system001 kernel: hda: status error: error=0x04 { DriveStatusError }
Mar 8 10:36:59 system001 kernel: hda: no DRQ after issuing MULTWRITE
Mar 8 10:36:59 system001 kernel: hda: status error: status=0x51 { DriveReady SeekComplete Error }
Mar 8 10:36:59 system001 kernel: hda: status error: error=0x04 { DriveStatusError }
Mar 8 10:36:59 system001 kernel: hda: no DRQ after issuing WRITE
Mar 8 10:36:59 system001 kernel: ide0: reset: success

It is after the ide0 reset that the system locks up totally.

I have tried a different HD and the same thing occurs. The system is fine under Windows (using a different HD in a caddy - and I have tried swapping the caddies too!).

Does anyone have any ideas? I have turned off ACPI, APIC and APM - anything else I should try turning off? I have the DMA mode set to "force off" in Yast2.

Yours
--
Ray
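For readers unfamiliar with the register dumps in the log above, here is a minimal Python sketch of how the two hex values map to the names in braces. The bit tables are an assumption based on my understanding of the 2.4-series IDE driver's status dump, not something stated in this thread; `STATUS_BITS`, `ERROR_BITS` and `decode` are hypothetical names used only for illustration.

```python
# Bit-to-name tables for the IDE status and error registers.
# ASSUMPTION: names follow what the 2.4-series Linux IDE driver prints;
# they are not taken from this thread beyond the three seen in the log.
STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}

ERROR_BITS = {
    0x80: "BadCRC/BadSector",
    0x40: "UncorrectableError",
    0x10: "SectorIdNotFound",
    0x04: "DriveStatusError",  # the ATA "command aborted" (ABRT) bit
    0x02: "TrackZeroNotFound",
    0x01: "AddrMarkNotFound",
}


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if value & bit]


print(decode(0x51, STATUS_BITS))
print(decode(0x04, ERROR_BITS))
```

Read this way, status=0x51 says the drive reports itself ready but has flagged an error, and error=0x04 says the command was aborted. That fits the "no DRQ after issuing MULTWRITE" lines: the drive never asked for the data that the write command was supposed to transfer.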
I had something similar which bugged me for months, but it involved not the IDE hard drives but the cdrom and cdrw on the second IDE channel (hdc and hdd). I have a Pioneer cdrom drive (hdc) as master and an Aopen cdrw (hdd) as slave. Ide-scsi is used for the cdrw.

Everything would work fine for some time, and then I would hear the cdrom drive being accessed for no reason, and get a whole bunch of errors very similar to what you reported. The only solution was to reboot, and then the machine would freeze at the hd detection stage and I would have a white vertical line across the screen. I sometimes had to stop and restart 30 times before the machine would restart, or leave it switched off for some time, then restart.

I first suspected a motherboard problem, but the problem remained after I moved from an MSI to an Epox. I then changed my video card, to no avail.

The problem only disappeared when I removed the two cd drives, and changed from master/slave to cable select on both drives. Since then, everything has been fine.

I don't know if this makes any sense, I just know that my problem has disappeared.

FX
--
______________________
Courtesy of SuSE Linux http://www.nibz.org
On Saturday 08 March 2003 15:24, FX Fraipont wrote:
> I had something similar which bugged me for months, but involved not the ide hd's, but the cdrom and cdrw on the second ide channel (hdc and hdd). I have a Pioneer cdrom drive (hdc) as master and an Aopen cdrw (hdd) as slave. Ide-scsi is used for the cdrw.

<snip> symptoms </snip>

> I first suspected a motherboard problem, but the problem remained after I moved from an MSI to an Epox. I then changed my video card, to no avail.
>
> The problem only disappeared when I removed the two cd drives, and changed from master/slave to cable select on both drives. Since then, everything has been fine.
>
> I don't know if this makes any sense, I just know that my problem has disappeared.
>
> FX
Yes, it does make sense - someone else emailed me directly with a similar problem of CD-ROM drives causing HD problems.

When you removed the two CD drives, did you just leave them out and use cable select on the hard drives? Or did you mean that you replaced them on the second IDE with cable-select enabled?

Thanks for the info!
--
Ray
Ray Poynter wrote:
> Or did you mean that you replaced them on the second IDE with cable-select enabled?

Yup. No change to the hard drives: hdb slave of hda. But changed cd reader and writer from master/slave to cable select/cable select.

FX
On Sat, 08 Mar 2003 20:54:45 +0100, FX Fraipont wrote:
> Ray Poynter wrote:
>> Or did you mean that you replaced them on the second IDE with cable-select enabled?
> Yup. No change to the hard drives: hdb slave of hda.
> But changed cd reader and writer from master/slave to cable select/cable select.

I'm curious... Why do people use cable select? I've always had trouble unless I use the normal master/slave settings.

--
use Perl; #powerful programmable prestidigitation
On Sunday 09 March 2003 10:23, zentara wrote:
> On Sat, 08 Mar 2003 20:54:45 +0100, FX Fraipont wrote:
>> Ray Poynter wrote:
>>> Or did you mean that you replaced them on the second IDE with cable-select enabled?
>> Yup. No change to the hard drives: hdb slave of hda.
>> But changed cd reader and writer from master/slave to cable select/cable select.
> I'm curious... Why do people use cable select? I've always had trouble unless I use the normal master/slave settings.
Not really sure - and like you, I have always used the master/slave settings to avoid trouble. I think it is mainly of use on manufacturers' mass-assembly lines, where the jumpers of all the drives can be set to CS so that the people on the production line can just put each drive in the appropriate place in the box. No idea whether that is really true or not! I am not sure if there are any other advantages.

I have googled for info on this (while investigating my crashing problem) and there is lots of conflicting information out there!
--
Ray
On Saturday 08 March 2003 06:02 am, Ray Poynter wrote:
> Hi all
>
> I am (still) having problems with my system locking up randomly and requiring a hard reset (or even a power off for several seconds). This sometimes does not happen for a few days but sometimes does it every few minutes for a while. The system stops responding - even ctrl-alt-F2/del does not do anything. ctrl-alt-esc returns a corrupt graphics screen and nothing else happens. A hard reset usually works but sometimes the BIOS cannot recognize the IDE0 boot record and I have to power off the box for 20 seconds or so.
>
> I am using SuSE 8.1 with the latest athlon kernel (SuSE patched rpm version - k_athlon-2.4.19-167 I think). I have the KT333 VIA chipset with the 8235 southbridge, which has some issues, but that should have been addressed by

*********************much trimmed*********************
Dear Ray,

Boy, this strongly suggests H/W. Be sure to check that the PSU (power supply) is functioning properly. Heat can cause these problems too. I can't see how S/W can be involved in "not ready" conditions.

Luck..............
PeterB
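One way to narrow down a hardware suspicion like this is to check which devices the kernel is actually complaining about: only hda, or also the second channel (hdc/hdd) as in FX's case. Below is a rough Python sketch that tallies kernel IDE messages per device. The regular expression is an assumption modelled on the log excerpt in the original post, and `tally_ide_messages` and `SAMPLE` are hypothetical names used only for illustration.

```python
import re
from collections import Counter

# ASSUMPTION: lines look like the excerpt in the original post, e.g.
# "Mar 8 10:36:59 system001 kernel: hda: no DRQ after issuing MULTWRITE"
LINE_RE = re.compile(r"kernel: (hd[a-z]|ide\d+): (.*)$")


def tally_ide_messages(lines):
    """Count kernel IDE messages per (device, message) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line.rstrip("\n"))
        if match:
            device, message = match.groups()
            # Drop the "{ ... }" bit names so identical failures group together.
            counts[(device, message.split("{")[0].strip())] += 1
    return counts


# Sample lines copied from the log in the original post.
SAMPLE = [
    "Mar 8 10:36:59 system001 kernel: hda: status error: status=0x51 { DriveReady SeekComplete Error }",
    "Mar 8 10:36:59 system001 kernel: hda: status error: error=0x04 { DriveStatusError }",
    "Mar 8 10:36:59 system001 kernel: hda: no DRQ after issuing MULTWRITE",
    "Mar 8 10:36:59 system001 kernel: hda: no DRQ after issuing WRITE",
    "Mar 8 10:36:59 system001 kernel: ide0: reset: success",
]

for (device, message), count in sorted(tally_ide_messages(SAMPLE).items()):
    print(device, message, count)
```

To run it against the real log, pass an open file instead of the sample, for example `tally_ide_messages(open("/var/log/messages"))`. Bear in mind that the log may be incomplete, since the kernel cannot always write to the disk before the lockup.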
participants (5)
- FX Fraipont
- Peter B Van Campen
- Ray
- Ray Poynter
- zentara