[opensuse] Ever see a disk-controller just die?
All,

Sunday, I experienced a server failure and I am trying to determine why. No storms, electrical interruptions, nothing... I received an automated text from the other boxes in my office that the server was down. I happened to have an open ssh session on my laptop and the connection was still up. So I typed

$ uptime
Bus error (core dumped)

WTF?

It is an older server (MSI K9N2 SLI Platinum) w/Phenom 9850. Always been rock solid. Spinning 6 drives on the native controllers: 2 primary drives, 1T Caviar Black, mdraid0, 4 partitions; 2 secondary drives, Seagate 250G, with dmraid (fake raid), 4 partitions (old install); and finally 2 1.5T Seagates that were just spare miscellaneous storage attached to eSATA 5/6.

Suspecting a drive failure, I pulled all but the primary drives. Boot hangs at the "Detecting Hard Drives" POST/BIOS point. Disconnected all drives and it boots fine ("No discs connected - insert system disk"). Suspecting one or the other of the primaries, I removed the second drive on the primary channel (stuck at "Detecting Hard Drives"), so I reversed the config and removed the primary drive and reconnected the secondary (stuck at "Detecting Hard Drives"). Huh? Removed the secondary, leaving all drives removed (boots just fine to "No discs connected - insert system disk").

So this has me scratching my head. Unless I had simultaneous failure of both drives, it appears that when anything is connected to the primary controller, boot hangs at "Detecting Hard Drives". (I have not tried discs on the secondary controller channel alone.) eSATA is not bootable. The DVD is detected just fine, and I've pulled all cards and memory and reseated them just in case there was a stray bit of resistance somewhere.

Has anyone experienced anything similar? If so, any pointers? I did have 1 SATA cable go bad a year or so ago, but given my diagnostics, I can't see 2 cables going bad at once.

Any thoughts from the brain-trust are appreciated.

-- David C. Rankin, J.D., P.E.
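When the disk behind the root filesystem drops out, anything that has to be loaded from disk (like /usr/bin/uptime) fails with "Bus error" or "Input/output error", but shell builtins and the kernel's /proc interface live in RAM. A minimal sketch of what can still be checked from a surviving ssh session, assuming a bash shell and that the relevant binaries happen to already be cached:

# /proc is in RAM, and 'read'/'echo' are bash builtins - no disk access needed
read up idle < /proc/uptime && echo "up for ${up}s"

# the kernel ring buffer usually shows the controller or link dropping out;
# dmesg/cat may still run if they are already in the page cache
dmesg | tail -n 30
cat /proc/mdstat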
On August 9, 2015 9:27:29 PM PDT, "David C. Rankin" wrote:
Any thoughts from the brain-trust are appreciated.
If it is an on board controller, as opposed to a board in a slot, look for bad caps. Bulging capacitors. They could be anywhere on the mobo but I'd start looking close to the chipset of the controller.
On 08/10/2015 12:13 AM, John Andersen wrote:
If it is an on board controller, as opposed to a board in a slot, look for bad caps. Bulging capacitors. They could be anywhere on the mobo but I'd start looking close to the chipset of the controller.
Thanks John,

Yes, that was my first thought. I've had caps go bad before on Gigabyte boards, and I've found a shop (badcaps.net) that will fix them. Oh, brother, this is one of those "damn! still such a good box!" situations. 8G of PC85 RAM, Phenom 9850, but hell, it's all AM2+ (yesteryear in computer terms). It's hard to justify the trouble of buying a replacement motherboard. Looks like if I can't get this thing going, it's just another pile of pretty shelf art (and it is pretty, with all the finned copper cooling pipes, etc...).

I'll look over the caps again with a real close eye. If there are any that seem puffy at all, then I'll have to weigh the option of sending it in to badcaps.net and hope for the best.

Funny, I've been through 2 (now maybe 3) servers in the past 7 years, but the original one I built for my office in late 2000 - early 2001 (with that feisty AMD Tbird processor) is still running 24-7, happily serving and receiving faxes via hylafax/avantfax with openSuSE 11.0... (they don't make them like that anymore)

-- David C. Rankin, J.D., P.E.
John Andersen composed on 2015-08-09 22:13 (UTC-0700):
"David C. Rankin" wrote:
Any thoughts from the brain-trust are appreciated..
If it is an on board controller, as opposed to a board in a slot, look for bad caps. Bulging capacitors. They could be anywhere on the mobo but I'd start looking close to the chipset of the controller.
Don't just look at the motherboard. Check the PS caps too. OST caps often fail without any telltale signs. What year was it made?

-- "The wise are known for their understanding, and pleasant words are persuasive." Proverbs 16:21 (New Living Translation) Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata *** http://fm.no-ip.com/
On 08/10/2015 12:27 AM, David C. Rankin wrote:
So this has me scratching my head. Unless I had simultaneous failure of both drives, it appears when anything is connected to the primary controller, boot hangs at "Detecting Hard Drives". (I have not tried discs alone on the secondary controller channel alone) ESATA is not bootable.
Boot from USB?
The DVD is detected just fine and I've pulled all cards and memory and reseated just in case there was a stray bit of resistance somewhere.
Boot from (Live) DVD?
Has anyone experienced anything similar? If so, any pointers? I did have 1 SATA cable go bad a year or so ago, but given my diagnostics, I can't see 2 cables going bad at once.
You need to establish basic motherboard integrity and CPU integrity. There are many self-tests, memory tests and so on you can use. My Dell has short-form and long-form test BIOS settings. Check your BIOS for options. It may give you details.

You say "native controller", implying that it is integrated into the motherboard. Perhaps the issue isn't that the disk controller has died, but that the motherboard has aged out. Is it from the Years of The Bad Capacitors?

These fully integrated motherboards represent a particular approach to mass production economics. Sometimes it's a bit of a lowest-common-denominator set of decisions. Often you can disable the on-board and make use of a plug-in board. But crank the numbers.

On the one hand ... I once had a mobo with a pretty useless SiS video. I got a decent ATI Radeon from a friend for $20. That was worth it for a home "hobby" system.

On the other hand ... A production server goes down and it's costing $mucho$ per hour, so replacing the mobo, heck, replacing the whole 1U, is the cheapest approach. It's all fibre-channelled RAID data anyway, so it's a Line Replaceable Unit. The replacements are all preconfigured. In fact it's not worth even diagnosing what was wrong with the pulled unit. It's not that the cost of the day or so of work by a $30/hr tech isn't worth it, but you do have to consider (a) the bureaucratic/managerial overhead makes that closer to $250/hr and (b) can't he be doing something productive instead of 'firefighting'?

I learnt this LRU technique (discard & replace rather than stop & repair) in the military. You can, I'm sure, find videos on Youtube showing fast turn-around service of aircraft on carriers using this technique. When we were doing it we didn't even test the LRUs, we just replaced them anyway. Believe me, that was the best economics. A certified box was better than one that had been stressed on a previous mission and left in place.

The sub-text here is "Disaster Planning". Having those LRU 1U boxes preconfigured. One site I served at approached this in what I thought was an idiotic manner. They had a spare 'Net line, spare router, spare switch sitting next to the 'live' one, ready to cut over. 100% redundancy, inactive. The thing is, though, no-one had done any analysis as to load, bandwidth; what would it be like to run both? If one failed the other was there without any switchover time. You could take one down for maintenance. You could argue it both ways, but no-one ever had; no-one had ever looked at the economics.

So you want to find out what went wrong, David. But what is your time worth? What is the down-time worth? What is the cost of a new motherboard? What is the cost of just having a spare driver board, video board to hand?

I'm semi-retired and my role isn't (most of the time) on the bleeding edge. But my often-referred-to 'Closet of Anxieties' very often serves as a "Keep Things Working" resource. A number of people are running Linux on machines that could only run XP, but since they only need email/browser and are happy with Firefox/Thunderbird they never see the Linux layer. Perhaps one day they will get a W/10 machine and complain bitterly about the menus and all the things Microsoft changes between releases, while their old (Linux) system stayed consistent. The point is they can keep working. They thank me for this. Eventually some have asked about Linux. "Oh, it looks pretty much like the Word I'm used to, not the thing Microsoft are doing now." That, too, is "economics".
The "sameness" means they don't have to go off on re-training courses. My thoughts, David, run to economics. The cost of DR planning, the cost of preparedness vs other costs. I'm sure someone is going to say that "economics" and "DR planning" is OT on a technical forum like this. But if it comes down to "how can I get my system running", you do need to consider if its easier/cheaper/faster to just replace the mobo. -- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon? -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
On 08/10/2015 09:02 AM, Anton Aylward wrote:
My thoughts, David, run to economics. The cost of DR planning, the cost of preparedness vs other costs.
I'm sure someone is going to say that "economics" and "DR planning" is OT on a technical forum like this. But if it comes down to "how can I get my system running", you do need to consider if its easier/cheaper/faster to just replace the mobo.
Anton,

I also worked in a "jerk and jam" facility, the so-called first-line maintenance. We got the systems up and running and then sent the boards/assemblies/whatever to a repair facility. Each group then maximized their effort for the corp. At a vendor-run equipment maintenance school there were a couple of guys from the FAA who had to know the trouble resolution down to the component level, which let me know that the fed gov't wasn't running very efficiently, but hey, big surprise, eh? What really bothered us was the "bad off the shelf" new parts that had their own set of failure indications, which could, and often did, run us around for hours trying to resolve the problems.

David, I second the motion to swap and toss. "You might be an engineer if you've ever tried to fix a $7.00 radio". Or a toaster, or ...

My son-in-law (bless his heart) once bought a $29 printer and was griping when he found that replacement ink was $40. I said, "Throw away the printer and buy another," but he said it was still good, to which I replied: how could it be, it doesn't print any more. He never could get it and bought the ink. Doofus.

Fred
On 08/10/2015 10:30 AM, Stevens wrote:
Anton, I also worked in a "jerk and jam" facility, the so-called first line maintenance. We got the systems up and running and then sent the boards,assemblies/whatever to a repair facility. Each group then maximized their effort for the corp. At a vendor-run equip maint school there were a couple of guys from the FAA who had to know the trouble resolution down to component level, which let me know that the fed gov't wasn't running very efficient but hey, big surprise, eh? What really bothered us was the "bad off the shelf" new parts that had their own set of failure indications which could, and often did, run us around for hours trying to resolve the problems.
I like that term!

Believe me, an aircraft carrier can carry a LOT of spares! And they'd better be 'burnt in' before they are put on the shelf.

MIL-SPEC is a strange world in many ways. Not only can much of the electronics survive in conditions where humans cannot, but the attitudes towards development & documentation are very different. It isn't so much that the equipment has to be maintained by people with an IQ of 30, so much as it has to be maintained by untrained people in extremes of weather and temperature who are under fire, wearing thick gloves and body armour, at night, possibly in a swamp or in a desert sandstorm, with no tools other than a (large) knife and the butt of their machine pistol. The documentation has to reflect this. OBTW, the documentation was left behind because of weight considerations, in favour of extra ammunition.

I was told that on the 'carriers, during the quiet time, the "jerked" modules get tested. Nobody would waste time repairing them, but maybe they weren't shot up so much and can be re-used. But it is a low-priority issue. When a module has a bullet through it or similar, there is very little doubt about why it failed.

Once we got a module back that baffled us ... until we put it in a low-pressure chamber and saw a couple of the caps ... expand and push away from the circuit board. And I mean low pressure, not mile-high Denver pressure. That and pulling a high-G turn ... well, not the sort of thing you find in a server room. Leastways not often.

And one day I'll get around to unblocking that printer ....
On 08/10/2015 09:02 AM, Anton Aylward wrote:
You need to establish basic motherboard integrity, Cpu integrity. There are many self-test, memory test and so on you can use. My dell has short-form and long form test BIOS settings. Check you BIOS for options. It may give you details.
You guys are great! Always good to smile at all the imaginable/unimaginable situations...

CPU integrity seems fine. With drives pulled, it POSTs fine and gets to the point of "Insert Operating System Disk and press Any Key". Pop the install CD in, hit the "Any Key" and bam, I'm running Linux just fine. This really seems limited to disk detection/disk controller - for whatever reason.

This motherboard is also equipped with 16 status LEDs; 4-11 (in 4 groups of 2) give the POST status and then go all green when booting the OS. With no drives attached, they flow through the light sequences, ending all green. Popping in the CD and booting works like a champ. When the POST fails, they are stuck green, green, red, red, which according to the manual indicates "Initializing Hard Drive Controller" (makes sense).

I have inserted spare drives into the empty bays to confirm that the PSU is providing power (it is). The PSU is a fairly reliable HEC Zephyr 750, so it looks to be powering the drives fine. However, the only thing common to all drives, aside from the controller, is power. If it is a power issue, then it has to be dirty power or partial power that allows the disk to spin but may be preventing it from properly powering all its circuitry to respond to the drive initialization. Pretty bizarre.

The board looks intact. I've been over all the caps and there are none puffy/bulging/etc. that I can see (I've seen plenty of bad caps before). The fact that the board runs completely through the POST and operates fine when booted from the CD really adds to the puzzle. Partial PSU failure in one of the power pins to the board? Partial PSU failure in the power leg that is powering the SATA connections?

Like I said in my original post, nothing eventful happened; it was just like somebody turned a switch to the disk controller off (or at least that is how it appears), causing the bus error/core dump. It's happily running now, booted from the CD. I think I'll give everything a sniff test. There is no pronounced "I'm burned alive" smell, but it does smell like a box that has been operating 24-7 for 7 years or so.

If you have any other thoughts, let me know.

-- David C. Rankin, J.D., P.E.
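Since the box boots and runs fine from the CD, a few checks from that live system can show whether the kernel still sees the on-board controller and its ports at all. A rough sketch only, assuming the live image carries pciutils and smartmontools (device names are examples, not the actual layout of this box):

# is the on-board SATA controller still visible on the PCI bus?
lspci | grep -i -e sata -e ide

# detection and link-level errors from the SATA driver during boot
dmesg | grep -i -e ahci -e sata_nv -e 'ata[0-9]'

# if any drive shows up at all, ask it for its own health report
smartctl -a /dev/sda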
Just a word: you said at one point that eSATA can't boot. This is not true, at least not everywhere; I very often use an external eSATA dock to test new drives. Just in case, maybe you could also add a SATA card and connect the disks to it?

jdd
On 08/10/2015 11:44 AM, jdd wrote:
Just in case, maybe you could also add a SATA card and connect the disks to it?
If I can't find anything else bad, then I'll buy a SATA card and disable the onboard controllers. Even if I end up buying a new board, processor and RAM, I'll probably still try a new controller. It would be a complete waste to dump this board and processor over a bad disk controller. The build speed on this box was phenomenal. Older P4 3.33GHz boxes would take 7-8 hours for a full KDE3 build; this box with the Phenom 9650 Black Edition would do it in less than 3. If a $25 disk controller will fix it, then that's money well spent. Thanks.

Any favorite controller brands? I'll need one with primary and secondary channels plus 2 eSATA. (that seems pretty standard these days)

-- David C. Rankin, J.D., P.E.
On 2015-08-10 18:34, David C. Rankin wrote:
When the POST fails, they are stuck green, green, red, red, which according to the manual indicates "Initializing Hard Drive Controller" (makes sense). I have inserted spare drives into the empty bays to confirm that the PSU is providing power (it is). The PSU is a fairly reliable HEC Zephyr 750, so it looks to be powering the drives fine. However, the only thing common to all drives, aside from the controller, is power. If it is a power issue, then it has to be a dirty power or partial power that allows the disk to spin, but may be preventing it from properly powering all its circuitry to respond to the drive initialization.
I think you should try with another PSU. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
David C. Rankin composed on 2015-08-10 11:34 (UTC-0500):
The PSU is a fairly reliable HEC Zephyr 750, so it looks to be powering the drives fine. However, the only thing common to all drives, aside from the controller, is power. If it is a power issue, then it has to be a dirty power or partial power that allows the disk to spin, but may be preventing it from properly powering all its circuitry to respond to the drive initialization.
According to http://www.badcaps.net/forum/printthread.php?t=6390 that HEC is somewhat unusual, with 4 12V rails, besides using Teapo caps. I don't understand how multiple rails differ from all 12V power on one rail, but it wouldn't surprise me if, in a more complicated than usual design, some sort of fallible synchronization mechanism is built in, or if only one rail is supplying the power reaching the controller, so I'd definitely not rule out PS trouble if it's 7 years old.

Newer power supplies seem to have mostly switched back to using only a single 12V rail, or at most two. No idea whether this would be about cost rather than functional efficacy or reliability.

-- Felix Miata
I see no reason to worry about a 7 year old power supply. As long as it was not affected by the bad caps era, these things can last forever, with maybe a fan replacement.

Swapping drives seemed to suggest it wasn't a drive problem, as I read the thread.
On 2015-08-10 21:35, John Andersen wrote:
I see no reason to worry about a 7 year old power supply. As long it was not affected by the bad caps era, these things can last forever, with maybe a fan replacement.
Plastics and several insulators degrade. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On August 10, 2015 2:41:43 PM PDT, "Carlos E. R." wrote:
On 2015-08-10 21:35, John Andersen wrote:
I see no reason to worry about a 7 year old power supply. As long it was not affected by the bad caps era, these things can last forever, with maybe a fan replacement.
Plastics and several insulators degrade.
But the disk swaps David did.....???
On 08/10/2015 04:59 PM, John Andersen wrote:
But the disk swaps David did.....???
That's where I'm still up in the air. Testing the primary onboard controller:

2 disks connected - no POST
disk 1 only - no POST
disk 2 only - no POST

Both disks spin up and are running.

No disks - POSTs fine, boots from CD and runs fine.

So to me, it looks like the only culprits common to all disks are either:

a) power to one of the 24 pins in the ATX connector
b) disk controller (or a cap on the board related to it)
c) power to the disks themselves.

Since they all spin up when plugged into the SATA power connector and sound normal, (c) looks like a long shot. (a) or (b) look the most probable.

I think I have a new PS in the spare parts bin. Failing that, it looks like a new system is in order, along with the stand-alone SATA controller to fix the old one as time permits.

-- David C. Rankin, J.D., P.E.
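For (a) and (c), one rough way to sanity-check the rails without a multimeter is to read the board's hardware monitor from the CD-booted system. A sketch only, assuming lm_sensors is present and the board's monitoring chip is supported (labels and accuracy vary a lot by board):

# voltage rails as the Super I/O monitoring chip sees them; watch +5V and +12V
sensors

# or read the raw millivolt values straight from sysfs
grep . /sys/class/hwmon/hwmon*/in*_input 2>/dev/null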
On 08/10/2015 05:23 PM, David C. Rankin wrote:
On 08/10/2015 04:59 PM, John Andersen wrote:
But the disk swaps David did.....???
That's where I'm still up in the air. Testing the primary onboard controller.
2 disks connected - no POST disk 1 only - no POST disk 2 only - no POST
both disks spin up and are running.
no disks - POSTS fine, boots from CD and runs fine.
So to me, it looks like the only culprits common to all disks is either:
a) power to one of the 24-pins in the ATX connector b) disk controller (or cap on board related to it) c) power to the disks themselves.
Since the all spin up when plugged into the SATA power connector and sound normal, (c) looks like a long shot. (a) or (b) look the most probable.
I think I have a new PS in the spare parts bin. Failing that, it looks like a new system is in order along with the stand-alone SATA controller to fix the old one as time permits.
For the curious, I've copied the page from the manual concerning the LED POST sequence. With any disk attached, the sequence stops at box 13 (Initializing Hard Drive Controller). With no disks connected, it flows through the boot sequence without a hitch, ending all green.

http://nirvana.3111skyline.com/dl/img/ss/opensuse/msi_k9n2_SLI_LED_seq.jpg

Will report back with PS swap results. Found a new-in-box BFG 550W PS. (What a good company - lifetime warranties, no questions asked. It's the honest companies that go bust...)

-- David C. Rankin, J.D., P.E.
On 2015-08-11 01:06, David C. Rankin wrote:
For the curious, I've copied the page from the manual concerning the LED POST sequence. With any disk attached, the sequence stops at box 13 (Initializing Hard Drive Controller). With no disks connected, it flows though the boot sequence without a hitch ending all green.
Well, with no disks, it can not really initialize the hard disk controller. It can not do all tests.

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 08/10/2015 07:43 PM, Carlos E. R. wrote:
On 2015-08-11 01:06, David C. Rankin wrote:
For the curious, I've copied the page from the manual concerning the LED POST sequence. With any disk attached, the sequence stops at box 13 (Initializing Hard Drive Controller). With no disks connected, it flows though the boot sequence without a hitch ending all green. Well, with no disks, it can not really initialize the hard disk controller. It can not do all tests.
And the winner is...... bad disk controller or bad capacitor related to it.

Replaced the power supply with a new 550W supply. Reconnected everything, fired it up and ... exact same behavior. With drives attached to the controller, POST hangs at "Detecting Hard Drives" (although I did get it to POST with 1 drive connected, it never saw the drive, and booted straight to the CD).

So motherboard, processor, and RAM shopping. First round with UEFI (damn, not looking forward to that). ASUS still makes a board with 1394; other than that, options are Gigabyte or MSI. Any other favorites?

Thanks for all the suggestions. (sure was hoping it was the PSU :)

-- David C. Rankin, J.D., P.E.
David C. Rankin composed on 2015-08-10 20:49 (UTC-0500):
And the winner is......
bad disk controller or bad capacitor related to it.
Replaced the power-supply with a new 550W supply. Reconnected everything, fired it up and ... exact same behavior. With drives attached to the controller, POST hangs at "Detecting Hard Drives" (although I did get it to post with 1 drive connected, it never saw the drive, and booted straight to the CD)
So motherboard, processor, and RAM shopping. First round with EUFI (damn, not looking forward to that). ASUS still makes a board with 1394, other than that, options are Gigabyte, or MSI. Any other favorites?
Thanks for all the suggestions. (sure was hoping it was the PSU :)
Maybe a tiny likelihood that all SATA cables (how many?) failed together, but trying others ought to be easier than mobo shopping. You never know if you don't try. ;-)

One other thing before shopping: look at the bottom of the mobo. Could be a tiny vermin nest, or a long-lost unnoticed screw created a short.

Not knowing the actual age of this mobo, and guessing 8+, I'm voting RoHS as the, or an, underlying cause if the mobo has no OST caps.

-- Felix Miata
On 08/10/2015 09:40 PM, Felix Miata wrote:
Maybe tiny likelihood that all SATA cables (how many?) failed together, but trying other ought to be easier than mobo shopping. You never know if you don't try. ;-)
One other thing before shopping: look at the bottom of the mobo. Could be a tiny vermin nest or a long lost unnoticed screw created a short.
Not knowing actual age of this mobo, and guessing 8+, I'm voting RoHS the or an underlying cause if the mobo has no OST caps.
All great words of wisdom.

As far as cables, there were 6 SATA drives spinning in the box (and had been for years). Six cables going toes up? I'll go with the small possibility that both from the primary controller went bad, but if swapping in the other 4 doesn't change anything, then I will be more than satisfied it isn't a cable.

If I really think about what the probable trigger was, we are undergoing August in Texas, and we have had our hottest days the past 3 days. Sunday was 105 w/heat indexes close to 115. During the weekend the A/C is raised to 85 degrees, and the server closet with its vent probably gets to 90. That little bit of added heat load/stress probably caused whatever weak component died -- to die.

I suspect a capacitor, but the particular capacitors on this board have the fully-formed aluminum sheaths over them with no seam at all on the top. They have 1 indentation ring about 1/4 up from the bottom and are capped on the bottom. They all look perfectly in shape, although 1 between the south bridge and the top PCIe x16 slot is leaning slightly toward the CPU socket (it could have been soldered like that originally). I'll have to pull the board from the case to get access to the back side.

This is a big Antec case. One of the best I ever bought. 6 full bays up top, with 2 being a 2-hard-drive quick-connect chassis with its own 120mm fan. The case then has another 4-drive chassis at the bottom front with its own 120mm fan. It then has 2 more 120mm fans in the back and top (all 3 speed-adjustable and thermostatically controlled). Hell, this thing has its own 12V light on a flexible boom stored under the top to aid in installs/repairs.

I'll pull the board and survey the back side. The case was completely together at the time, so it is hard to see what could possibly have come loose. I'm meticulous with plastic wire-ties to secure/route any excess lengths of wires to prevent movement and promote better air-flow in the case. Gremlins?

So James was wrong this time, no "Warp Core Breach", but it looks like the problem occurred with the Dilithium Crystal at the point of Warp Core injection -- alignment was good, so that points to the injection controller or matter/anti-matter containment field. We'll know more once I get the crystal, Oh - I mean board, pulled :p

-- David C. Rankin, J.D., P.E.
David C. Rankin composed on 2015-08-11 01:23 (UTC-0500):
Felix Miata wrote: ...
You never know if you don't try. ;-) ... As far as cables, there were 6 sata drives spinning in the box (and had been for years). Six cables going toes up? I'll go with the small possibility that both from the primary controller went bad, but if swapping in the other 4 doesn't change anything, then I will be more than satisfied it isn't a cable.
The fact that two different drives were tried individually, with probably two different cables, probably confirmed that it isn't a cable problem. Still, when drive number was dropped to one at a time, were all other cables detached from mobo port connectors? Maybe with failing old cables, a single partial short or leak could be visible to other ports?
If I really think about what the probable trigger was, we are undergoing August in Texas, and we have had our hottest days they past 3 days. Sunday was 105 w/heat indexes close to 115. During the weekend the A/C raises to 85 degrees, and the server closet with its vent probably gets to 90. That little bit of added heat load/stress probably caused whatever weak component died -- to die.
Had it been shut down more than a few minutes while the heat was elevated, with no fans running, allowing ordinarily cooler components to see extra heat soak or humidity?

Slim possibility of all 6, of course. Nevertheless, never say never, and never say always. If all 6 are from an identical batch, could elevated heat have tipped them all over an edge together? Maybe not as close to zero probability as one would expect. (I'm still thinking RoHS, aka lead-free, solder joints, but there are parts on a mobo that are known to occasionally fail besides solder and electrolytics. Horse probably dead too.)

-- Felix Miata
On 08/11/2015 01:52 AM, Felix Miata wrote:
Had it been shutdown more than a few minutes while the heat was elevated, with no fans running, allowing ordinarily cooler components to see extra heat soak or humidity? Slim possibility of all 6, of course. Nevertheless, never say never, and never say always. If all 6 are from an identical batch, could elevated heat have tipped them all over an edge together? Maybe not as close to zero probability as one would expect. (I'm still thinking RoHS aka lead-free solder joints, but there are parts on a mobo that are known to occasionally fail besides solder and electrolytics. Horse probably dead too.)
No, this box runs 24-7/365 (only UPS exhaustion prompts shutdown (or kernel update) -- neither happened Sunday).

I don't know what the actual failure was -- either solder or bad cap, but I'm going to give the guys at badcaps.net a crack at it. I have it up and running and networked presently, and it has been running over 24 hours w/o issue, so the failure seems limited to the controller. I'll report back on the issue after I get the board back. It will either work, or I've tossed $70 into finding out it won't :)

See my next post: i5-4590 Haswell or FX-8350 Black Edition?

-- David C. Rankin, J.D., P.E.
On 08/11/2015 02:23 AM, David C. Rankin wrote:
If I really think about what the probable trigger was, we are undergoing August in Texas, and we have had our hottest days they past 3 days. Sunday was 105 w/heat indexes close to 115. During the weekend the A/C raises to 85 degrees, and the server closet with its vent probably gets to 90. That little bit of added heat load/stress probably caused whatever weak component died -- to die.
What? No thermal shut-down?

I had an otherwise nice Compaq laptop that had insufficient fan/cooling. On my lap while sitting on the patio one summer I felt it overheating ... on my lap. Indoors, on an angle with an auxiliary cooling pad it managed OK, but in warm weather without the cooling pad it managed to fry a couple of battery packs. I had set the thermal shut-down to 90 degrees. Perhaps it should have been less, but then, perhaps, I'd never get any work done. Even so, after one shut-down the screen had a !FAIL!, so something got stressed beyond tolerance.

Obviously not MIL-SPEC rated components.
On 08/11/2015 07:47 AM, Anton Aylward wrote:
What? No thermal shut-down?
I had an otherwise nice Compaq laptop that had insufficient fan/cooling. On my lap while sitting on the patio one summer if felt it overheating ... on my lap. Indoors, on a angle with an auxiliary cooling pad it managed OK, but in warm weather without the cooling pad it managed to fry a couple of battery packs. I had set the thermal shut-down to 90 degrees. Perhaps it should have been less, but then, perhaps, I'd never get any work done. Even so, after one shut-down the screen had a !FAIL!, so something got stressed beyond tollerance.
Obviously not MIL-SPEC rated components.
Grinning... I did have thermal shutdown enabled, but that was just to protect the processor. I don't recall if there was an ambient shutdown sensor on the board itself. I was a little bit torqued because I usually leave the door cracked a bit when it is really hot, but one of the office tenants closed it after I left Friday.

Been in the same building since 2003, so I suspect that it was a combination of poor solder/bad caps/bad controller and time. Of course..., the Abit KT7 w/Athlon Tbird sitting next to it just keeps humming along happily as it has since, what, 2001?

-- David C. Rankin, J.D., P.E.
David C. Rankin composed on 2015-08-11 15:13 (UTC-0500):
Been in the same building since 2003, so I suspect that it was a combination of poor solder/bad caps/bad controller and time.
'03 was the transition from pre-RoHS into RoHS, which of course was compounded by the caps plague, but since yours seems to have been built entirely with polys, all that's left with substantial likelihood is solder joints. 12 years non-stop is very respectable from out of that period.

-- Felix Miata
On 11/08/2015 01:06, David C. Rankin wrote:
(Initializing Hard Drive Controller). With no disks connected, it flows though the boot sequence without a hitch ending all green.
Don't you have some other disk to test with - new (or different) hardware? I mean a disk never seen by this system?

jdd
John Andersen composed on 2015-08-10 14:59 (UTC-0700):
Carlos E. R. wrote:
On 2015-08-10 21:35, John Andersen wrote:
I see no reason to worry about a 7 year old power supply. As long it
was not affected by the bad caps era, these things can last forever, with maybe a fan replacement.
Plastics and several insulators degrade.
But the disk swaps David did.....???
The way I remember it, drive motors run on 12V, controllers on 5V. If all he did was plug them in and confirm they spun up, I have my doubts he proved anything by inserting "spare drives into the empty bays". PS failure tops the list of PC failure causes, yet nothing I've seen him report yet serves well to prove PS is not his problem. If he hasn't another PS to try, he should at least try booting from some other HD connected to the mobo controller.

Are any of his SATA cables red? Old red cables apparently have a higher incidence of failure than others[1], and regardless of color, they are a known failure vector. Newer ones usually have snaps to help ensure they stay well connected.

David C. Rankin composed on 2015-08-10 17:23 (UTC-0500):
Testing the primary onboard controller.
2 disks connected - no POST disk 1 only - no POST disk 2 only - no POST
both disks spin up and are running.
no disks - POSTS fine, boots from CD and runs fine.
So to me, it looks like the only culprits common to all disks is either:
a) power to one of the 24-pins in the ATX connector b) disk controller (or cap on board related to it) c) power to the disks themselves.
Since the all spin up when plugged into the SATA power connector and sound normal, (c) looks like a long shot. (a) or (b) look the most probable.
I think I have a new PS in the spare parts bin. Failing that, it looks like a new system is in order along with the stand-alone SATA controller to fix the old one as time permits.
Did I miss it, or have you only ever used the same original SATA cables? If so, there should be a d), although I still suspect power trouble, as more than one SATA problem manifesting at the same time seems too big a stretch.

[1] https://lists.ubuntu.com/archives/kubuntu-users/2011-January/053282.html

-- Felix Miata
On 2015-08-10 23:59, John Andersen wrote:
On August 10, 2015 2:41:43 PM PDT, "Carlos E. R." wrote:
On 2015-08-10 21:35, John Andersen wrote:
I see no reason to worry about a 7 year old power supply. As long it was not affected by the bad caps era, these things can last forever, with maybe a fan replacement.
Plastics and several insulators degrade.
But the disk swaps David did.....???
Well, I don't mean in this particular case, the cause is yet unknown. I mean in general, in electronics. I have some valve radios still working (made around 1950, perhaps earlier), or rather, they worked last time I tried, ten years ago. I reviewed them when I was a student, replacing several of the capacitors and some browned resistors. Apparently electronics last for ever, but it is not so. Metal lasts, if it doesn't rust, but plastic insulators crack. Paper also degrades (there were paper capacitors, and paper layers in transformers). And paper is one of the most durable materials. It can be eaten, too! :-)

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 08/10/2015 08:51 PM, Carlos E. R. wrote:
I reviewed them when I was a student, replacing several of the capacitors and some browned resistors. Apparently electronics last for ever, but it is not so.
One problem vacuum tube equipment had was heat, which caused a lot of components to fail prematurely. We had a tube TV when I was a kid and the top of it was *HOT*! Properly rated devices don't normally fail on their own. It generally takes something else, such as a power surge, to degrade them.
On 2015-08-11 03:08, James Knott wrote:
On 08/10/2015 08:51 PM, Carlos E. R. wrote:
I reviewed them when I was a student, replacing several of the capacitors and some browned resistors. Apparently electronics last for ever, but it is not so.
One problem vacuum tube equipment had was heat, which caused a lot of components to fail prematurely.
Yes. For that reason they were typically built in a box with a metal grid, horizontal, some centimetres from the bottom. The valves were above, and the passive electronics and wiring below, where it was cooler. And the components were built for the heat: those things could easily survive half a century.

Do you know that valve electronics work better than solid state under radiation? Like in space. Or in a nuclear plant.
We had a tube TV when I was a kid and the top of it was *HOT*! Properly rated devices don't normally fail on their own. It generally takes something else, such as a power surge to degrade them.
Oh, they do now. Things are not built to last. Planned obsolescence.

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 08/10/2015 09:08 PM, James Knott wrote:
On 08/10/2015 08:51 PM, Carlos E. R. wrote:
I reviewed them when I was a student, replacing several of the capacitors and some browned resistors. Apparently electronics last for ever, but it is not so.
One problem vacuum tube equipment had was heat, which caused a lot of components to fail prematurely. We had a tube TV when I was a kid and the top of it was *HOT*! Properly rated devices don't normally fail on their own. It generally takes something else, such as a power surge to degrade them.
Well, there is a question of what is "properly rated?" Mil-spec usually requires a large derating (to 50% or less) for resistors and capacitors, such that a bypass cap on a 5VDC line would have to be rated at a minimum of 10V. A resistor dissipating ½ Watt would need to be rated at 1W. It is unlikely that much commercial gear, particularly of Asian origin, holds to this specification.

In addition, it is widely known by this time that a lot of Asian electrolytic caps fail; whether this is due to lack of derating, heat, aging, or something else, it is a logical place to look first when failure occurs.

Finally, _everything_ has a "mean time before failure" -- MTBF -- no matter what derating was imposed on the design. That is to say, it is _expected_ that at some point the device _will_ fail.

--doug, retired electronics engineer
On 08/10/2015 05:41 PM, Carlos E. R. wrote:
Plastics and several insulators degrade.
The long-term !FAIL! was solder joints. That was one reason for a) component density, ultimately to 'chips', and b) flow-soldered PCBs, to get better 1) solder mix and 2) consistency. "Yes it used to be but we changed all that." Once that was addressed, what was next on the list?

A couple of years ago I came across a speculative paper from the ... was it the Bell System Journal or some IBM paper ... anyway ... It talked of future computers being 'hot fuzzy golf-balls': golf-ball sized 'cos of the level of integration, fuzzy 'cos of the wires to do the IO, hot 'cos of the power density. Nobody, it seems, thought that even though UNIX came out of Bell, it would be powering the "not a hot fuzzy golf-ball" that cell phones are. Which are mostly plastic.

Many of our older beliefs about material engineering get overthrown. I've had mobos fail, but never from blown capacitors. I've had PSUs fail, but it turned out to be fuses in an inaccessible place. I've had memory fail, but !not! through handling without a static guard.

Sometimes I wish David Cheriton's V system had been developed so we could have RAICs - Redundant Arrays of CPUs - with the OS really distributed across the network, rather than the stamp-and-repeat method we have now. We don't tolerate that kind of thing in programming any more; why should we tolerate it in hardware?

Ah, dream on!
On 08/10/2015 06:27 AM, David C. Rankin wrote:
$ uptime Bus error (core dumped)
This is a software error: https://en.wikipedia.org/wiki/Bus_error Your "uptime" program is broken. Check the core-file with gdb as shown in the wikipedia article.
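For reference, the usual way to inspect such a core file, assuming core dumps are enabled (ulimit -c unlimited) and the core actually landed somewhere writable - on a machine whose disk controller has just vanished, it often cannot:

# open the core alongside the binary that produced it
gdb /usr/bin/uptime core

# inside gdb: where did it fault?
(gdb) bt
(gdb) info registers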
On 08/11/2015 04:01 AM, Florian Gleixner wrote:
On 08/10/2015 06:27 AM, David C. Rankin wrote:
$ uptime Bus error (core dumped)
This is a software error:
https://en.wikipedia.org/wiki/Bus_error
Your "uptime" program is broken. Check the core-file with gdb as shown in the wikipedia article.
Thanks Florian,

Yes... and No... The Bus Error (core dumped) is a software error, granted. However, it was the result of the disk controller failure, which, I'm fairly certain, resulted from the processor attempting to retrieve info that had paged to swap. When the OS core dumped, various parts of the OS were left running (such as the login session providing the prompt). When I went to my open ssh terminal to the server and typed uptime, it attempted to load /usr/bin/uptime, hit the core dump condition, and responded with the software error.

No question at all by now -- it was a hardware failure that was the root cause. (failure analysis conducted :)

-- David C. Rankin, J.D., P.E.
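One hedged way to back that failure analysis up from the logs, once the old root filesystem can be mounted read-only somewhere (paths below are examples; a classic /var/log/messages is assumed, and a journal only if one was configured as persistent):

# classic syslog from the dead install, mounted at /mnt
grep -iE 'ata[0-9].+(error|failed)|i/o error' /mnt/var/log/messages | tail -n 40

# or, if a persistent systemd journal exists on the old install
journalctl -D /mnt/var/log/journal -k | grep -iE 'ata|i/o error' | tail -n 40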
participants (10):
- Anton Aylward
- Carlos E. R.
- David C. Rankin
- doug
- Felix Miata
- Florian Gleixner
- James Knott
- jdd
- John Andersen
- Stevens