RE: [SLE] HP Proliant stop responding
When the machine stops running:
- it still has power
- the power LED is still on
- the CD tray opens
- the link LEDs on both the NIC and the switch are on
- there is no network traffic
- the machine does not respond on the network
- the screen is normally in power save mode and therefore blank
- the machine does not respond to keyboard key presses
- there are no lights flashing on the keyboard
Has anyone heard of this before? Or does anybody have an idea of what it could be?
I have the machine at my disposal for a couple of hours. During this time I need to remove all the data from it so that we can send the machine to HP.
That sounds an awful lot like a kernel panic. One additional test that you might want to do: see if the Caps Lock and Num Lock keys still toggle their keyboard LEDs. The CD tray eject button and the link light on the NIC don't require a functional OS to work; the Caps Lock and Num Lock keys do. In the case of a kernel panic with a blanked screen, the Caps Lock and Num Lock keys won't cause their respective LEDs to light.

I've seen this happen, with exactly these symptoms, for two reasons: heat and bad RAM.

Memtest86 (which I think you can load from the grub boot menu by selecting the "test memory" option, or whatever it's called) should be able to diagnose the bad RAM. It'll take a looooonnnnngggggg time to run, so be prepared: 4 or 6 or 8 hours wouldn't be a surprise.

Assuming that both of the single processor machines are in the same room, heat seems unlikely, although it can vary from location to location. Also, check that the case fans aren't clogged with dirt/dust/other grime, and that they're all working.

(I've had this happen to a total of 5 machines in a server farm; in 2 cases the culprit was bad RAM, in 3 it was heat. In each of the cases where the problem was heat, replacing a couple of case fans that were no longer functional solved the issue.)
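If you want some evidence for or against the heat theory before the next hang, one option is to log temperatures to disk at regular intervals, so the last reading before the freeze survives. The following is only a rough sketch in Python, and it rests on assumptions not confirmed anywhere in this thread: that the kernel on this box exposes thermal zones under /sys/class/thermal (older kernels and some server boards don't), that the readings are in millidegrees Celsius as sysfs normally reports them, and that 70 C is a sensible alarm threshold (it is just a placeholder). The function names and the log path are made up for illustration.

```python
import glob
import os
import time


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for every readable thermal zone."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                # sysfs normally reports millidegrees Celsius as an integer
                temps[zone] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # zone unreadable or not a number; skip it
    return temps


def hot_zones(temps, limit_c=70.0):
    """Return the zones at or above limit_c (placeholder threshold)."""
    return {zone: t for zone, t in temps.items() if t >= limit_c}


def log_once(logfile, base="/sys/class/thermal"):
    """Append one timestamped reading to logfile and force it to disk."""
    temps = read_zone_temps(base)
    readings = " ".join(f"{zone}={t:.1f}C" for zone, t in temps.items())
    line = time.strftime("%Y-%m-%d %H:%M:%S") + " " + readings + "\n"
    with open(logfile, "a") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())  # so the last line survives a hard hang
    return temps
```

Calling log_once("/var/tmp/temps.log") from cron once a minute (the path is hypothetical) would leave a trail to look at after the next lockup. If the last few readings climb steadily before the log stops, heat looks more likely; if they stay flat, the RAM test is probably the better use of time.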
participants (1): Marlier, Ian