Hi! Trying to kill the keyboard, laszlo@idt.net produced:
On Sun, 31 May 1998, Wolfgang Weisselberg wrote:
[crash of system, regularly, even after reinstalls]
general protection error: 0000 cpu : 0 eip : 0010:[<00124c207>] [...]
Worked on the overheating scenario first, before mucking with the hardware. Removed the case; the Pentium II CPU is running coolish to the touch, and all cooling fans are running smoothly. Increased the space on the rack between towers for better ventilation.
Coolish directly after switching on or after a few hours?
I set up a cron job to send a mail at regular intervals to indicate when the system becomes non-responsive. Outside temperature was in the low 50s Fahrenheit. The air conditioner was turned off to rule out brownout conditions and power spikes. Room temperature remained cool.
Ok, it ain't the A/C. Hmmm ... How reliable is your electrical power (over here you can easily get many months of uptime without a UPS), or are you using a UPS?
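The heartbeat idea above (a cron job that reports in at regular intervals) can also be used to narrow down when the box stopped responding. A minimal sketch in Python, assuming the arrival times of the heartbeat mails have already been collected into a list; the function name `find_gaps` and the `slack` factor are made up for illustration and are not part of the original setup:

```python
from datetime import datetime, timedelta


def find_gaps(stamps, interval, slack=1.5):
    """Return (last_seen, next_seen) pairs where two consecutive
    heartbeats are further apart than slack * interval."""
    ordered = sorted(stamps)
    limit = interval * slack
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]
```

Each pair returned brackets a period with missing heartbeats: the machine was last known alive at the first time and next known alive at the second. The first value only bounds the start of the hang from below, since the system may have kept running for up to one interval after its last mail.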
In spite of all this, the system became non-responsive between 4:00 AM and 4:15 AM.
Have you any info on the times the system crashed previously? Is it connected to a LAN (where others could access it) or the Internet? What do the logs show around the crash time: anything strange, any connections (apart from what you already wrote)?
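For the question of what the logs show around the crash time, something like the following could pull out the relevant lines. It is only a sketch under a few assumptions: the log uses the classic syslog timestamp ("Jun 2 11:13:14"), the year is supplied by hand because that format does not record it, and month names are parsed in an English/C locale. The names `parse_stamp` and `lines_near` are hypothetical helpers.

```python
from datetime import datetime, timedelta


def parse_stamp(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' of a syslog line, or return None."""
    parts = line.split(None, 3)
    if len(parts) < 3:
        return None
    text = "{} {}".format(year, " ".join(parts[:3]))
    try:
        return datetime.strptime(text, "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def lines_near(lines, when, window, year):
    """Return the lines whose timestamp lies within +/- window of when."""
    hits = []
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is not None and abs(stamp - when) <= window:
            hits.append(line.rstrip("\n"))
    return hits
```

Called with the lines of /var/log/messages, a time inside the 4:00 to 4:15 window, and a window of, say, 15 minutes, this would list whatever was logged just before and after the hang. Lines without a parseable timestamp are skipped rather than guessed at.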
The system behavior was a bit different. The consoles let you switch between them (<alt><F?>) and let you work through the login sequence, reporting past failed login attempts and new mail notification. However, it seems that the shell (bash) never gets started, since there is no prompt and no keyboard/mouse response in that console. All network access (telnet, httpd, etc.) is dead; however, the server still responds to pings.
That means the kernel is somehow still working (at least partially) ... but inetd is dead.
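To tell "answers pings" apart from "services are alive" from another machine, a small TCP probe can help. This is a rough sketch, not a diagnosis tool: `probe` is a made-up helper, and the ports to check (for example 23 for telnet and 80 for httpd) are assumptions about a typical setup.

```python
import socket


def probe(host, port, timeout=3.0):
    """Try a TCP connect and classify the result."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        return "error"
```

"refused" means nothing is listening on the port any more, while "timeout" means no answer came back at all. One caveat: "open" only shows that the kernel completed the handshake. A hung daemon can still have its listening socket open, so a connect that succeeds but never produces a banner or login prompt points at userland rather than the network stack.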
/var/log/messages shows the faxclean queue being checked (there is no queue) without fail.
So that is still running? Hmmm ...
After my first console login attempt the following error messages were generated every couple of seconds:
Jun 2 11:13:14 www kernel: wait_queue is bad ( eip=0018948b ) q= 03fb8934 *q= 03f7cf68
[every 2-3 secs]
That sounds bad, it means that the scheduler is unhappy. Something strange's going on. Have a look into /usr/src/linux/kernel/sched.c to see where these messages are generated ...
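Before digging into sched.c it may be worth checking whether the repeated "wait_queue is bad" messages always report the same eip. A short sketch that counts them per address; the regular expression is modelled only on the single log line quoted above, so other kernels may format the message differently, and `count_eips` is a hypothetical name:

```python
import re
from collections import Counter

# Matches e.g. "wait_queue is bad ( eip=0018948b )" and captures the address.
PATTERN = re.compile(r"wait_queue is bad \(\s*eip=([0-9a-fA-F]+)\s*\)")


def count_eips(lines):
    """Count 'wait_queue is bad' messages per reported eip value."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts
```

If every message carries the same eip, they all come from one place in the kernel. Many different values would be a different picture and worth mentioning when reporting the problem.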
It could be a problem with hardware ... actually, I think it is. If you have not done so already, you might want to recompile your kernel, configured exclusively for your machine.
Prior to all this there was an error message
Which one? (Come on, don't let me hang on that cliff ... :-)
Are you 110% sure your memory chip(s) is/are OK (no, works under Win does
Playing with the hardware is my next step.
Ok.
-Wolfgang
--
PGP 2 welcome: Mail me, subject "send PGP-key".
If you've nothing at all to hide, you must be boring.
Unsolicited Bulk E-Mails: *You* pay for ads you never wanted.
Is our economy _so_ weak we have to tolerate SPAMMERS? I guess not.
--
To get out of this list, please send email to majordomo@suse.com with
this text in its body: unsubscribe suse-linux-e