Message-ID: <1100D69203AAD2118E3C00508B8B9E8A17093A@mailhost.intech.unu.edu>
From: "Heupink, Mourik Jan C."
Date: Mon, 13 Nov 2000 15:29:41 +0100
Subject: FW: SuSE 6.2, kernel 2.2.17, RAID 1, random crashes suddenly start happening
Hello all.
We are running a SuSE system on a Compaq Prosignia server. The /home partition
is RAID 1, the rest is ordinary ext2. It has been running for a while (months)
without any problems.
BUT: In the last two weeks it has suddenly stopped responding almost
completely, twice. I am able to change virtual terminals (ctrl-Fx) and
type a username, but after I press enter, nothing happens anymore.
Ctrl-alt-del doesn't work, I cannot telnet or ping, and http is down.
(But since I CAN type a username and change terminals, it's not completely
dead, I think?)
Anyway: some pieces from various logfiles:
messages:
Nov 12 00:28:59 intech003 -- MARK --
Nov 12 01:41:26 intech003 -- MARK --
Nov 12 02:31:38 intech003 -- MARK --
Nov 12 03:11:44 intech003 -- MARK --
Nov 12 03:53:14 intech003 -- MARK --
Nov 12 04:33:14 intech003 -- MARK --
Nov 12 05:12:11 intech003 -- MARK --
Nov 12 05:51:47 intech003 -- MARK --
Nov 12 06:29:44 intech003 -- MARK --
Nov 12 07:09:21 intech003 -- MARK --
Nov 12 07:47:51 intech003 -- MARK --
Nov 12 08:25:52 intech003 -- MARK --
Nov 12 09:04:56 intech003 -- MARK --
Nov 12 09:44:08 intech003 -- MARK --
Nov 12 10:24:09 intech003 -- MARK --
Nov 12 11:03:55 intech003 -- MARK --
Nov 12 11:44:49 intech003 -- MARK --
Nov 13 09:51:28 intech003 syslogd 1.3-3: restart.
Nov 13 09:51:29 intech003 kernel: klogd 1.3-3, log source = /proc/kmsg started.
Nov 13 09:51:29 intech003 kernel: Inspecting /boot/System.map
Nov 13 09:51:30 intech003 kernel: Symbol table has incorrect version number.
Nov 13 09:51:30 intech003 kernel: Cannot find map file.
warn:
Nov 10 14:55:25 intech003 login[14557]: pam_unix session finished for user xxxxx, service login
Nov 10 17:51:51 intech003 login: pam_unix session started for user root, service login
Nov 10 17:54:42 intech003 login[11696]: pam_unix session finished for user root, service login
Nov 10 17:54:53 intech003 login[206]: pam_unix session finished for user root, service login
Note the gap in messages between 11:44 on Nov 12 and 09:51 on Nov 13.
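For anyone who wants to look for such silent periods automatically, here is a minimal Python sketch that scans syslog-style lines and reports timestamps that are unusually far apart. The function name find_gaps, the two-hour threshold and the year 2000 are assumptions made for illustration (syslog lines carry no year; 2000 is taken from the Date header of this mail). It also assumes English month abbreviations, since the %b directive depends on the locale.

```python
from datetime import datetime, timedelta

def find_gaps(lines, year=2000, max_gap=timedelta(hours=2)):
    """Return (previous, current) timestamp pairs further apart than max_gap."""
    stamps = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # too short to start with "Mon DD HH:MM:SS"
        try:
            # Syslog timestamps have no year, so one is supplied (assumption).
            stamp = datetime.strptime(
                f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # wrapped continuation line or other non-timestamp text
        stamps.append(stamp)
    # Compare each timestamp with the one that follows it.
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > max_gap]

# Three lines taken from the messages excerpt above.
sample = [
    "Nov 12 11:03:55 intech003 -- MARK --",
    "Nov 12 11:44:49 intech003 -- MARK --",
    "Nov 13 09:51:28 intech003 syslogd 1.3-3: restart.",
]
gaps = find_gaps(sample)
for before, after in gaps:
    print(f"no log entries between {before} and {after} ({after - before})")
```

The threshold only needs to be larger than the normal spacing of the MARK lines; anything much longer than that suggests syslogd was not able to write during that period.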
also in warn:
Nov 3 22:21:56 intech003 /usr/sbin/gpm[152]: Error in protocol
(but i'm sure that has nothing to do with it?) (since gpm was running
without a mouse connected... sorry!) (changed it already)
Could it have to do with the raid setup? I've made a custom kernel, but
didn't change it during the last months.
Versions:
mdutils 0.41, release 68, build date: fri aug 6, 23:48:57
Furthermore, I did apply the security updates from SuSE. (aaa_base, apache,
cron, gpm, libc, mod_perl, perl and shlibs) (from
ftp://ftp.suse.com/pub/suse/i386/update/6.2/)
(maybe that broke things? applied them using yast, and got no error
messages)
Anyway: I hope some of you can shed some light on this issue. (Because
at the moment our NT servers are running more stably than the Linux server.....!)
Thanks very much in advance,
Mourik Jan
UNU/INTECH