Server hangs when link is down
I am running a server with SuSE 6.2, kernel 2.2.10. The server is connected through a leased 64kbps data circuit. The circuit was down yesterday due to problems with the telephone company. When the link to the WAN went down, the server virtually hung. It allowed people who were already logged in to keep working, but many commands, such as ps, didn't work. I could not su to see what the problem was. Why does this happen?

I suspect the problem could be due to DNS lookups. But the server in question runs a local DNS, so at least local logins should be allowed. It was not allowing logins even from the console.

Any clues would be welcome! Thanks in advance!

Nagarjuna
Nagarjuna,
> I suspect the problem could be due to DNS lookups. But the server in question runs a local DNS, so at least local logins should be allowed. It was not allowing logins even from the console.
Was it still possible to switch the virtual consoles? Could you see what you typed on the console? Was it possible to connect to the network servers, but without getting any response? That would look like this:

    $ telnet gangmail 22
    Trying 192.168.0.1...
    Connected to gangmail.
    Escape character is '^]'.

...but then nothing more shows up?
> Any clues would be welcome!
We need to know if the kernel really got stuck or not. Does the machine send syslog datagrams to somewhere else over the network? If we find out what happened, it could very well turn out to be a security problem, too.
> Thanks in advance!
> Nagarjuna
A kernel upgrade isn't a mistake in the first place, but is it a modem or a leased line?

Best regards,
Roman.
-- 
Roman Drahtmüller
CC University of Freiburg
email: draht@uni-freiburg.de
"The best way to pay for a lovely moment is to enjoy it." - Richard Bach
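The "connects, but then nothing shows up" check Roman describes can also be scripted, which helps when several services need to be tested while the link is down. The following is only a minimal sketch: the function name probe_service and the status strings are made up for illustration, and it assumes the service normally sends a greeting banner (as SSH or SMTP do) right after the connection is accepted.

```python
import socket

def probe_service(host, port, timeout=5.0):
    """Connect to host:port and wait up to `timeout` seconds for a banner.

    Returns a (status, detail) tuple. The status strings are illustrative:
      "connect-failed"   - the TCP connection could not be established
      "connected-silent" - connected, but nothing arrived before the timeout
      "closed"           - the peer closed the connection without sending data
      "banner"           - the service sent a greeting (returned in detail)
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:  # refused, unreachable, or timed out
        return ("connect-failed", str(exc))
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(256)
        except socket.timeout:
            # The symptom from the telnet example: connected, then silence.
            return ("connected-silent", "")
        except OSError as exc:
            return ("read-failed", str(exc))
    if not data:
        return ("closed", "")
    return ("banner", data.decode("latin-1").strip())
```

Calling probe_service("gangmail", 22) would mirror the telnet session above. A "connected-silent" result suggests the kernel still accepts connections while the daemon behind the port is blocked on something else.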
On Tue, 6 Jun 2000, Nagarjuna G. wrote:
> I am running a server with SuSE 6.2, kernel 2.2.10. The server is connected through a leased 64kbps data circuit. The circuit was down yesterday due to problems with the telephone company. But when the link to wan went down the server is virtually hung. It allowed people who are already logged in to work, but many commands didn't work, such as ps. I could not su to see what the problem is. Why does this happen?
>
> I suspect the problem could be due to DNS lookups. But the server in question runs a local DNS, so at least local logins should be allowed. It was not allowing logins even from the console.
So, to summarize: people already logged in were able to continue their work, except for some system commands. No new logins (even local ones) were possible.

I have observed these symptoms on our local network a few times, with a simple explanation: we use a lot of NFS mounts. When the link is down, these mounts are still active, and some of the mounted directories are included in the PATH because we import binaries from a server. The broken link doesn't affect users working locally as long as they don't access mounted directories (system commands such as df would do that). New logins are not possible because the shell searches the PATH to build a hash table of executables. The shell blocks until it gets a timeout (even the su shell, in case you have a mounted directory in root's PATH, which is certainly a security issue ;-)).

Maybe you will find a similar explanation for your problem, depending on the services you use permanently through your "leased 64kbps data circuit" connection.

Cheers,
Thomas

Thomas Forbriger
email: Thomas.Forbriger@geophys.uni-stuttgart.de
Universitaet Stuttgart - Institut fuer Geophysik
Richard-Wagner-Str. 44, D-70184 Stuttgart, Germany
Tel ++49 (711) 121-3593 or 3422 or 3424 or 3590
Fax ++49 (711) 2361218
http://www.geophys.uni-stuttgart.de/thof
"... there's nothing more bizarre than reality..." (M. Kindermann)
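To check whether Thomas's NFS explanation could apply to a given machine, one can compare the PATH against the mount table. The sketch below is an assumption-laden illustration, not something from the thread: the function names are hypothetical, it expects text in /proc/mounts format, and it treats only the filesystem types "nfs" and "nfs4" as NFS. It compares strings only and never touches the directories, so it cannot itself hang on a dead server.

```python
import os

# Filesystem types treated as NFS here (an assumption; adjust as needed).
NFS_TYPES = {"nfs", "nfs4"}

def nfs_mount_points(mounts_text):
    """Return the mount points of NFS filesystems.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in NFS_TYPES:
            points.append(fields[1])
    return points

def path_entries_on_nfs(path_value, mounts_text):
    """Return the PATH entries located at or below an NFS mount point."""
    mounts = [os.path.normpath(mp) for mp in nfs_mount_points(mounts_text)]
    hits = []
    for entry in path_value.split(":"):
        if not entry:
            continue  # skip empty PATH components
        norm = os.path.normpath(entry)
        for mp in mounts:
            # Match the mount point itself or anything beneath it.
            if norm == mp or norm.startswith(mp.rstrip("/") + "/"):
                hits.append(entry)
                break
    return hits
```

On a live system, pass the contents of /proc/mounts and the value of PATH (for root as well as for ordinary users). Any entry reported is a directory that a login shell would have to wait for if its NFS server became unreachable.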
On Wed, 7 Jun 2000, Thomas Forbriger wrote:

->On Tue, 6 Jun 2000, Nagarjuna G. wrote:
->
->> I am running a server with SuSE 6.2, kernel 2.2.10. The server is connected
->> through a leased 64kbps data circuit. The circuit was down yesterday due to
->> problems with the telephone company. But when the link to wan went down
->> the server is virtually hung. It allowed people who are already logged in to
->> work, but many commands didn't work, such as ps. I could not su to see what
->> the problem is. Why does this happen?
->
->So you tell: People already logged in were able to continue their work except
->from some system commands. No new logins (even local ones) were possible. I
->observed these symptoms on our local network a few times - with a simple
->explanation: We use a lot of NFS mounts. When the link is down, these mounts
->are still active and some of the directories are included in the PATH as we
->are importing binaries from a server. The broken link doesn't affect users
->working locally as long as they are not accessing mounted directories (system
->commands as df would do that). New logins are not possible as the shell
->searches the PATH to build a hashtable of executables. It is blocked (even the
->su shell in case you have mounted directory in the root PATH - which is
->certainly a security issue ;-)) until it gets a timeout.

We are using NFS, but not through the link which is down. My problem must be somewhere else. Still no clue!

Nagarjuna
Hi!

Have you checked whether it is a problem with name services? We had a similar problem that was caused by an unreachable nameserver. The machines on the intranet were WinNT boxes with an Internet connection through a Linux gateway over a static phone line. When the phone line was down, no nameserver could be reached, and so the whole net hung.

Hope that helps!

Greetings,
Jürgen

Jürgen Ellinger
Siemensstraße 44
88250 Weingarten
e-mail: ellinger@informatik.uni-tuebingen.de
        ellinger@student.uni-tuebingen.de
        ellinger@spohn.rv.bw.schule.de
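One way to test the name-service theory Jürgen raises is to time a lookup while the link is down. The sketch below is illustrative only: timed_lookup is a hypothetical helper, and the resolver parameter exists so a different lookup function can be substituted. It runs the standard socket.gethostbyname call in a daemon thread so that a blocked resolver cannot hang the script itself.

```python
import socket
import threading
import time

def timed_lookup(hostname, timeout=5.0, resolver=socket.gethostbyname):
    """Resolve hostname in a daemon thread, waiting at most `timeout` seconds.

    Returns (status, value, elapsed_seconds), where status is one of
    "ok" (value is the address), "error" (value is the exception), or
    "timeout" (value is None and the lookup is still blocked).
    """
    result = {}

    def worker():
        try:
            result["value"] = resolver(hostname)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            result["error"] = exc

    start = time.monotonic()
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)  # give up waiting after `timeout` seconds
    elapsed = time.monotonic() - start

    if thread.is_alive():
        return ("timeout", None, elapsed)
    if "error" in result:
        return ("error", result["error"], elapsed)
    return ("ok", result["value"], elapsed)
```

If a lookup of the server's own name or of a client's address comes back as "timeout", or takes many seconds, while the WAN link is down, then login, su and similar programs that resolve names are likely to stall in the same way, even with a local DNS server that forwards queries upstream.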
participants (4)
- Juergen Ellinger
- Nagarjuna G.
- Roman Drahtmueller
- Thomas Forbriger