[opensuse-support] Network watchdog
Hi List,
Last November, my remote home server went offline. When I finally could access it physically, the machine was all fine, just the connection to the outside world had been dead. A simple 'systemctl restart network' did bring it back (it's a wicked-managed dhcp connection).
Now I wonder if there is an existing daemon that regularly checks if the outside connection is properly working and, if not, restarts the network. I'd have expected systemctl to do something like that. Or is it so simple that I should just write a cron script that does this?
The system is Leap 42.3.
--
To unsubscribe, e-mail: opensuse-support+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-support+owner@opensuse.org
On 08/01/2019 at 11:36, Peter Suetterlin wrote:
> that I should just write a cron script that does this?
simple ping to your host provider :-)
jdd
--
http://dodin.org
jdd@dodin.org wrote:
> On 08/01/2019 at 11:36, Peter Suetterlin wrote:
>> that I should just write a cron script that does this?
> simple ping to your host provider :-)
Sure, that's not the (real) point - rather that I think a server OS should offer such functionality by default somewhere. So before handcrafting (which I do a lot...) I thought I'd ask for the official/proper solution ;^>
On 08/01/2019 11.36, Peter Suetterlin wrote:
> Hi List,
> Last November, my remote home server went offline. When I finally could access it physically, the machine was all fine, just the connection to the outside world had been dead. A simple 'systemctl restart network' did bring it back (it's a wicked-managed dhcp connection).
> Now I wonder if there is an existing daemon that regularly checks if the outside connection is properly working and, if not, restarts the network. I'd have expected systemctl to do something like that. Or is it so simple that I should just write a cron script that does this?
Wow.
I wrote my own daemon that pings the router and google periodically. My problem is not the network dying on the computer, but the home router locking up, though, so my daemon would not work for you.
My home server has not acted up in the way you describe, using Leap 15.0. However, my small laptop sometimes does not connect to the network on restore from hibernate and I have to manually restart the network.
I think you should change your setup to fixed address, not dhcp.
--
Cheers / Saludos,
Carlos E. R. (from 15.0 x86_64 at Telcontar)
Carlos E. R. wrote:
> Wow.
> I wrote my own daemon that pings the router and google periodically. My problem is not the network dying on the computer, but the home router locking up, though, so my daemon would not work for you.
Hehe, had that with the Movistar ADSL modem. I finally connected it to a wallplug timer that shut power down for one minute every day, to restart the modem :D
> My home server has not acted up in the way you describe, using Leap 15.0. However, my small laptop sometimes does not connect to the network on restore from hibernate and I have to manually restart the network.
I'm not sure what actually happened, there was absolutely nothing in the logs. Just no more firewall logs of dropped packets, and I couldn't get in from outside. It had been running without issues for 200 days or so.
> I think you should change your setup to fixed address, not dhcp.
That's not possible. This is in Sweden, the internet is included with the apartment, and only handed out via dhcp. Nothing I have influence on.
On 08/01/2019 12.00, Peter Suetterlin wrote:
> Carlos E. R. wrote:
>> Wow.
>> I wrote my own daemon that pings the router and google periodically. My problem is not the network dying on the computer, but the home router locking up, though, so my daemon would not work for you.
> Hehe, had that with the Movistar ADSL modem. I finally connected it to a wallplug timer that shut power down for one minute every day, to restart the modem :D
I considered that :-D
It is the Movistar /fibre/ router. Actual fibre is not connected to it, but to a device called ONT that goes before it converting to ethernet.
But it would disrupt the server workflow, so instead I finally found a power strip controllable via ethernet (LAN), and I wrote code using that to power cycle the router and other things attached to it. Fortunately the switch side of the router stays alive long enough for my program to work. It may act a few times a day, or not at all in a month.
Someone actually sells watchdogs that do that on their own, quite expensive; the problem is not rare.
>> My home server has not acted up in the way you describe, using Leap 15.0. However, my small laptop sometimes does not connect to the network on restore from hibernate and I have to manually restart the network.
> I'm not sure what actually happened, there was absolutely nothing in the logs. Just no more firewall logs of dropped packets, and I couldn't get in from outside. It had been running without issues for 200 days or so.
:-(
>> I think you should change your setup to fixed address, not dhcp.
> That's not possible. This is in Sweden, the internet is included with the apartment, and only handed out via dhcp. Nothing I have influence on.
Ah. I thought you were talking of the internal IP on your side of your router - I assume you have one?
Then yes, you need a daemon/cron/whatever.
If you don't find anything, you could adapt my program, but it is in Pascal: instead of calling the external program to cycle power, call network restart. The modification is almost trivial.
--
Cheers / Saludos,
Carlos E. R. (from 15.0 x86_64 at Telcontar)
Carlos E. R. wrote:
> I considered that :-D
> It is the Movistar /fibre/ router. Actual fibre is not connected to it, but to a device called ONT that goes before it converting to ethernet.
Ah, can only dream of fiber. This is La Palma :o It will come. Mañana.
> Someone actually sells watchdogs that do that on their own, quite expensive; the problem is not rare.
I had looked at a Pi based solution last year, for our observatory server, but never finalized it....
> Then yes, you need a daemon/cron/whatever.
> If you don't find anything, you could adapt my program, but it is in Pascal: instead of calling the external program to cycle power, call network restart. The modification is almost trivial.
So ATM I'm running
    #!/bin/sh
    ping -c 1 -w 1 -q 8.8.8.8 > /dev/null 2>&1
    if [ $? -ne 0 ]; then
        systemctl restart network
    fi
in cron.daily. I'll see what happens.
On 08/01/2019 12.44, Peter Suetterlin wrote:
> Carlos E. R. wrote:
>> Then yes, you need a daemon/cron/whatever.
>> If you don't find anything, you could adapt my program, but it is in Pascal: instead of calling the external program to cycle power, call network restart. The modification is almost trivial.
> So ATM I'm running
>     #!/bin/sh
>     ping -c 1 -w 1 -q 8.8.8.8 > /dev/null 2>&1
>     if [ $? -ne 0 ]; then
>         systemctl restart network
>     fi
> in cron.daily. I'll see what happens.
That may work. :-)
Caveats: a single ping failure will restart the network. This may disrupt existing connections, possibly change the IP. There are other ping programs (I forget which exactly) that will try several pings - see below.
If there is a local network, I would ping that first.
You also need to consider what happens if cron triggers another job while this one is running if the frequency is high enough.
ping variants from my notes:
    ping
    fwping (saint) ??
    ping -f -l10 -s20000 router
    bing  - compute point to point throughput using two sizes of ICMP ECHO_REQUEST packets to a pair of remote hosts
    fping - A program to ping multiple hosts
    hping - Command-line oriented TCP/IP packet assembler/analyzer
    nping - Network packet generation and response analysis tool (part of Nmap)
    oping - Multiple Host Ping that supports ICMPv4 and ICMPv6
I think there are more.
--
Cheers / Saludos,
Carlos E. R. (from 15.0 x86_64 at Telcontar)
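The caveats above (one lost ping triggering a restart, checking the local network first, overlapping cron runs) could be handled along the following lines. This is only a sketch under stated assumptions: `GATEWAY`, `TRIES`, `LOCK` and the function names are illustrative and do not come from the thread, and treating "gateway answers, outside host does not" as an upstream fault that is not worth a restart is a design choice that may not suit every setup. It also uses the portable `>/dev/null 2>&1`, since `&>` is only understood by bash.

```sh
#!/bin/sh
# Watchdog sketch: retry pings, look at the LAN first, never run twice at once.
# GATEWAY, TARGET, TRIES and LOCK are illustrative defaults, not from the thread.
GATEWAY=${GATEWAY:-192.168.1.1}        # assumed address of the local router
TARGET=${TARGET:-8.8.8.8}              # outside host used in the original script
TRIES=${TRIES:-3}                      # pings to attempt before giving up
LOCK=${LOCK:-/run/net-watchdog.lock}   # hypothetical lock file path

# Succeed as soon as one of $TRIES single pings to host $1 is answered.
reachable() {
    n=0
    while [ "$n" -lt "$TRIES" ]; do
        if ping -c 1 -w 2 -q "$1" >/dev/null 2>&1; then
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}

# Print "ok", "local" (gateway silent) or "upstream" (gateway fine, target silent).
status() {
    if ! reachable "$GATEWAY"; then
        echo local
    elif ! reachable "$TARGET"; then
        echo upstream
    else
        echo ok
    fi
}

main() {
    exec 9>"$LOCK" || exit 1
    flock -n 9 || exit 0               # a previous run still holds the lock
    case "$(status)" in
        local)    systemctl restart network ;;
        upstream) logger -t net-watchdog "gateway up, $TARGET silent: not restarting" ;;
    esac
}

# Only act when called as "<script> run", e.g. from a crontab entry.
if [ "${1:-}" = run ]; then
    main
fi
```

Called without the `run` argument the script does nothing, which makes it safe to source and try `status` by hand before handing it to cron.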
Adam Mizerski composed on 2019-01-08 22:25 (UTC+0100):
> Peter Suetterlin composed:
>> ping -c 1 -w 1 -q 8.8.8.8 > /dev/null 2>&1
> Instead of pinging google, you can use conncheck.opensuse.org. It's used by NetworkManager in openSUSE (see NetworkManager-branding-openSUSE).
That confirms the routine annoying latency between here & opensuse.org:
    --- google.com ping statistics ---
    7 packets transmitted, 7 received, 0% packet loss, time 6004ms
    rtt min/avg/max/mdev = 23.423/24.004/24.610/0.433 ms
    --- proxy-nue.opensuse.org ping statistics ---
    7 packets transmitted, 7 received, 0% packet loss, time 6002ms
    rtt min/avg/max/mdev = 125.227/126.477/129.225/1.322 ms
--
Evolution as taught in public schools is religion, not science.
Team OS/2 ** Reg. Linux User #211409 ** a11y rocks!
Felix Miata *** http://fm.no-ip.com/
On Tue, 8 Jan 2019 16:45:32 -0500, Felix Miata wrote:
> Adam Mizerski composed on 2019-01-08 22:25 (UTC+0100):
>> Peter Suetterlin composed:
>>> ping -c 1 -w 1 -q 8.8.8.8 > /dev/null 2>&1
>> Instead of pinging google, you can use conncheck.opensuse.org. It's used by NetworkManager in openSUSE (see NetworkManager-branding-openSUSE).
> That confirms the routine annoying latency between here & opensuse.org:
>     --- google.com ping statistics ---
>     7 packets transmitted, 7 received, 0% packet loss, time 6004ms
>     rtt min/avg/max/mdev = 23.423/24.004/24.610/0.433 ms
>     --- proxy-nue.opensuse.org ping statistics ---
>     7 packets transmitted, 7 received, 0% packet loss, time 6002ms
>     rtt min/avg/max/mdev = 125.227/126.477/129.225/1.322 ms
Yeah, not so bad here but still ...
    --- proxy-nue.opensuse.org ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 42.143/42.143/42.143/0.000 ms
    --- google.com ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 29.091/29.091/29.091/0.000 ms
On 09/01/2019 00.09, Dave Howorth wrote:
> On Tue, 8 Jan 2019 16:45:32 -0500, Felix Miata wrote:
>> Adam Mizerski composed on 2019-01-08 22:25 (UTC+0100):
>>> Peter Suetterlin composed:
...
> Yeah, not so bad here but still ...
>     --- proxy-nue.opensuse.org ping statistics ---
>     1 packets transmitted, 1 received, 0% packet loss, time 0ms
>     rtt min/avg/max/mdev = 42.143/42.143/42.143/0.000 ms
>     --- google.com ping statistics ---
>     1 packets transmitted, 1 received, 0% packet loss, time 0ms
>     rtt min/avg/max/mdev = 29.091/29.091/29.091/0.000 ms
The problem with pinging an outside place on the watchdog is that, if the outside site fails, or if your ISP or some ISP on the route fails, your server gets into a useless and fast loop.
--
Cheers / Saludos,
Carlos E. R. (from 15.0 x86_64 at Telcontar)
Carlos E. R. wrote:
> The problem with pinging an outside place on the watchdog is that, if the outside site fails, or if your ISP or some ISP on the route fails, your server gets into a useless and fast loop.
In my case the inside net was fine. It was only the outer IF that didn't receive anything anymore. So I do have to ping outside. 8.8.8.8 seems a very obvious choice for that. And as for the fast loop - that depends on the cron settings. ATM it's set to 'daily'. I might put it on 'hourly', but that is still not a really fast cadence, is it?
It's not for a HA server - it only acts as hub in my VPN and central Git repository, and nothing really breaks if it's down for some hours/days.
On 09/01/2019 12.42, Peter Suetterlin wrote:
> Carlos E. R. wrote:
>> The problem with pinging an outside place on the watchdog is that, if the outside site fails, or if your ISP or some ISP on the route fails, your server gets into a useless and fast loop.
> In my case the inside net was fine. It was only the outer IF that didn't receive anything anymore. So I do have to ping outside. 8.8.8.8 seems a very obvious choice for that. And as for the fast loop - that depends on the cron settings. ATM it's set to 'daily'. I might put it on 'hourly', but that is still not a really fast cadence, is it?
No, that's fine :-)
My loop is in minutes.
--
Cheers / Saludos,
Carlos E. R. (from 15.0 x86_64 at Telcontar)
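For a loop that runs every few minutes rather than daily, the "useless and fast loop" concern can be limited by remembering when the network was last restarted and refusing to restart again too soon. The following is a minimal sketch of that idea; the `STAMP` path, the `MIN_GAP` value and the function names are assumptions made for illustration, not anything posted in the thread.

```sh
#!/bin/sh
# Rate-limit sketch: allow at most one network restart per MIN_GAP seconds.
# STAMP and MIN_GAP are illustrative names and values, not from the thread.
STAMP=${STAMP:-/var/tmp/net-watchdog.last}   # hypothetical timestamp file
MIN_GAP=${MIN_GAP:-3600}                     # minimum seconds between restarts

# Succeed if no restart is on record or the last one is at least MIN_GAP old.
restart_allowed() {
    [ -f "$STAMP" ] || return 0
    last=$(cat "$STAMP")
    case "$last" in
        ''|*[!0-9]*) return 0 ;;             # unreadable stamp: do not block
    esac
    now=$(date +%s)
    [ $((now - last)) -ge "$MIN_GAP" ]
}

# Remember the time of the restart that is about to happen.
record_restart() {
    date +%s > "$STAMP"
}

# Only act when called as "<script> run"; the ping test is the one from the thread.
if [ "${1:-}" = run ]; then
    if ! ping -c 1 -w 1 -q 8.8.8.8 >/dev/null 2>&1 && restart_allowed; then
        record_restart
        systemctl restart network
    fi
fi
```

With these defaults an outage at the ISP or at the ping target causes at most one restart per hour, however often cron fires.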
Adam Mizerski wrote:
> On 08.01.2019 at 12:44, Peter Suetterlin wrote:
>> ping -c 1 -w 1 -q 8.8.8.8 > /dev/null 2>&1
> Instead of pinging google, you can use conncheck.opensuse.org. It's used by NetworkManager in openSUSE (see NetworkManager-branding-openSUSE).
I could. Two reasons why I don't:
- At least something that google is good for (I try to stay away from it mostly)
- Remember the old saying "If google doesn't work, the internet is broken" ;^>
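For anyone who does prefer the conncheck.opensuse.org suggestion, an HTTP request can stand in for the ping. The sketch below assumes the host answers plain HTTP requests at its root URL; the thread only names the host, so `CHECK_URL`, the 5-second timeout and the `online` function are illustrative choices rather than a documented interface.

```sh
#!/bin/sh
# HTTP-based check sketch. CHECK_URL is an assumption: the thread names the
# host conncheck.opensuse.org but not the exact URL or the expected reply.
CHECK_URL=${CHECK_URL:-http://conncheck.opensuse.org}

# Succeed if the URL is fetched without an HTTP error within 5 seconds.
online() {
    curl --fail --silent --max-time 5 --output /dev/null "$CHECK_URL"
}

# Only act when called as "<script> run", e.g. from a crontab entry.
if [ "${1:-}" = run ]; then
    online || systemctl restart network
fi
```

Unlike ICMP, this also exercises DNS resolution, so a resolver problem would count as "offline" here.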
participants (6)
- Adam Mizerski
- Carlos E. R.
- Dave Howorth
- Felix Miata
- jdd@dodin.org
- Peter Suetterlin