[opensuse] unexpected server reboot
Hello,

I have a minor but strange problem. I run a personal server for my LUG, hosted behind an ADSL line; physically, it is located in a closed room. The AC outlet is backed by a small UPS (alas, with no communication channel).

This server is pretty complex for such use: a 12.1 host is used as gateway (also for the local network) and runs VirtualBox, and the real server is the guest, also 12.1.

From time to time (once a month?) I get an INN message (Boot-time Usenet warning on savage-reborn : Old .news.daily file; need to run news.daily?), which lets me know the server rebooted. I could verify using uptime that the host was rebooted. I guess the openSUSE VirtualBox management works reliably, because the guest and the host have the very same uptime. I have nothing to do to get the server up again.

In /var/log/messages, I find no clue about why the server rebooted, only a strange time jump like:

Apr 24 15:46:03 logrotate: last message repeated 2 times
Apr 24 15:20:38 mulet-reborn kernel: imklog 5.8.5, log source = /proc/kmsg started.

I guess the second line is the first log line after the reboot, and the time is synced by ntp later.

The reboot may as well be caused by somebody switching off the main AC (I do not own the whole house; there is even a switch outside, in a box by the road :-()

Any idea of how I could monitor this better?

thanks
jdd
-- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
On 2012-04-24 18:21, jdd wrote:
Apr 24 15:46:03 logrotate: last message repeated 2 times
Apr 24 15:20:38 mulet-reborn kernel: imklog 5.8.5, log source = /proc/kmsg started.
Time going back 20 minutes? You have a clock problem.

By the way, I use a trick to detect when the system boots, different in 12.1 than in previous versions. I edit root's crontab and put this in it:

SHELL=/bin/bash
MAILTO="cer"
-@reboot /bin/logger -t marker -p syslog.warn "Booting the system now ==========================" > /dev/null

(one line, mail wraps)

So I can find a marker in the logs. I use another one for hibernation.

What I cannot do, or don't know how to do, is detect whether it was a normal halt or an abnormal one. With SysV init I used "/etc/init.d/boot.local" and "/etc/init.d/halt.local"; both wrote a line to the logs. If I saw the boot line but not the halt line, it was a crash.
the reboot may as well be done by somebody switching off the main AC (I do not own the whole house - there is even a switch outside, in a box by the road :-()
You need a managed UPS, to make sure that it was a power failure.

--
Cheers / Saludos, Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
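A minimal sketch of the marker check described in this reply: it scans a log for boot markers that are not preceded by a halt marker. The boot marker text comes from the crontab entry above; the halt marker text is an assumption, since the thread does not quote what halt.local wrote, and some mechanism that still logs a line at clean shutdown is assumed to exist.

```shell
#!/bin/sh
# BOOT_MARK is the text logged by the @reboot cron entry above.
# HALT_MARK is hypothetical: the thread does not give the halt message.
BOOT_MARK="Booting the system now"
HALT_MARK="Halting the system now"

# Print each boot marker line that has no halt marker between it and the
# previous boot marker. The first boot in the file is skipped, because
# the log does not show what happened before it.
find_unclean_boots() {
    awk -v boot="$BOOT_MARK" -v halt="$HALT_MARK" '
        index($0, halt) { clean = 1; next }
        index($0, boot) {
            if (seen && !clean) print "unclean: " $0
            seen = 1
            clean = 0
        }
    ' "$1"
}
```

It could be run as `find_unclean_boots /var/log/messages`; any line it prints is a boot that followed a crash or a power cut rather than a clean halt.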
On 24/04/2012 20:11, Carlos E. R. wrote:
On 2012-04-24 18:21, jdd wrote:
Apr 24 15:46:03 logrotate: last message repeated 2 times Apr 24 15:20:38 mulet-reborn kernel: imklog 5.8.5, log source = /proc/kmsg started.
Time going back 20 minutes? You have a clock problem.
It looks like the BIOS time is not updated by the ntp system. Is there a way to verify this on the command line (after boot, date is OK)?

thanks
jdd
Hello, On Tue, 24 Apr 2012, jdd wrote:
On 24/04/2012 20:11, Carlos E. R. wrote:
Time going back 20 minutes? You have a clock problem.
looks like the bios time is not updated by the ntp system.
Is SYSTOHC="yes" set in /etc/sysconfig/clock?

-dnh
--
If human beings don't keep exercising their lips, he thought, their mouths probably seize up. After a few months' consideration and observation he abandoned this theory in favor of a new one. If they don't keep on exercising their lips, he thought, their brains start working. -- THHGTTG
On 24/04/2012 22:04, David Haller wrote:
Hello,
On Tue, 24 Apr 2012, jdd wrote:
On 24/04/2012 20:11, Carlos E. R. wrote:
Time going back 20 minutes? You have a clock problem.
looks like the bios time is not updated by the ntp system.
Is SYSTOHC="yes" set in /etc/sysconfig/clock?
-dnh
Yes, it is. But we are talking about unexpected reboots here, so if the correct time is only written at shutdown, the server will nearly never see it :-(

Of all the "time" commands, which one writes the HW clock? (I can put it in a cron job :-)

thanks
jdd
Hello, On Tue, 24 Apr 2012, jdd wrote:
in all the "time" commands, what is the one that write the HW clock? (I can put it in a cron job :-)
hwclock
    -w, --systohc    set the hardware clock from the current system time

HTH,
-dnh
--
This space intentionally left aligned.
On 2012-04-25 00:06, David Haller wrote:
Hello,
On Tue, 24 Apr 2012, jdd wrote:
in all the "time" commands, what is the one that write the HW clock? (I can put it in a cron job :-)
hwclock
-w, --systohc set the hardware clock from the current system time
Read this section of the manual first:

Automatic Hardware Clock Synchronization By the Kernel

You should be aware of another way that the Hardware Clock is kept synchronized in some systems. The Linux kernel has a mode wherein it copies the System Time to the Hardware Clock every 11 minutes. This is a good mode to use when you are using something sophisticated like ntp to keep your System Time synchronized. (ntp is a way to keep your System Time synchronized either to a time server somewhere on the network or to a radio clock hooked up to your system. See RFC 1305).

This mode (we'll call it "11 minute mode") is off until something turns it on. The ntp daemon xntpd is one thing that turns it on. You can turn it off by running anything, including hwclock --hctosys, that sets the System Time the old fashioned way.

To see if it is on or off, use the command adjtimex --print and look at the value of "status". If the "64" bit of this number (expressed in binary) is equal to 0, 11 minute mode is on. Otherwise, it is off.

If your system runs with 11 minute mode on, don't use hwclock --adjust or hwclock --hctosys. You'll just make a mess. It is acceptable to use a hwclock --hctosys at startup time to get a reasonable System Time until your system is able to set the System Time from the external source and start 11 minute mode.

So, if you are running ntp, the kernel should be in this 11 minute sync mode, and adjusting the CMOS clock is not needed. I think there is a variable in /proc that shows this state, but I don't remember which.

On the other hand, if the system whose clock is off is a guest, then use neither; sync to the host instead, using the method the virtualization software provides.

--
Cheers / Saludos, Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
On 25/04/2012 00:34, Carlos E. R. wrote:
To see if it is on or off, use the command adjtimex --print and look at the value of "status". If the "64" bit of this number (expressed in binary) equal to 0, 11 minute mode is on. Otherwise, it is off.
cryptic :-(

adjtimex --print
mode: 0
offset: 154133
frequency: -597633
maxerror: 156660
esterror: 537
status: 24577
time_constant: 10
precision: 1
tolerance: 32768000
tick: 10000
raw time: 1335339989s 55886229us = 1335339989.55886229
So, if you are running ntp the kernel should be in this 11 minute sync mode,
yes, it should, if the above number is good and if summer time is set up by the system later

thanks
jdd
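The "64 bit" test from the quoted manual text can be written as a one-line bit mask. This is only a sketch of that rule as the manual states it; the function name is made up for illustration.

```shell
# Report whether the kernel "11 minute mode" is on, given the "status"
# value printed by `adjtimex --print`. Per the hwclock manual quoted in
# this thread, the mode is on when the 64 bit of status is 0.
eleven_minute_mode() {
    if [ $(( $1 & 64 )) -eq 0 ]; then
        echo on
    else
        echo off
    fi
}
```

For the status value 24577 shown above, `eleven_minute_mode 24577` prints `on`, since 24577 is 16384 + 8192 + 1 and so has the 64 bit clear.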
On 25/04/2012 00:06, David Haller wrote:
Hello,
On Tue, 24 Apr 2012, jdd wrote:
in all the "time" commands, what is the one that write the HW clock? (I can put it in a cron job :-)
hwclock
-w, --systohc set the hardware clock from the current system time
HTH, -dnh
OK, very good, thanks.

However, I have another question :-(

How is the "summer time" offset managed? I mean, the hwclock is one hour behind the date time. According to YaST it is set up as local time.

So is this hw time normal, given we are now on summer time? I can't find the answer in the hwclock man page.
jdd wrote:
On 25/04/2012 00:06, David Haller wrote:
Hello,
On Tue, 24 Apr 2012, jdd wrote:
in all the "time" commands, what is the one that write the HW clock? (I can put it in a cron job :-)
hwclock
-w, --systohc set the hardware clock from the current system time
HTH, -dnh
ok, very good, thanks.
However I have an other question :-(
how is managed the "summer time" offset?
I mean the hwclock is one hour back the date time. according to yast it's setup as local time.
so is this hw time normal given we are now on summer time?
I don't find the answer in the hwclock man page
Your hardware clock should be running UTC; your timezone/locale setting will then determine how the time is displayed/interpreted, including any daylight saving time offset.

--
Per Jessen, Zürich (12.6°C)
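To illustrate the point about UTC and time zone settings, here is a small sketch that prints one instant both ways. It assumes GNU date (the `-d @EPOCH` form); the function name is invented for the example, and the epoch value used below is the "raw time" from the adjtimex output earlier in the thread.

```shell
# Print one instant twice: as UTC (what the hardware clock should hold)
# and as local time in the given zone. The zone rules, including summer
# time, are applied only at display time. Assumes GNU date.
show_utc_and_local() {
    epoch=$1
    zone=$2
    echo "UTC:   $(date -u -d "@$epoch" '+%Y-%m-%d %H:%M')"
    echo "local: $(TZ="$zone" date -d "@$epoch" '+%Y-%m-%d %H:%M %Z')"
}
```

Calling `show_utc_and_local 1335339989 Europe/Paris` on a system with the time zone database installed should show 07:46 for UTC and 09:46 for local time: the two-hour gap is one hour of zone offset plus one hour of summer time, and nothing in the stored UTC value has to change when summer time starts or ends.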
On 2012-04-25 09:45, jdd wrote:
On 25/04/2012 00:06, David Haller wrote:
However I have an other question :-(
how is managed the "summer time" offset?
I mean the hwclock is one hour back the date time. according to yast it's setup as local time.
so is this hw time normal given we are now on summer time?
I don't find the answer in the hwclock man page
No, because it is assumed you use UTC time. Why are you using local time on a 24*7 server? Local time is of use only when you dual-boot with Windows.

The handling of local time in the BIOS clock is complex. Basically it should be copied from the system clock periodically, so when the summer time change comes, it is changed too - I suppose, I haven't studied that. The complexity lies in calculating the corresponding UTC time at boot and setting the clock without making errors when multiple systems are in play, or with a laptop on the move crossing time zones.

For your case, it is safer to use UTC.

--
Cheers / Saludos, Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
On 25/04/2012 12:11, Carlos E. R. wrote:
No, because it is assumed you use UTC time. Why are you using local time on a 24*7 server? Local time is of use only when you double boot with Windows.
No reason other than the previous use of the computer as a Windows machine, with no change since.

May I change this from Linux (with hwclock) without disturbing the BIOS? Probably so, but can you confirm?

jdd
On 2012-04-25 12:41, jdd wrote:
On 25/04/2012 12:11, Carlos E. R. wrote:
May I change this from Linux (with hwclock) without disturbing the bios? probably so but if you can confirm?
Do it in yast.

--
Cheers / Saludos, Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
On 25/04/2012 13:00, Carlos E. R. wrote:
On 2012-04-25 12:41, jdd wrote:
On 25/04/2012 12:11, Carlos E. R. wrote:
May I change this from Linux (with hwclock) without disturbing the bios? probably so but if you can confirm?
Do it in yast.
I really have a problem here.

Right now my local time is 08:51. I set the hwclock to 06:51 with the "-u" option to fix it as UTC (I'm in the Paris time zone). I get hwclock 06:51, date 08:51: OK.

But if I load YaST, click UTC, go to the ntp server and set it up, then exit YaST cleanly, the hwclock is back at 8:51.

Did I find a bug, or did I miss or misunderstand something?

thanks
jdd
On 2012-04-26 08:54, jdd wrote:
On 25/04/2012 13:00, Carlos E. R. wrote:
Do it in yast.
I really have a problem, here.
right now my local time is 08:51
I set the hwclock to 06:51 with "-u" option to fix it as UTC (I'm in Paris time zone).
I get hwclock 06:51, date 08:51 ok.
Yes, same as me.
But if I load yast, click UTC, go to ntp server and set it up, then exit yast cleanly, I get back the hwclock to 8:51
Huh?
did I find a bug or did I miss or misunderstood something?
Something is wrong there; this should not happen.

Check "/etc/sysconfig/clock", you should have this:

HWCLOCK="-u"

If you have that, it is correct. Then you also need to run "mkinitrd", then copy the system clock to the CMOS, and finally remove "/etc/adjtime" so that it is recreated correctly. A reboot is not necessary, but I would do one to verify that all is correct, lest the machine reboot when you are not expecting it and you find a problem then.

Something really weird in my machine, too!

Telcontar:~ # hwclock -r
Thu Apr 26 10:38:45 2012 -0.161071 seconds
cer@Telcontar:~> cat /etc/adjtime
0.009859 1328407460 0.000000
1328407460
UTC

It is 10:38 local time, too. And the variable is set to UTC. Something is wrong. You may have discovered a bug!

Or does hwclock display the time corrected to local time? :-?

--
Cheers / Saludos, Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
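The two read-only checks in this message (the HWCLOCK setting and the last line of /etc/adjtime) can be sketched as small helpers. The function names are invented for the example, and the steps that change the system (mkinitrd, writing the CMOS clock, removing /etc/adjtime) are deliberately left out.

```shell
# Succeed if a sysconfig-style clock file sets HWCLOCK="-u",
# meaning the hardware clock is kept in UTC.
sysconfig_clock_is_utc() {
    grep -q '^HWCLOCK="-u"' "$1"
}

# Print the third line of an adjtime file, which is UTC or LOCAL.
adjtime_mode() {
    sed -n '3p' "$1"
}
```

On the systems discussed here these would be called as `sysconfig_clock_is_utc /etc/sysconfig/clock && echo "configured for UTC"` and `adjtime_mode /etc/adjtime`. If the two disagree, that would point to the stale /etc/adjtime mentioned above.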
On 26/04/2012 10:43, Carlos E. R. wrote:
Check "/etc/sysconfig/clock", you should have this:
HWCLOCK="-u"
I have:

HWCLOCK="-u"
SYSTOHC="yes"
TIMEZONE="Europe/Paris"
DEFAULT_TIMEZONE="Europe/Paris"
Or does hwclock displays time corrected to the local time? :-?
-r, --show
    Read the Hardware Clock and print the time on standard output. The time shown is always in local time, even if you keep your Hardware Clock in Coordinated Universal Time.

That seems to be the answer, but then one has to guess the real BIOS time (or reboot to look at the BIOS, if possible!)

thanks
jdd
On 2012-04-26 10:58, jdd wrote:
On 26/04/2012 10:43, Carlos E. R. wrote:
-r, --show Read the Hardware Clock and print the time on standard output. The time shown is always in local time, even if you keep your Hardware Clock in Coordinated Universal Time.
seem the answer, but then one have to guess the real bios time (or reboot to see the bios if possible!)
Ah, no bug then. Good!

Yes, it is confusing.

Ah! Try "-D" for debug, it gives the real time.

--
Cheers / Saludos, Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
* jdd
the reboot may as well be done by somebody switching off the main AC (I do not own the whole house - there is even a switch outside, in a box by the road :-()
any idea of how I could monitor this better?
put a script to write a log entry into /etc/init.d/halt.local

--
(paka)Patrick Shanahan Plainfield, Indiana, USA HOG # US1244711
http://wahoo.no-ip.org Photo Album: http://wahoo.no-ip.org/gallery2
http://en.opensuse.org openSUSE Community Member
Registered Linux User #207535 @ http://linuxcounter.net
If it is a question of power interruptions, that is what uninterruptible power supplies are for. Stores that carry IT equipment, like Staples and Future Shop, have units of decent quality at a modest price (there are keyboards that are more expensive). They provide both surge suppression and 'almost' uninterruptible power. By that I mean the power won't be suddenly cut off to your computer. Rather, depending on the capacity of the UPS, your machine can keep running for half an hour to several hours on the battery in the UPS.

They can usually talk to your computer, normally through a USB port, so when the UPS can no longer keep your machine going, it tells the computer to shut itself off, hibernate, or go into standby mode. I have my workstation plugged into a smaller unit, and it hibernates whenever the power is out for more than 5 minutes. I have several servers plugged into a much more capable UPS, which will keep them going for an hour (so I can shut them down manually if the power is going to be out for any length of time).

I don't know Linux well enough, but I would suppose that if you can catch system messages, you could log these, or if you can program the USB port (to listen to anything sent over the USB connection from the UPS), you can log that.

Just a thought....

Cheers
Ted
-----Original Message----- From: Patrick Shanahan [mailto:paka@opensuse.org] Sent: April-24-12 3:50 PM To: opensuse@opensuse.org Subject: Re: [opensuse] unexpected server reboot
* jdd [04-24-12 12:23]:
... the reboot may as well be done by somebody switching off the main AC (I do not own the whole house - there is even a switch outside, in a box by the road :-()
any idea of how I could monitor this better?
put a script to write a log entry into /etc/init.d/halt.local
On 24/04/2012 22:09, Ted Byers wrote:
If it is a question of power interruptions, that is what uninterruptable power supplies are for.
As I said, last time I checked, the server was plugged into a UPS, but this model does not have usable communication, and its run time may be no more than 5 minutes.

Some time ago we had a project to detect AC failure from the fact that the modem would also be down; maybe I will have to revive it.

I will have to experiment with an AC failure on site, but I rather fear such a brutal thing :-(

thanks
jdd
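The modem idea could be sketched roughly as follows, assuming the modem is not powered from the UPS, so that it goes down on an AC failure while the server keeps running on battery. The modem address, the function names and the state-file approach are all assumptions made for illustration; the ping options are those of the Linux iputils ping.

```shell
# MODEM_IP is a hypothetical address; the thread does not give one.
MODEM_IP="${MODEM_IP:-192.168.1.1}"

# Succeed if the modem answers a single ping within two seconds.
modem_is_up() {
    ping -c 1 -W 2 "$MODEM_IP" > /dev/null 2>&1
}

# Remember the last known state in a file and print a line only when the
# state changes, so repeated runs from cron do not flood the log.
note_state_change() {
    state_file=$1
    current=$2
    previous=unknown
    if [ -r "$state_file" ]; then
        previous=$(cat "$state_file")
    fi
    if [ "$previous" != "$current" ]; then
        echo "modem state: $previous -> $current"
        echo "$current" > "$state_file"
    fi
}
```

A cron job run every minute could set a variable to `up` or `down` from `modem_is_up`, pass it to `note_state_change` together with a state file path, and pipe the result to `logger -t marker`. A "down" line shortly before a reboot would then suggest a power cut that outlasted the UPS.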
On 2012-04-24 21:50, Patrick Shanahan wrote:
put a script to write a log entry into /etc/init.d/halt.local
That is what I said, but it will not work in 12.1 with systemd.

--
Cheers / Saludos, Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
On 24/04/2012 23:04, Carlos E. R. wrote:
On 2012-04-24 21:50, Patrick Shanahan wrote:
put a script to write a log entry into /etc/init.d/halt.local
That is what I said, but will not work in 12.1 with systemd.
Anyway, halt.local will never be executed in case of AC failure.

In my case the stop may be caused by somebody rebooting the server (unlikely, very few people have the local key) or by a power failure too long for the UPS. Or?

thanks
jdd
On 2012-04-24 23:12, jdd wrote:
On 24/04/2012 23:04, Carlos E. R. wrote:
anyway halt.local will never been executed in case of AC failure
Exactly, and I said that. When you see the boot line without the halt line, you immediately know that it crashed or that something drastic happened.

--
Cheers / Saludos, Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
participants (6)
- Carlos E. R.
- David Haller
- jdd
- Patrick Shanahan
- Per Jessen
- Ted Byers