[Bug 344356] New: using ntpd, some times the clock starts to go slow.
https://bugzilla.novell.com/show_bug.cgi?id=344356#c2 Summary: using ntpd, some times the clock starts to go slow. Product: openSUSE 10.3 Version: Final Platform: i686 OS/Version: openSUSE 10.3 Status: NEW Severity: Normal Priority: P5 - None Component: Basesystem AssignedTo: bnc-team-screening@forge.provo.novell.com ReportedBy: robin.listas@telefonica.net QAContact: qa@suse.de Found By: --- Sometimes, I see messages like this in the kernel log: Nov 27 15:27:32 nimrodel kernel: ACPI: Sleep Button (CM) [SLPB] Nov 27 15:57:46 nimrodel kernel: set_rtc_mmss: can't update from 1 to 57 Nov 27 15:58:00 nimrodel syslog-ng[10590]: last message repeated 13 times Nov 27 15:58:00 nimrodel kernel: set_rtc_mmss: can't update from 1 to 58 Nov 27 15:58:03 nimrodel syslog-ng[10590]: last message repeated 3 times Nov 27 15:58:04 nimrodel kernel: set_rtc_mmss: can't update from 1 to 58 Nov 27 15:58:04 nimrodel kernel: set_rtc_mmss: can't update from 1 to 58 Nov 27 15:58:06 nimrodel kernel: set_rtc_mmss: can't update from 2 to 58 Nov 27 15:59:00 nimrodel syslog-ng[10590]: last message repeated 53 times Nov 27 15:59:00 nimrodel kernel: set_rtc_mmss: can't update from 2 to 59 Nov 27 15:59:03 nimrodel syslog-ng[10590]: last message repeated 3 times Nov 27 15:59:04 nimrodel kernel: set_rtc_mmss: can't update from 2 to 59 Nov 27 15:59:04 nimrodel kernel: set_rtc_mmss: can't update from 2 to 59 Nov 27 15:59:06 nimrodel kernel: set_rtc_mmss: can't update from 3 to 59 Nov 27 16:00:03 nimrodel syslog-ng[10590]: last message repeated 15 times (no more entries in 90 minutes) About the same period, ntp stops being able to maintain sync, and aborts: 27 Nov 15:27:38 ntpd[12905]: system event 'event_restart' (0x01) status 'sync_alarm, sync_unspec, 1 event, event_unspec' (0xc010) 27 Nov 15:27:38 ntpd[12905]: peer LOCAL(0) event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014) 27 Nov 15:27:38 ntpd[12905]: Deleting interface #2 lo, ::1#123, interface stats: received=0, sent=0, dropped=0, active_time=1 secs 27 Nov 15:27:38 ntpd[12905]: Deleting interface #3 eth0, fe80::240:f4ff:fe2e:b121#123, interface stats: received=0, sent=0, dropped=0, active_time=1 secs 27 Nov 15:27:42 ntpd[12905]: peer 91.121.2.175 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014) 27 Nov 15:27:44 ntpd[12905]: peer 195.55.174.243 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014) 27 Nov 15:27:45 ntpd[12905]: peer 80.33.107.110 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014) 27 Nov 15:27:45 ntpd[12905]: peer 217.155.68.49 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014) 27 Nov 15:27:45 ntpd[12905]: peer 84.88.69.10 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014) 27 Nov 15:27:46 ntpd[12905]: peer 88.191.19.23 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014) 27 Nov 15:27:46 ntpd[12905]: peer 192.33.214.57 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014) 27 Nov 15:30:51 ntpd[12905]: system event 'event_peer/strat_chg' (0x04) status 'sync_alarm, sync_local_proto, 2 events, event_restart' (0xc521) 27 Nov 15:30:51 ntpd[12905]: synchronized to LOCAL(0), stratum 10 27 Nov 15:30:51 ntpd[12905]: kernel time sync status change 0001 27 Nov 15:30:51 ntpd[12905]: system event 'event_sync_chg' (0x03) status 'leap_none, sync_local_proto, 3 events, event_peer/strat_chg' (0x534) 27 Nov 15:30:51 ntpd[12905]: system event 'event_peer/strat_chg' (0x04) status 'leap_none, sync_local_proto, 4 events, event_sync_chg' (0x543) 27 Nov 15:31:56 ntpd[12905]: synchronized to 91.121.2.175, stratum 2 27 Nov 15:38:31 ntpd[12905]: synchronized to 192.33.214.57, stratum 2 27 Nov 15:39:25 ntpd[12905]: synchronized to 195.55.174.243, stratum 2 27 Nov 15:39:40 ntpd[12905]: synchronized to LOCAL(0), stratum 10 27 Nov 16:22:22 ntpd[12905]: synchronized to 84.88.69.10, stratum 2 27 Nov 16:22:22 ntpd[12905]: time correction of 1678 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time. 27 Nov 16:22:22 ntpd[12905]: system event 'event_fault' (0x02) status 'leap_none, sync_ntp, 12 events, event_peer/strat_chg' (0x6c4) Then I have to restart it to force a time set: 27 Nov 17:13:54 ntpd[18877]: system event 'event_restart' (0x01) status 'sync_alarm, sync_unspec, 1 event, event_unspec' (0xc010) 27 Nov 17:13:55 ntpd[18877]: peer LOCAL(0) event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0x8014) 27 Nov 17:13:55 ntpd[18877]: Deleting interface #2 lo, ::1#123, interface stats: received=0, sent=0, dropped=0, active_time=1 secs The clock was going about half an hour slow - during my lunch hour! This thing started happening with 10.3, and is random. Normally, I'm not present or watching when it happens. Once I was scanning a page via a usb scanner, and the desktop froze. Sometimes the desktop freezes (the cpu applet and other things stop) till I press any key or move the mouse. Once the keyboard froze for 5 minutes (gkrelmn showed activity), and suddenly started working again, after 5 minutes, just as I had booted up a second computer and I was ready to try sshing (didn't have to). I tell all these things, but I don't know if they are related. I have also tried recompiling the kernel. One possibility is that all the time servers I use go bad and miss-set my clock (ntp goes to LOCAL(0)). Another is that the kernel clock doesn't respond well. I already have "logconfig =all" in "/etc/ntp.conf" and: KERNEL_LOGLEVEL=7 so I don't know what to activate to get more info about the problem. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=344356
Mark Gordon
https://bugzilla.novell.com/show_bug.cgi?id=344356#c1
Michael Skibbe
https://bugzilla.novell.com/show_bug.cgi?id=344356#c2
Carlos Robinson
https://bugzilla.novell.com/show_bug.cgi?id=344356#c3
--- Comment #3 from Carlos Robinson
https://bugzilla.novell.com/show_bug.cgi?id=344356#c4
--- Comment #4 from Carlos Robinson
https://bugzilla.novell.com/show_bug.cgi?id=344356#c5
--- Comment #5 from Carlos Robinson
https://bugzilla.novell.com/show_bug.cgi?id=344356#c6
--- Comment #6 from Carlos Robinson
https://bugzilla.novell.com/show_bug.cgi?id=344356#c7
Michael Skibbe
https://bugzilla.novell.com/show_bug.cgi?id=344356#c8
Carlos Robinson
to comment #2:
why you have following entry? server pool.ntp.org server pool.ntp.org server pool.ntp.org server pool.ntp.org
Why not? The idea is not mine; I read it somewhere. Each time you query for a server in the pool, it gives a different server: cer@nimrodel:~> host pool.ntp.org pool.ntp.org has address 84.88.69.10 pool.ntp.org has address 213.194.159.3 pool.ntp.org has address 195.55.174.243 cer@nimrodel:~> host pool.ntp.org pool.ntp.org has address 195.55.174.243 pool.ntp.org has address 84.88.69.10 pool.ntp.org has address 213.194.159.3 or: cer@nimrodel:~> host es.pool.ntp.org es.pool.ntp.org has address 195.10.6.126 es.pool.ntp.org has address 80.59.234.233 es.pool.ntp.org has address 80.33.117.152 cer@nimrodel:~> host es.pool.ntp.org es.pool.ntp.org has address 80.33.117.152 es.pool.ntp.org has address 195.10.6.126 es.pool.ntp.org has address 80.59.234.233 The idea is to give the program a lot of servers for it to find which are the best and use them. I have seen in the past that some of the servers in the pool are pretty bad, so I list more servers so it can find a bunch of good ones.
found somewhere in comment #7:
[...] 29 Nov 00:31:30 ntpd[19587]: synchronized to LOCAL(0), stratum 10 [...]
and
[...] remote local st poll reach delay offset disp ======================================================================= =192.33.214.57 192.168.1.12 2 1024 377 0.13612 295.28592 0.13649 *LOCAL(0) 127.0.0.1 10 64 377 0.00000 0.000000 0.03065 [...]
looks like your hardware clock is broken. but ntp knows that the time difference between server and local is to huge and sync to the bad local. when you restart ntpdate is called which forces the time correction so ntp works fine until the time diffs to much...
something in the system is wrong but it doesn't looks like ntp is the problem.
Remember that this doesn't happen always, not every day. Once it happened right while I was working; I think the machine stopped for 5 seconds while scanning a photo via USB and the clock started to go slow. Or it can be correct for days and days... in a month it has happened 6 times: cer@nimrodel:~> grep "exceeds sanity limit" /var/log/ntp 4 Nov 15:50:18 ntpd[5076]: time correction of 1190 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time. 12 Nov 09:37:12 ntpd[5061]: 14 Nov 20:42:37 ntpd[5143]: 21 Nov 23:38:27 ntpd[5170]: 27 Nov 16:22:22 ntpd[12905]:
you told this happens if you are at lunch. do you have a program which starts to run if the computer is idle?
The screen saver. I'll be careful to leave it on during lunch to see if it consistently happens then. If that is consistent, I can disable the screen saver - but I can't believe a user's space program can interfere with kernel clock.
And yes, my guess is that ntp is not broken and that there is something wrong with the kernel maintained clock, but I have no idea what, or even what to do to see it. Maybe the kernel chooses the wrong type of clock during boot. It certainly started to go bad the following day to upgrading to 10.3. boot log extract: cer@nimrodel:~> grep -i clock /var/log/boot.msg <6>Time: tsc clocksource has been installed. <6>Real Time Clock Driver v1.12ac <6>Time: acpi_pm clocksource has been installed. <6>intel8x0_measure_ac97_clock: measured 50655 usecs <6>intel8x0: clocking to 48000 doneSetting up the hardware clockdone When you say "looks like your hardware clock is broken" - I don't suppose you mean the CMOS hardware clock? That does not affect after the system is up, as the time is by them correctly set up and the hardware clock is not consulted any more till next boot. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=344356
User robin.listas@telefonica.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c9
--- Comment #9 from Carlos Robinson
https://bugzilla.novell.com/show_bug.cgi?id=344356
User robin.listas@telefonica.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c10
--- Comment #10 from Carlos Robinson
https://bugzilla.novell.com/show_bug.cgi?id=344356
User mskibbe@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c11
Michael Skibbe
https://bugzilla.novell.com/show_bug.cgi?id=344356
User robin.listas@telefonica.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c12
--- Comment #12 from Carlos Robinson
https://bugzilla.novell.com/show_bug.cgi?id=344356
User martin.burnicki@meinberg.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c13
--- Comment #13 from Martin Burnicki
seems your hardware clock isn't working correct => invalid
From what I've read here and in the mailing list I'm also pretty sure this is a kernel problem. Maybe the latest kernel changes have broken the kernel's support for this particular hardware.
Anyway, where do you see a clue that the hardware might be faulty? Martin -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=344356
User mskibbe@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c14
--- Comment #14 from Michael Skibbe
https://bugzilla.novell.com/show_bug.cgi?id=344356
User robin.listas@telefonica.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c15
--- Comment #15 from Carlos Robinson
https://bugzilla.novell.com/show_bug.cgi?id=344356
User mskibbe@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c16
Michael Skibbe
https://bugzilla.novell.com/show_bug.cgi?id=344356
User mskibbe@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c17
Michael Skibbe
https://bugzilla.novell.com/show_bug.cgi?id=344356
User robin.listas@telefonica.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c18
--- Comment #18 from Carlos Robinson
https://bugzilla.novell.com/show_bug.cgi?id=344356
User martin.burnicki@meinberg.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c19
--- Comment #19 from Martin Burnicki
https://bugzilla.novell.com/show_bug.cgi?id=344356
User chrubis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=344356#c20
Cyril Hrubis
participants (1)
-
bugzilla_noreply@novell.com