[Bug 218726] New: NTP time update problems
https://bugzilla.novell.com/show_bug.cgi?id=218726

           Summary: NTP time update problems
           Product: SUSE Linux 10.1
           Version: Final
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: misc@dstoecker.de
         QAContact: qa@suse.de

Hello,

I have a problem that has been present in SUSE Linux for at least 3 or 4 major versions now, and I do not know how to fix it.

I use the NTP daemon with ptbtime1.ptb.de and ptbtime2.ptb.de to continuously correct the system time. It is started at boot and also sets the time at boot. Nevertheless, the system time can drift by multiple seconds over a long uptime. Shouldn't NTP correct this?

A "rcntp stop && rcntp start" fixes the time again, but that is no solution. Something is wrong with either NTP, the start scripts, or the configuration settings.

Non-comment lines of /etc/sysconfig/ntp:

NTPD_INITIAL_NTPDATE="ptbtime1.ptb.de ptbtime2.ptb.de"
NTPD_OPTIONS="-u ntp"
NTPD_RUN_CHROOTED="yes"
NTPD_CHROOT_FILES=""
NTPD_ADJUST_CMOS_CLOCK="no"
NTP_PARSE_LINK=""
NTP_PARSE_DEVICE=""

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=218726

mhorvath@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |misc@dstoecker.de

------- Comment #1 from mhorvath@novell.com 2006-11-10 06:53 MST -------
Could you also attach /var/log/ntp, please?
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #2 from misc@dstoecker.de 2006-11-13 00:51 MST -------
I will do so as soon as such a time difference occurs again, so that I can provide additional information. I had a look at the current logs and they were not very helpful. As it is a long-standing bug, it can wait a bit longer :-)

Leaving as NEEDINFO.
https://bugzilla.novell.com/show_bug.cgi?id=218726

misc@dstoecker.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|misc@dstoecker.de           |

------- Comment #3 from misc@dstoecker.de 2006-11-22 01:55 MST -------
Here is the requested data. Additional note: We have a 2 Mbit DSL flat rate and thus network access nearly 100% of the time.

/var/log/ntp:

13 Nov 09:16:24 ntpd[18100]: ntpd exiting on signal 15
13 Nov 09:22:05 ntpd[3239]: synchronized to LOCAL(0), stratum 10
13 Nov 09:22:05 ntpd[3239]: kernel time sync disabled 0041
13 Nov 09:23:10 ntpd[3239]: kernel time sync enabled 0001
--> Time DCF77: 09:49:00   output of date command: Mi Nov 22 09:48:42 CET 2006
--> rcntp restart
22 Nov 09:49:06 ntpd[3239]: ntpd exiting on signal 15
--> Time DCF77: 09:50:00   output of date command: Mi Nov 22 09:50:00 CET 2006

Additionally, the ntp.conf (except comment lines):

server 127.127.1.0 # local clock (LCL)
fudge 127.127.1.0 stratum 10 # LCL is unsynchronized
driftfile /var/lib/ntp/drift/ntp.drift # path for drift file
logfile /var/log/ntp # alternate log file
server ptbtime1.ptb.de
server ptbtime2.ptb.de
restrict default noserve
https://bugzilla.novell.com/show_bug.cgi?id=218726

mhorvath@novell.com changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
         AssignedTo|bnc-team-screening@forge.provo.novell.com |mskibbe@novell.com
           Severity|Normal                                    |Major
             Status|ASSIGNED                                  |NEW
https://bugzilla.novell.com/show_bug.cgi?id=218726

mskibbe@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |misc@dstoecker.de

------- Comment #4 from mskibbe@novell.com 2006-11-23 03:56 MST -------
Please attach /var/log/messages, too. Which versions of xntp and the kernel do you use?
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #5 from misc@dstoecker.de 2006-11-23 04:16 MST -------
Created an attachment (id=106699)
 --> (https://bugzilla.novell.com/attachment.cgi?id=106699&action=view)
/var/log/messages thinned out
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #6 from misc@dstoecker.de 2006-11-23 04:19 MST -------
As said, I have had this problem for a long time now (2 years at least), and I think it has been there at least since SUSE 9.3.

A reduced /var/log/messages is attached, produced with "grep -v -e ddclient -e cron -e sshd -e smbd -e su: -e nmbd". I don't think you need these lines, and they contained too much information that I don't want to make public.

uname -a on the system with the above log:
Linux daneel 2.6.18.1-1-default #1 SMP Tue Oct 24 15:50:18 UTC 2006 i686 athlon i386 GNU/Linux
RPM is xntp-4.2.0a-70.4

uname -a on our server (SUSE 10.0):
Linux odin 2.6.11.4-21.12-smp #1 SMP Wed May 10 09:38:20 UTC 2006 i686 i686 i386 GNU/Linux
RPM is xntp-4.2.0a-35
https://bugzilla.novell.com/show_bug.cgi?id=218726

misc@dstoecker.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|misc@dstoecker.de           |
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #7 from mskibbe@novell.com 2006-11-23 04:31 MST -------
If you do not want this information to be public, send me the reduced log file by email.
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #8 from mskibbe@novell.com 2006-11-23 04:32 MST -------
Sorry, I overlooked the attachment.
https://bugzilla.novell.com/show_bug.cgi?id=218726

mskibbe@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |misc@dstoecker.de

------- Comment #9 from mskibbe@novell.com 2006-11-23 05:51 MST -------
Do you have a lot of traffic on your Ethernet? Can you please provide the output of "ntpq -c pe"?
https://bugzilla.novell.com/show_bug.cgi?id=218726

misc@dstoecker.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|misc@dstoecker.de           |

------- Comment #10 from misc@dstoecker.de 2006-11-23 06:13 MST -------
For my desktop system the traffic is high due to DSL limitations and the realtime data we need. The other system is a Strato server with medium traffic, I would say (low traffic for a server). The outputs below were taken under average network conditions.

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*LOCAL(0)        LOCAL(0)        10 l   25   64  377    0.000    0.000   0.001
 ptbtime1.ptb.de .INIT.          16 u    - 1024    0    0.000    0.000 4000.00
 ptbtime2.ptb.de .INIT.          16 u    - 1024    0    0.000    0.000 4000.00

Strato server:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*LOCAL(0)        LOCAL(0)        10 l   32   64  377    0.000    0.000   0.004
 dhcp30.rl.b.rz- .INIT.          16 u    - 1024    0    0.000    0.000 4000.00
 dhcp20.rl.b.rz- .INIT.          16 u    - 1024    0    0.000    0.000 4000.00
 ptbtime1.ptb.de .INIT.          16 u    - 1024    0    0.000    0.000 4000.00
 ptbtime2.ptb.de .INIT.          16 u    - 1024    0    0.000    0.000 4000.00
https://bugzilla.novell.com/show_bug.cgi?id=218726

mskibbe@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |misc@dstoecker.de

------- Comment #11 from mskibbe@novell.com 2006-11-23 06:29 MST -------
Oof, your jitter is too high! I have values around 50 on my home server (2 Mbit DSL). The servers can't be reached. It is possible that the algorithm can't work correctly because your computers are too slow (because your connection is too slow). Another problem could be that ntp starts before the network is up.

# ntpq -c pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        LOCAL(0)        10 l   61   64    3    0.000    0.000   0.004
 time.novell.com .GPS.            1 u   58   64    3  220.305   -1.219  51.937
 ptbtime1.ptb.de .PTB.            1 u   60   64    3   59.281   -0.844  44.362
 ptbtime2.ptb.de .PTB.            1 u   59   64    3   60.462   -1.408  49.408
https://bugzilla.novell.com/show_bug.cgi?id=218726

misc@dstoecker.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|misc@dstoecker.de           |

------- Comment #12 from misc@dstoecker.de 2006-11-23 06:50 MST -------
This must have another reason. It can't be that our office with a 2MB DSL line has the same values as a server connected at 100MB (maybe 1GB?). Especially as the first two NTP servers on our server are in its local Ethernet. I would think 4000.0 is something like a marker for "highest value" and the NTP updating is not working at all.

I wanted to test on two other server setups I have, but for the Strato virtual server I get

"ntpd[25182]: cap_set_proc() failed to drop root privileges: Operation not permitted"

on xntp start (SuSE 9.3), which may be a virtualisation problem, and on the other server (SuSE 10.1, most current packages, no firewall) I get

bill:~ # rcntp status
Checking for network time protocol daemon (NTPD): running
bill:~ # ntpq -c pe
localhost: timed out, nothing received
***Request timed out

which is of no help either. Is there any option, output, or log I can turn on that would shed some light on this?

Regarding NTP starting before the network is up:
a) I don't think so.
b) Usually the initial time setting works.
c) A later rcxntp (rcntp for 10.1) stop and start does not change the current behaviour.

P.S. What does the "*" in my outputs mean? It is not there in yours. Also, my servers are at "INIT".
https://bugzilla.novell.com/show_bug.cgi?id=218726

mskibbe@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |misc@dstoecker.de

------- Comment #13 from mskibbe@novell.com 2006-11-24 02:39 MST -------
This "INIT" means that they are in initialisation mode. You can see that within a few minutes I reached the servers 3 times ("reach" column). You never reached a server, so the servers are in the "init" state. The only time source you can reach is the local clock (377 times).

In my opinion you do not reach any server. Can you ping these servers, and can you connect to the corresponding port (afaik 33:udp)?
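Editorial note on the "reach" column discussed above: as far as the ntpq documentation describes it, "reach" is not a plain counter but an 8-bit shift register printed in octal, where each bit records whether one of the last eight polls got an answer. The sketch below decodes the values that appear in this thread under that reading; the function names `decode_reach` and `successful_polls` are made up for illustration.

```python
from typing import List


def decode_reach(reach_octal: str) -> List[bool]:
    """Decode ntpq's octal reach value into the last eight poll results.

    Index 0 is the most recent poll, index 7 the oldest one.
    """
    value = int(reach_octal, 8) & 0xFF
    return [bool((value >> bit) & 1) for bit in range(8)]


def successful_polls(reach_octal: str) -> int:
    """Count how many of the last eight polls were answered."""
    return sum(decode_reach(reach_octal))


# Values taken from the ntpq outputs quoted in this thread.
for reach in ("0", "3", "7", "377"):
    print(reach, successful_polls(reach), decode_reach(reach))
```

Read this way, "377" means that all of the last eight polls were answered, "3" that the two most recent ones were, and "0" that none of the last eight polls of that server got a reply.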
https://bugzilla.novell.com/show_bug.cgi?id=218726

misc@dstoecker.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|misc@dstoecker.de           |

------- Comment #14 from misc@dstoecker.de 2006-11-24 03:08 MST -------
a) ping (office machine):

PING ptbtime1.ptb.de (192.53.103.108) 56(84) bytes of data.
64 bytes from ptbtime1.ptb.de (192.53.103.108): icmp_seq=1 ttl=54 time=60.1 ms
64 bytes from ptbtime1.ptb.de (192.53.103.108): icmp_seq=2 ttl=54 time=64.6 ms
64 bytes from ptbtime1.ptb.de (192.53.103.108): icmp_seq=3 ttl=54 time=58.4 ms

b) Reach: Initial time updating works (but I think this uses TCP?)

c) I do not know how to test whether the UDP connection works or not (port is 123). Nevertheless I did the following test:

- rcntp stop
- start ethereal, log UDP 123 packets (log attached)
- rcntp start
- ntpq -p

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        LOCAL(0)        10 l   15   64    7    0.000    0.000   0.001
 ptbtime1.ptb.de .INIT.          16 u    -   64    0    0.000    0.000 4000.00
 ptbtime2.ptb.de .INIT.          16 u    -   64    0    0.000    0.000 4000.00

- I checked my firewall rules and they permit 123 UDP in both the incoming and outgoing direction. I nevertheless disabled the firewall completely for this test. The server has no firewall at all (but 100% proper configuration :-)
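Editorial note on point c) above: one way to check whether UDP port 123 works end to end, independently of ntpd, is to send a single client-mode NTP request and wait for any answer. The following is a minimal sketch of that idea, not something taken from this thread; `build_request` and `probe` are hypothetical names. It sends from an ephemeral source port, whereas ntpd in the debug output above uses source port 123, so a firewall that filters on the source port could treat the two differently.

```python
import socket
from typing import Optional

NTP_PORT = 123


def build_request(version: int = 4) -> bytes:
    """Build a minimal 48-byte NTP client request.

    The first byte packs leap indicator (0), version and mode 3 (client);
    all remaining header fields are left as zero.
    """
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def probe(host: str, timeout: float = 5.0) -> Optional[bytes]:
    """Send one request to host and return the raw reply, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, NTP_PORT))
        try:
            reply, _sender = sock.recvfrom(512)
        except socket.timeout:
            return None
    return reply


# Example call (needs network access, so it is left commented out):
# reply = probe("ptbtime1.ptb.de")
# print("no answer" if reply is None else "got %d bytes" % len(reply))
```

A reply of 48 bytes or more would suggest that the UDP path itself is fine and that the problem sits in how the daemon treats the answers.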
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #15 from misc@dstoecker.de 2006-11-24 03:10 MST -------
Created an attachment (id=106806)
 --> (https://bugzilla.novell.com/attachment.cgi?id=106806&action=view)
ethereal output of port 123 UDP IO
https://bugzilla.novell.com/show_bug.cgi?id=218726

mskibbe@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |misc@dstoecker.de

------- Comment #16 from mskibbe@novell.com 2006-11-24 04:10 MST -------
I think the problem is in packet 6 (and 8, 10, ...):

Peer Clock Stratum: unspecified or unavailable (0)

I diffed an ethereal output of mine against yours, and this packet should be:

Frame 6 (90 bytes on wire, 90 bytes captured)
Ethernet II, Src: Intel_97:4e:3a (00:03:47:97:4e:3a), Dst: AsustekC_b5:3d:3b (00:15:f2:b5:3d:3b)
Internet Protocol, Src: 192.53.103.104 (192.53.103.104), Dst: 10.10.2.148 (10.10.2.148)
User Datagram Protocol, Src Port: ntp (123), Dst Port: ntp (123)
Network Time Protocol
    Flags: 0x24
    Peer Clock Stratum: primary reference (1)
    Peer Polling Interval: 4 (16 sec)
    Peer Clock Precision: 0.000001 sec
    Root Delay: 0.0000 sec
    Clock Dispersion: 0.0011 sec
    Reference Clock ID: PTB (Germany) modem service
    Reference Clock Update Time: Nov 24, 2006 10:44:34.7954 UTC
    Originate Time Stamp: Nov 24, 2006 10:44:45.7961 UTC
    Receive Time Stamp: Nov 24, 2006 10:44:45.8212 UTC
    Transmit Time Stamp: Nov 24, 2006 10:44:45.8212 UTC

Can you please provide me with the content of your drift file?
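Editorial note on the packet dump above: the first four bytes of an NTP header carry the fields the dissector prints as "Flags", "Peer Clock Stratum", "Peer Polling Interval" and "Peer Clock Precision". The sketch below decodes them with the layout from the NTP specification; `decode_header` is a hypothetical helper, and the precision byte in the example (-20, i.e. 2^-20 s, roughly the 0.000001 sec shown above) is an assumption, since the raw bytes are not part of this thread.

```python
import struct

MODES = {
    0: "reserved", 1: "symmetric active", 2: "symmetric passive",
    3: "client", 4: "server", 5: "broadcast", 6: "control", 7: "private",
}


def decode_header(packet: bytes) -> dict:
    """Decode the first four bytes of an NTP packet."""
    if len(packet) < 4:
        raise ValueError("need at least 4 bytes of NTP header")
    # flags and stratum are unsigned bytes; poll and precision are signed
    # base-2 exponents of a number of seconds.
    flags, stratum, poll, precision = struct.unpack("!BBbb", packet[:4])
    return {
        "leap": flags >> 6,
        "version": (flags >> 3) & 0x7,
        "mode": MODES[flags & 0x7],
        "stratum": stratum,
        "poll_seconds": 2.0 ** poll,
        "precision_seconds": 2.0 ** precision,
    }


# Flags 0x24, stratum 1, poll 4 as in the dump above; precision -20 is assumed.
print(decode_header(bytes([0x24, 1, 4, 0xEC])))
```

With these inputs the flags byte 0x24 splits into leap 0, version 4 and mode "server", and poll 4 gives the 16 seconds shown in the dump. A reply with the stratum byte set to 0 is what the dissector labels "unspecified or unavailable".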
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #18 from misc@dstoecker.de 2006-11-24 08:52 MST -------
An additional note: Today one of our servers (the one where ntpq -c pe cannot connect) reached a time difference that caused our GPS processing software to stop working. The severity "major" is becoming more and more justified. This server is an up-to-date SuSE 10.1 system set up about 14 days ago.
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #19 from mskibbe@novell.com 2006-11-27 06:42 MST -------
I tried to reproduce this here but I have no problems. Is it possible that your hardware clock is broken? It's very curious that the hardware clock differs by some minutes. To me it looks like a hardware problem, but I will try to analyze the code that calculates the broken stratum value.
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #20 from misc@dstoecker.de 2006-11-27 07:28 MST -------
A hardware breakage on 4 different computers? No, I don't think so. Maybe a configuration issue. But all the machines have very different configurations (including different kernels, different SUSE versions, ...) and I cannot see a common factor.

A status summary (of some easily accessible machines):

Our office:
2 laptops (SUSE 10.1) - works
2 desktop machines (SUSE 10.1) - failing
1 server (SUSE 10.0) - works (same hardware as desktop machines)

Strato hosting:
1 Root-Server (SUSE 10.0) - failing
1 Root-Server (SUSE 10.1) - failing (ntpq -p does not work at all)
1 VServer (SUSE 9.3) - fails due to rights problem (virt. bug?)

The working server in our office:

enamun:~ # more /etc/SuSE-release
SUSE LINUX 10.0 (i586) OSS
VERSION = 10.0
enamun:~ # ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        LOCAL(0)        10 l   57   64  377    0.000    0.000   0.001
*ptbtime1.ptb.de .PTB.            1 u   55 1024  377   61.511   -9.641  33.404
+ptbtime2.ptb.de .PTB.            1 u   90 1024  377   67.188   -1.999   0.090
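Editorial note on the "*" and "+" markers asked about earlier in the thread: in ntpq's peer listing the first character of each line is a tally code describing what the selection algorithm did with that source. The sketch below classifies peer lines using the tally meanings as the ntpq documentation gives them; `parse_peer_line` is a hypothetical helper, and it assumes the unflattened ntpq layout in which the tally code occupies the first column.

```python
TALLY_CODES = {
    " ": "rejected",
    "x": "falseticker",
    ".": "excess",
    "-": "outlier",
    "+": "candidate",
    "#": "selected as backup",
    "*": "system peer",
    "o": "PPS peer",
}


def parse_peer_line(line: str) -> dict:
    """Split one ntpq peer line into tally status, names and reachability."""
    tally, rest = line[0], line[1:]
    fields = rest.split()
    remote, refid, stratum = fields[0], fields[1], int(fields[2])
    reach = int(fields[6], 8)  # the reach column is printed in octal
    return {
        "remote": remote,
        "refid": refid,
        "stratum": stratum,
        "status": TALLY_CODES.get(tally, "unknown"),
        "reachable": reach != 0,
    }


sample = [
    " LOCAL(0)        LOCAL(0)        10 l   57   64  377    0.000    0.000   0.001",
    "*ptbtime1.ptb.de .PTB.            1 u   55 1024  377   61.511   -9.641  33.404",
    " ptbtime1.ptb.de .INIT.          16 u    - 1024    0    0.000    0.000 4000.00",
]
for entry in map(parse_peer_line, sample):
    print(entry)
```

Under this reading, the "*" in front of LOCAL(0) in the failing outputs would mean that ntpd has selected its own local clock as the source it synchronizes to, while on the working server "*" sits on ptbtime1.ptb.de.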
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #21 from mskibbe@novell.com 2006-11-28 02:16 MST -------
(In reply to comment #20)
> A hardware breakage on 4 different computers. No, I don't think. Maybe a configuration issue. But all the machines have very different configurations (including different kernels, different SUSE versions, ...) and I cannot see a common factor.

Is the mainboard battery low? That's the only thing that could happen on all the computers.
https://bugzilla.novell.com/show_bug.cgi?id=218726

mskibbe@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |NEEDINFO
      Info Provider|                            |misc@dstoecker.de
https://bugzilla.novell.com/show_bug.cgi?id=218726

misc@dstoecker.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|misc@dstoecker.de           |

------- Comment #24 from misc@dstoecker.de 2006-11-28 03:29 MST -------
(I will do comment #23, this is a reply to #22)

Please don't give up so fast. It's a major problem for us when our GPS processing servers have incorrect time. That others don't report it or that you cannot reproduce it does not mean anything: as shown in #20, it surely depends on some not yet known condition, and the server time is noncritical for most applications (and thus most won't notice problems).

You at least know more than I do about the topic, so you can give advice on how to proceed. There are surely logs to enable or debug output to produce that can shed light on this issue, at least until we know whether it is a system configuration or an NTP issue.

For a test I disabled the usage of the local clock in NTP and still have the same results.

I started xntpd by hand using "xntpd -d -D9" (with local clock disabled) and got the following (to me this looks as if NTP never gets the NTP packets from the other side, although they arrive). Probably the networking code has some side effects? Maybe getting EINTR is not handled correctly (or at all).

Debug1: 9 -> 9 = 9 ntpd 4.2.0a@1.1196-r Thu Jun 29 18:00:55 UTC 2006 (1) Debug1: 9 -> 9 = 9 addto_syslog: ntpd 4.2.0a@1.1196-r Thu Jun 29 18:00:55 UTC 2006 (1) adding new filegen adding new filegen adding new filegen adding new filegen adding new filegen adding new filegen addto_syslog: set_process_priority: Leave priority alone: priority_done is <2> addto_syslog: precision = 1.000 usec create_sockets(123) address_okay: listen Virtual: 1, IF name: lo, Up Flag: 1 address_okay: listen Virtual: 1, IF name: eth0, Up Flag: 1 bind() fd 5, family 2, port 123, addr 0.0.0.0, flags=8 flags for fd 5: 04002 addto_syslog: Listening on interface wildcard, 0.0.0.0#123 bind() fd 6, family 10, port 123, addr ::, flags=0 flags for fd 6: 04002 addto_syslog: Listening on interface wildcard, ::#123 bind() fd 7, family 2, port 123, addr 127.0.0.1, flags=0 flags for fd 7: 04002 addto_syslog: Listening on interface lo, 127.0.0.1#123 bind() fd 8, family 2, port 123, addr 192.168.1.102, flags=8 flags for fd 8: 04002 addto_syslog: Listening on interface eth0, 192.168.1.102#123 create_sockets: ninterfaces=4 interface 0: fd=5, bfd=-1, name=wildcard, flags=0x8, scope=0 sin=0.0.0.0 bcast=0.0.0.0, mask=255.255.255.255 interface 1: fd=6, bfd=-1, name=wildcard, flags=0x0, scope=0 sin=:: mask=ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff interface 2: fd=7, bfd=-1, name=lo, flags=0x5, scope=0 sin=127.0.0.1 mask=255.0.0.0 interface 3: fd=8, bfd=-1, name=eth0, flags=0x9, scope=0 sin=192.168.1.102 bcast=192.168.1.255, mask=255.255.255.0 init_io: maxactivefd 8 local_clock: time 0 clock 0.000000 offset 0.000000 freq 0.000 state 0 Debug2: 9 -> 9 = 9 getaddrinfo ptbtime1.ptb.de getnetnum given ptbtime1.ptb.de, got 192.53.103.108 key_expire: at 0 peer_clear: at 0 assoc ID 34932 refid INIT newpeer: 192.168.1.102->192.53.103.108 mode 3 vers 4 poll 6 10 flags 0x1 0x1 ttl 0 key 00000000 getaddrinfo ptbtime2.ptb.de getnetnum given ptbtime2.ptb.de, got 192.53.103.104 key_expire: at 0 peer_clear: at 0 assoc ID 34933 
refid INIT newpeer: 192.168.1.102->192.53.103.104 mode 3 vers 4 poll 6 10 flags 0x1 0x1 ttl 0 key 00000000 authtrust: keyid 0000ffff life 1 report_event: system event 'event_restart' (0x01) status 'sync_alarm, sync_unspec, 1 event, event_unspec' (0xc010) getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here MCAST *****sendpkt(fd=8 dst=192.53.103.108, src=192.168.1.102, ttl=0, len=48) transmit: at 1 192.168.1.102->192.53.103.108 mode 3 poll_update: at 1 192.53.103.108 flags 0001 poll 6 burst 0 last 1 next 67 auth_agekeys: at 1 keys 1 expired 0 timer: refresh ts 0 getrecvbufs called, no action here input_handler: if=3 fd=8 length 48 from c035676c 192.53.103.108 addto_syslog: input_handler: Processed a gob of fd's in 0.103000 msec getrecvbufs returning 1 buffers receive: at 1 192.168.1.102<-192.53.103.108 restrict 002 getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here MCAST *****sendpkt(fd=8 dst=192.53.103.104, src=192.168.1.102, ttl=0, len=48) transmit: at 2 192.168.1.102->192.53.103.104 mode 3 poll_update: at 2 192.53.103.104 flags 0001 poll 6 burst 0 last 2 next 67 getrecvbufs called, no action here input_handler: if=3 fd=8 length 48 from c0356768 192.53.103.104 addto_syslog: input_handler: Processed a gob of fd's in 0.078000 msec getrecvbufs returning 1 buffers receive: at 2 192.168.1.102<-192.53.103.104 restrict 002 getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: 
Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: select(): nfound=-1, error: Interrupted system call getrecvbufs called, no action here getrecvbufs called, no action here addto_syslog: ntpd exiting on signal 2 -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #25 from mskibbe@novell.com 2006-11-28 04:33 MST -------
(In reply to comment #24)
> (and thus most wont notice problems).

No, there are enough companies with highly time-critical software. But this doesn't matter. It's important to find the problem, and in my opinion I found it: your hw clock. I cannot say more, which is why I closed the bug with _WORKSFORME_, which means you should find the error and reopen the bug. But let us search for the problem together.

> For a test I disabled the usage of local clock in NTP and still have the same results.
>
> I started the xntpd by hand using "xntpd -d -D9" (with local clock disabled) and got following (for me this looks as if NTP never gets the NTP packets from the other side, although they arrive). Probably the networking code has some side effects? Maybe getting EINTR is not handled correctly (or at all).
> [...]

Good idea, but next time please add the output as an attachment instead of pasting it into the text. I will take a look at this output. What do you mean by EINTR?

P.S. Comment #20: you can see in the output that ntp could reach the servers.
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #26 from mskibbe@novell.com 2006-11-28 05:27 MST -------
Please try the ntp package at http://beta.suse.com/private/mskibbe/xntp-4.2.2p4-5.1/. It is the newest version, built for 10.1-i386 (ready for download in around 20 minutes).
https://bugzilla.novell.com/show_bug.cgi?id=218726

mskibbe@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |misc@dstoecker.de
https://bugzilla.novell.com/show_bug.cgi?id=218726

------- Comment #27 from misc@dstoecker.de 2006-11-28 05:51 MST -------
Regarding comment #23, I did the following:

for (( ;; )) do date && rcntp start && date && hwclock -r && hwclock -w && date && hwclock -r && rcntp stop && date && echo "Finished" && sleep 300; done

Di Nov 28 11:53:17 CET 2006
Try to get initial date and time via NTP from ptbtime1.ptb.de ptbtime2done.de
Starting network time protocol daemon (NTPD) done
Di Nov 28 11:53:18 CET 2006
Di 28 Nov 2006 10:53:20 CET -1.013402 Sekunden
Di Nov 28 11:53:21 CET 2006
Di 28 Nov 2006 11:53:22 CET -0.494574 Sekunden
Shutting down network time protocol daemon (NTPD) done
Di Nov 28 11:53:21 CET 2006
Finished
Di Nov 28 11:58:21 CET 2006
Try to get initial date and time via NTP from ptbtime1.ptb.de ptbtime2done.de
Starting network time protocol daemon (NTPD) done
Di Nov 28 11:58:22 CET 2006
Di 28 Nov 2006 11:58:23 CET -0.129229 Sekunden
Di Nov 28 11:58:24 CET 2006
Di 28 Nov 2006 11:58:25 CET -0.492354 Sekunden
Shutting down network time protocol daemon (NTPD) done
Di Nov 28 11:58:24 CET 2006
Finished
Di Nov 28 12:03:24 CET 2006
Try to get initial date and time via NTP from ptbtime1.ptb.de ptbtime2done.de
Starting network time protocol daemon (NTPD) done
Di Nov 28 12:03:25 CET 2006
Di 28 Nov 2006 12:03:26 CET -0.214876 Sekunden
Di Nov 28 12:03:27 CET 2006
Di 28 Nov 2006 12:03:28 CET -0.492774 Sekunden
Shutting down network time protocol daemon (NTPD) done
Di Nov 28 12:03:27 CET 2006
Finished
Di Nov 28 12:08:27 CET 2006
Try to get initial date and time via NTP from ptbtime1.ptb.de ptbtime2done.de
Starting network time protocol daemon (NTPD) done
Di Nov 28 12:08:28 CET 2006
Di 28 Nov 2006 12:08:29 CET -0.191766 Sekunden
Di Nov 28 12:08:30 CET 2006
Di 28 Nov 2006 12:08:31 CET -0.494115 Sekunden
Shutting down network time protocol daemon (NTPD) done
Di Nov 28 12:08:30 CET 2006
Finished

# NTP disabled
# date && hwclock -r
Di Nov 28 12:13:30 CET 2006
Di 28 Nov 2006 12:13:31 CET -0.468582 Sekunden
# date && hwclock -r
Di Nov 28 12:13:45 CET 2006
Di 28 Nov 2006 12:13:46 CET -0.423638 Sekunden
# date && hwclock -r
Di Nov 28 13:30:30 CET 2006
Di 28 Nov 2006 13:30:31 CET -0.233519 Sekunden

I don't see big problems with the hwclock (which, by the way, is disabled in the NTP config now). I also deleted /etc/adjtime and the ntp.conf file before doing these tests.

Regarding comment #25: No, the output where the server could be reached is from the computer that has the same hardware, runs SUSE 10.0, and works.

/usr/include/asm-generic/errno-base.h:#define EINTR 4 /* Interrupted system call */

EINTR is the error code a system call (for example in IPC or socket handling) returns when it is interrupted by a signal, which happens only occasionally. I have had trouble with it in our software at times. Writing daemon software that handles hundreds of TCP connections teaches you most of the problems of operating system signals.

A note: Maybe one difference is the following: Our servers have a constant networking load, with open sockets sending data permanently (not much data, but permanently). Web servers, for example, have different characteristics (many connections being opened and closed). Ideally our network connections live for days or weeks.

Regarding #26: I will do that and post the results when done.
https://bugzilla.novell.com/show_bug.cgi?id=218726
------- Comment #28 from misc@dstoecker.de 2006-11-28 08:08 MST -------
Created an attachment (id=107218) --> (https://bugzilla.novell.com/attachment.cgi?id=107218&action=view)
Output of the failing office system
https://bugzilla.novell.com/show_bug.cgi?id=218726
------- Comment #29 from misc@dstoecker.de 2006-11-28 08:08 MST -------
Created an attachment (id=107219) --> (https://bugzilla.novell.com/attachment.cgi?id=107219&action=view)
Output of the working laptop system
https://bugzilla.novell.com/show_bug.cgi?id=218726

misc@dstoecker.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
      Info Provider|misc@dstoecker.de           |
         Resolution|                            |FIXED

------- Comment #30 from misc@dstoecker.de 2006-11-28 08:23 MST -------
First, a comment to explain the two attached files:

The new version didn't seem to change much, except that ntpq now reports 0.0 instead of 4000.0 (but the peers are still in the INIT state). The server where ntpq -p times out still times out.

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          10 l   54   64    3    0.000    0.000   0.001
 ptbtime1.ptb.de .INIT.          16 u    -   64    0    0.000    0.000   0.000
 ptbtime2.ptb.de .INIT.          16 u    -   64    0    0.000    0.000   0.000

I attached two files containing the output of "xntpd -d -D9": one from the desktop system with the error and one from the laptop system, which works. Both run SuSE 10.1 with the new xntp package and the same ntp.conf.

Some time later, a little note: after removing "restrict default noserve" from /etc/ntp.conf, the output looks like this:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          10 l   36   64   17    0.000    0.000   0.001
+ptbtime1.ptb.de .PTB.            1 u   37   64   17   61.999    0.730  12.179
*ptbtime2.ptb.de .PTB.            1 u   37   64   17   63.101    7.009  14.791

It seems to work now? I will keep an eye on it. Was the first line, "restrict default noserve", the error? Why does ntpq -p not work at all with this restriction on one system, while on the other it displays .INIT., but only for the non-local clocks?
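The two ntpq -p listings above differ mainly in the "reach" column: 0 for both PTB servers in the failing case, 17 once replies arrive. As a minimal sketch only, the check could be automated with a small filter. The function name unreachable_peers is hypothetical and not part of the ntp package, and the sketch assumes the standard ntpq -p column order (reach as the seventh field) shown in the listings.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with ntp): read `ntpq -p` output on stdin
# and print every peer whose reach register is 0, i.e. no reply was recorded.
# Assumption: columns are remote, refid, st, t, when, poll, reach, ...
unreachable_peers() {
    awk '
        # Skip the column header and the ===== separator line.
        $1 == "remote" || /^=+$/ { next }
        {
            name = $1
            # Drop a leading tally code such as "*" or "+".
            # The "x" and "o" tally codes are not handled in this sketch.
            sub(/^[*+#-]/, "", name)
            if ($7 == "0") print name
        }
    '
}
```

It would be used as "ntpq -p | unreachable_peers". An empty result means every listed peer has answered at least once in the current reach window; it says nothing about whether the offsets are good.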