Discussion:
[ntp:questions] high precision tracking: trying to understand sudden jumps
starlight
2008-03-30 17:09:43 UTC
Permalink
Hello,

I'm trying to configure a small network for high precision time.
Recently acquired an Endrun CDMA time server that runs like
a dream, tracking CDMA time to about +/- 5 microseconds.

The clients are a rag-tag assembly of diverse systems including
a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.

All are configured to prefer the Endrun clock and poll it on a
16 second interval. All are attached to a single SMC gigabit
Ethernet switch with only the Endrun and two Sun systems running
at a lower speed of 100 Mbps. Close to zero network traffic
and system loads.
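Pinning a client to the Endrun at a 16 second poll corresponds to minpoll/maxpoll exponents of 4, since ntpd expresses poll intervals as powers of two (2^4 = 16). A minimal ntp.conf sketch of that setup (the address is the Endrun's from the ntpq output later in the thread; this is an illustration, not the poster's actual config):

```
# Hypothetical client config: prefer the Endrun, poll it every 16 s.
server 172.29.87.3 prefer iburst minpoll 4 maxpoll 4
driftfile /var/lib/ntp/drift
```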

All systems are running 'ntpd' 4.2.4p4. Compiled NTP native
64-bit for the Windows X64 system. [An #ifdef tweak to
'intptr_t' and 'uintptr_t' is required; will provide a patch if
desired.]

It generally is working well, with the systems tracking anywhere
from +/- 100 microseconds to +/- 500 microseconds most of the
time.

However once or twice a day, all the systems experience a
random, uncorrelated time shift of from one to several
milliseconds. Had an issue where a UPS voltage correction shift
and cheap power supply on the Windows X64 box appeared to be a
problem, but that was fixed by configuring the UPS to consider
110V nominal instead of 120V.

Does anyone have any ideas about what could be causing these
random time jumps and what might be done to eliminate them?

Something I'm planning to try is to make sure that 'mlock' is
configured in the daemons--presently 'autoconf' has left it
disabled for some reason. However, I don't believe page
faults are the culprit. All the daemons are running at
the highest real-time priority in the respective systems.

The above configuration is a controlled lab setup. The next
target is a stack of eight Dell 1950 servers in a production
data center running Windows 2003 R2 and slaved to a newer Endrun
time server. Don't have useful data from these systems yet
because the network jitter is outrageous. Working with the
network admin to hopefully have the NTP traffic to and from the
Endrun clock bypass level 3 switch/router rule checking. They
have large, complex router ACL rulesets I suspect as the cause
of the jitter.

Attached are fairly representative graphs of the offset and
frequency for two of the lab servers.

Thanks


P.S. Resent without graphs as the list mailer says
they're not allowed. Happy to send them or the raw
'loopstats' to anyone interested.
Richard B. Gilbert
2008-03-30 18:38:55 UTC
Permalink
Post by starlight
Hello,
I'm trying to configure a small network for high precision time.
Recently acquired an Endrun CDMA time server that runs like
a dream, tracking CDMA time to about +/- 5 microseconds.
The clients are a rag-tag assembly of diverse systems including
a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
All are configured to prefer the Endrun clock and poll it on a
16 second interval. All are attached to a single SMC gigabit
Ethernet switch with only the Endrun and two Sun systems running
at a lower speed of 100 MBPS. Close to zero network traffic
and system loads.
All systems are running 'ntpd' 4.2.4p4. Compiled NTP native
64-bit for the Windows X64 system. [A #ifdef tweak to
'intptr_t' and 'uintptr_t' is required, will provide patch if
desired].
It generally is working well, with the systems tracking anywhere
from +/- 100 microseconds to +/- 500 microseconds most of the
time.
However once or twice a day, all the systems experience a
random, uncorrelated time shift of from one to several
milliseconds.
<snip>

Forcing the poll interval to 16 seconds is not always a good idea!
Ntpd will select a poll interval, generally starting at 64 seconds, and
ramping up to as long as 1024 seconds as the clock is beaten into
submission!

Directly connected refclocks are frequently polled at shorter intervals
but I don't think your refclock is "directly connected" in the same
sense that a clock working through a serial or parallel port is directly
connected!

A clock connected via ethernet with all the latencies and jitter
thereunto appertaining is no different than any other network server and
should be polled in the same manner!

The very short poll intervals correct large errors quickly and the very
long intervals correct small errors very accurately!
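For concreteness, the intervals Richard mentions are the power-of-two steps ntpd walks through between its default minpoll and maxpoll exponents (6 and 10); a quick sketch:

```python
# ntpd expresses poll intervals as power-of-two exponents; the default
# ramp runs from minpoll 6 (64 s) up to maxpoll 10 (1024 s).
for exp in range(6, 11):
    print(f"poll exponent {exp} -> {2 ** exp} s")
# -> 64, 128, 256, 512, 1024 s
```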
Unruh
2008-03-30 21:39:01 UTC
Permalink
Post by Richard B. Gilbert
Post by starlight
Hello,
I'm trying to configure a small network for high precision time.
Recently acquired an Endrun CDMA time server that runs like
a dream, tracking CDMA time to about +/- 5 microseconds.
The clients are a rag-tag assembly of diverse systems including
a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
All are configured to prefer the Endrun clock and poll it on a
16 second interval. All are attached to a single SMC gigabit
Ethernet switch with only the Endrun and two Sun systems running
at a lower speed of 100 MBPS. Close to zero network traffic
and system loads.
All systems are running 'ntpd' 4.2.4p4. Compiled NTP native
64-bit for the Windows X64 system. [A #ifdef tweak to
'intptr_t' and 'uintptr_t' is required, will provide patch if
desired].
It generally is working well, with the systems tracking anywhere
from +/- 100 microseconds to +/- 500 microseconds most of the
time.
However once or twice a day, all the systems experience a
random, uncorrelated time shift of from one to several
milliseconds.
<snip>
Forcing the poll interval to 16 seconds is not always a good idea!
Ntpd will select a poll interval, generally starting at 64 seconds, and
ramping up to as long as 1024 seconds as the clock is beaten into
submission!
It is his network; he is not going to overload it. So, if he wants a 16 sec
poll interval, that is up to him.
I agree it is not a good idea for remote servers, but on his own system it
is fine.
Post by Richard B. Gilbert
Directly connected refclocks are frequently polled at shorter intervals
but I don't think your refclock is "directly connected" in the same
sense that a clock working through a serial or parallel port is directly
connected!
A clock connected via ethernet with all the latencies and jitter
thereunto appertaining is no different than any other network server and
should be polled in the same manner!
??? The longer polls are in order not to swamp the remote server with
10000 people all polling every 16 sec (or 1 sec). There is nothing in ntp
itself that mandates a longer poll interval. In fact a shorter poll
interval makes ntp much more responsive to changes (clock drift, etc.)
Post by Richard B. Gilbert
The very short poll intervals correct large errors quickly and the very
long intervals correct small errors very accurately!
No, for a properly designed system both should be corrected.
Maarten Wiltink
2008-03-30 22:21:07 UTC
Permalink
"Unruh" <unruh-spam at physics.ubc.ca> wrote in message
Post by Unruh
Post by Richard B. Gilbert
Forcing the poll interval to 16 seconds is not always a good idea!
Ntpd will select a poll interval, generally starting at 64 seconds,
and ramping up to as long as 1024 seconds as the clock is beaten
into submission!
It is his network, he is not going to overload it. So, if he wants a
16 sec poll interval that is up to him.
I agree it is not a good idea for remote servers, but on his own system
it is fine.
[...]
Post by Unruh
??? The longer polls are in order not to swamp the remote server with
10000 people all polling every 16 sec (or 1 sec). There is nothing in
ntp itself that mandates a longer poll interval. In fact a shorter poll
interval makes ntp much more responsive to changes (clock drift, etc.)
Post by Richard B. Gilbert
The very short poll intervals correct large errors quickly and the
very long intervals correct small errors very accurately!
No, for a properly designed system both should be corrected.
You seem to be missing the point. Once the large errors have been
corrected, NTP goes on to the small errors. For that, it _needs_ a
longer poll interval. That this gives the server more air is a
happy coincidence, but not why it does it.

Given the measurement error, you need to let the small error
accumulate over a longer period. Otherwise it would simply be
lost in the noise.

Do the math: assume the (constant!) measurement error to be +/- 1 ms,
the frequency error in my local host to be 1000 PPM (1/1000). With a
1 s polling interval, the real value is 1 ms and the measurement
will be between 0 and 2 ms. Not very good. With a 1000 s polling
interval, the real value is 1 s and the measurement will be between
0.999 and 1.001 s. Now that's useful to correct your clock with.

Now use more realistic numbers, like 50 PPM to start with, a polling
interval of 64 s and I'm not exactly sure what for the measuring
jitter. But the gist should be clear: that 50 PPM will go down, the
SNR will worsen, and the polling interval should go up to improve it
again.
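Maarten's arithmetic can be sketched in a few lines (the +/- 1 ms measurement error and 1000 PPM drift are his illustrative numbers, not real measurements):

```python
# Maarten's example: a fixed +/- 1 ms measurement error, a 1000 PPM
# (1/1000) frequency error, and two polling intervals. The noise is
# constant, so the longer the interval, the better the relative error.
MEAS_ERR = 1e-3       # constant measurement error, seconds (assumed)
FREQ_ERR = 1000e-6    # frequency error, 1000 PPM

for interval in (1.0, 1000.0):        # polling interval, seconds
    drift = FREQ_ERR * interval       # true offset accumulated
    lo, hi = drift - MEAS_ERR, drift + MEAS_ERR
    print(f"{interval:6.0f} s poll: true {drift:.3f} s, "
          f"measured {lo:.3f}..{hi:.3f} s "
          f"(relative error {MEAS_ERR / drift:.1%})")
```

At 1 s the measurement lands anywhere in 0..2 ms (100% relative error); at 1000 s it lands in 0.999..1.001 s (0.1%), matching the figures in the post.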

Starting with a short interval is good to correct large errors
quickly. Backing off once you've done so is good to avoid pestering
the server, but it's also good to correct small errors accurately,
and _that_ is why it's done. And of course, once a larger than
expected offset is measured, the polling interval is shortened
again.

Groetjes,
Maarten Wiltink
David Woolley
2008-03-30 23:30:37 UTC
Permalink
Post by Maarten Wiltink
You seem to be missing the point. Once the large errors have been
corrected, NTP goes on to the small errors. For that, it _needs_ a
longer poll interval. That this gives the server more air is a
happy coincidence, but not why it does it.
I don't believe it *needs* longer poll intervals; I think they are
simply wasteful, in that the offsets are low-pass filtered in such a way
that clamping maxpoll makes very little difference to the result once
the time constant goes high.

I'm not sure that there is any user configurable option that actually
does what people think they are doing by locking down maxpoll, in terms
of keeping the loop time constant low.

A clamped maxpoll may improve the responsiveness to faults causing time
steps of more than 128ms, but one should be attacking the problem, not
the symptom.
Unruh
2008-03-31 04:50:00 UTC
Permalink
Post by Maarten Wiltink
"Unruh" <unruh-spam at physics.ubc.ca> wrote in message
Post by Unruh
Post by Richard B. Gilbert
Forcing the poll interval to 16 seconds is not always a good idea!
Ntpd will select a poll interval, generally starting at 64 seconds,
and ramping up to as long as 1024 seconds as the clock is beaten
into submission!
It is his network, he is not going to overload it. So, if he wants a
16 sec poll interval that is up to him.
I agree it is not a good idea for remote servers, but on his own system
it is fine.
[...]
Post by Unruh
??? The longer polls are in order not to swamp the remote server with
10000 people all polling every 16 sec (or 1 sec). There is nothing in
ntp itself that mandates a longer poll interval. In fact a shorter poll
interval makes ntp much more responsive to changes (clock drift, etc.)
Post by Richard B. Gilbert
The very short poll intervals correct large errors quickly and the
very long intervals correct small errors very accurately!
No, for a properly designed system both should be corrected.
You seem to be missing the point. Once the large errors have been
corrected, NTP goes on to the small errors. For that, it _needs_ a
longer poll interval. That this gives the server more air is a
happy coincidence, but not why it does it.
I have no idea what this means. ntp simply runs a second-order feedback
network. It does not do anything for "large and small" errors.
Post by Maarten Wiltink
Given the measurement error, you need to let the small error
accumulate over a longer period. Otherwise it would simply be
lost in the noise.
No idea what you mean.
Post by Maarten Wiltink
Do the math: assume the (constant!) measurement error to be +/- 1 ms,
the frequency error in my local host to be 1000 PPM (1/1000). With a
1 s polling interval, the real value is 1 ms and the measurement
will be between 0 and 2 ms. Not very good. With a 1000 s polling
interval, the real value is 1 s and the measurement will be between
0.999 and 1.001 s. Now that's useful to correct your clock with.
You are not talking about large and small errors, you are talking about
phase and frequency errors. And no computer has fixed either phase or
frequency errors. They keep changing. Thus integrating for a longer time
does not help if the frequency error (drift) keeps changing.
Post by Maarten Wiltink
Now use more realistic numbers, like 50 PPM to start with, a polling
interval of 64 s and I'm not exactly sure what for the measuring
jitter. But the gist should be clear: that 50 PPM will go down, the
SNR will worsen, and the polling interval should go up to improve it
again.
??? What you are describing is one of the key problems with the ntp
algorithm.
Post by Maarten Wiltink
Starting with a short interval is good to correct large errors
quickly. Backing off once you've done so is good to avoid pestering
the server, but it's also good to correct small errors accurately,
and _that_ is why it's done. And of course, once a larger than
expected offset is measured, the polling interval is shortened
again.
Anyway, that is not his problem. He is getting ms spikes in the loop
filter. Those wipe out anything else he does. They destroy all attempts
by ntp to discipline the clock.
Post by Maarten Wiltink
Groetjes,
Maarten Wiltink
David Woolley
2008-03-31 08:55:54 UTC
Permalink
Post by Unruh
I have no idea what this means. ntp simply runs a second-order feedback
network. It does not do anything for "large and small" errors.
See sections G.4, G.5 and following of RFC 1305 (page 95 and onwards in
the PDF version). A couple of parameters are dynamically adjusted to
change the time constant of the network, so it is not entirely "simple".
Richard B. Gilbert
2008-03-30 22:30:53 UTC
Permalink
Post by Unruh
Post by Richard B. Gilbert
Post by starlight
Hello,
I'm trying to configure a small network for high precision time.
Recently acquired an Endrun CDMA time server that runs like
a dream, tracking CDMA time to about +/- 5 microseconds.
The clients are a rag-tag assembly of diverse systems including
a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
All are configured to prefer the Endrun clock and poll it on a
16 second interval. All are attached to a single SMC gigabit
Ethernet switch with only the Endrun and two Sun systems running
at a lower speed of 100 MBPS. Close to zero network traffic
and system loads.
All systems are running 'ntpd' 4.2.4p4. Compiled NTP native
64-bit for the Windows X64 system. [A #ifdef tweak to
'intptr_t' and 'uintptr_t' is required, will provide patch if
desired].
It generally is working well, with the systems tracking anywhere
from +/- 100 microseconds to +/- 500 microseconds most of the
time.
However once or twice a day, all the systems experience a
random, uncorrelated time shift of from one to several
milliseconds.
<snip>
Forcing the poll interval to 16 seconds is not always a good idea!
Ntpd will select a poll interval, generally starting at 64 seconds, and
ramping up to as long as 1024 seconds as the clock is beaten into
submission!
It is his network, he is not going to overload it. So, if he wants a 16 sec
poll interval that is up to him.
I agree it is not a good idea for remote servers, but on his own system it
is fine.
Post by Richard B. Gilbert
Directly connected refclocks are frequently polled at shorter intervals
but I don't think your refclock is "directly connected" in the same
sense that a clock working through a serial or parallel port is directly
connected!
A clock connected via ethernet with all the latencies and jitter
thereunto appertaining is no different than any other network server and
should be polled in the same manner!
??? The longer polls are in order not to swamp the remote server with
10000 people all polling every 16 sec (or 1 sec). There is nothing in ntp
itself that mandates a longer poll interval. In fact a shorter poll
interval makes ntp much more responsive to changes (clock drift, etc.)
Post by Richard B. Gilbert
The very short poll intervals correct large errors quickly and the very
long intervals correct small errors very accurately!
No, for a properly designed system both should be corrected.
If you don't measure across a long interval, you will never see some of
those small errors. When you measure across 1024 seconds you overwhelm
the network jitter. The long interval is part of the design for just
that reason.

Suppose your frequency error is 5 PPM or 0.43 seconds per day. Do you
think you can measure that error accurately with a 64 second poll
interval? If you are working over the internet, an error that small is
going to disappear in the jitter. It will be sixteen times more obvious
at the longer interval.
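A quick check of Richard's numbers (5 PPM is the only figure taken from his post; the per-poll offsets are just that rate multiplied out):

```python
# A 5 PPM frequency error accumulates linearly with time.
FREQ_ERR = 5e-6                  # 5 PPM
per_day   = FREQ_ERR * 86400     # seconds gained/lost per day
off_64s   = FREQ_ERR * 64        # offset grown during one 64 s poll
off_1024s = FREQ_ERR * 1024      # ... during one 1024 s poll (16x larger)
print(f"{per_day:.2f} s/day; {off_64s * 1e6:.0f} us per 64 s poll; "
      f"{off_1024s * 1e3:.2f} ms per 1024 s poll")
# -> 0.43 s/day; 320 us per 64 s poll; 5.12 ms per 1024 s poll
```

The 320 µs grown over a 64 s poll is easily buried in internet jitter; the 5.12 ms grown over 1024 s is not, which is Richard's point.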

You can poll a hardware reference clock at 16 second intervals because
the network is not involved! The latency and jitter of a PPS signal over
a serial port are an order or two of magnitude less than what you get
over a busy network.
David Woolley
2008-03-30 22:49:02 UTC
Permalink
Post by Unruh
10000 people all polling every 16 sec ( or 1 sec) There is nothing in ntp
itself that mandates a longer poll interval. In fact a shorter poll
interval makes ntp much more responsive to changes ( clock drifts, etc)
As I understand it, locking maxpoll low only slightly improves
responsiveness. The main effect is simply to oversample, as the time
constants still adjust to values appropriate to a poll interval of 1024s.
Unruh
2008-03-30 18:57:32 UTC
Permalink
Post by starlight
Hello,
I'm trying to configure a small network for high precision time.
Recently acquired an Endrun CDMA time server that runs like
a dream, tracking CDMA time to about +/- 5 microseconds.
No idea what CDMA time is, but that does not matter.
Do you have peerstats running on the various machines so you can look at
the raw offsets and particularly the round-trip times? It may be that your
network is suddenly delaying things by milliseconds in one direction for,
say, half an hour.
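Enabling those statistics is an ntp.conf matter; a minimal fragment (the statsdir path is illustrative):

```
# Record per-peer offset/delay samples and loop-filter state to daily files.
statsdir /var/log/ntpstats/
statistics peerstats loopstats
filegen peerstats file peerstats type day enable
filegen loopstats file loopstats type day enable
```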
Post by starlight
The clients are a rag-tag assembly of diverse systems including
a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
All are configured to prefer the Endrun clock and poll it on a
16 second interval. All are attached to a single SMC gigabit
Ethernet switch with only the Endrun and two Sun systems running
at a lower speed of 100 MBPS. Close to zero network traffic
and system loads.
Maybe that ethernet switch suffers a nervous breakdown (too little to do?)
once a day.
Post by starlight
All systems are running 'ntpd' 4.2.4p4. Compiled NTP native
64-bit for the Windows X64 system. [A #ifdef tweak to
'intptr_t' and 'uintptr_t' is required, will provide patch if
desired].
It generally is working well, with the systems tracking anywhere
from +/- 100 microseconds to +/- 500 microseconds most of the
time.
Should be within tens of usec, not hundreds.
Post by starlight
However once or twice a day, all the systems experience a
random, uncorrelated time shift of from one to several
milliseconds. Had an issue where a UPS voltage correction shift
and cheap power supply on the Windows X64 box appeared to be a
problem, but that was fixed by configuring the UPS to consider
110V nominal instead of 120V.
Does anyone have any ideas about what could be causing these
random time jumps and what might be done to eliminate them?
Something I'm planning to try is to make sure that 'mlock' is
configured in the daemons--presently 'autoconf' has left it
disabled for some reason. However, I don't believe page
faults are the culprit. All the daemons are running at
the highest real-time priority in the respective systems.
The above configuration is a controlled lab setup. The next
target is a stack of eight Dell 1950 servers in a production
data center running Windows 2003 R2 and slaved to a newer Endrun
time server. Don't have useful data from these systems yet
I would have just used a cheap GPS receiver rather than pay $700 for one
of these, but it's your money.

Ah, just looked at their web page. Would I really believe that the CDMA
cell phone network would care if their time signal were accurate to a
usec? There is no time-path correction. But you should see that on your
server connected to the device.

Anyway, look at the peerstats file, especially the round-trip times and
the offsets. The ntp clock-filter tries to compensate for vast
variations in these but can only do so much.
Post by starlight
because the network jitter is outrageous. Working with the
network admin to hopefully have the NTP traffic to and from the
Endrun clock bypass level 3 switch/router rule checking. They
have large, complex router ACL rulesets I suspect as the cause
of the jitter.
Sounds a bit weird. On an ADSL link from home through the telco to the
university, I get better than 1 ms time accuracy.
Post by starlight
Attached are fairly representative graphs of the offset and
frequency for two of the lab servers.
Netnews is text only. Post the info on a web page where anyone can look at
it.
Post by starlight
Thanks
P.S. Resent without graphs as the list mailer says
they're not allowed. Happy to send them or the raw
'loopstats' to anyone interested.
Just post them.
Maarten Wiltink
2008-03-30 21:43:07 UTC
Permalink
"Unruh" <unruh-spam at physics.ubc.ca> wrote in message
[...] Would I really believe that the CDMA cell phone network
would care if their time signal were accurate to usec?
I would. Because IIUC, this is the basis on which they divide
timeslots between stations.

Groetjes,
Maarten Wiltink
David Woolley
2008-03-30 19:10:21 UTC
Permalink
Post by starlight
The clients are a rag-tag assembly of diverse systems including
a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
How are you interpolating the 16ms ticks on the Windows system? How are
you disabling power management on the lap top?
Post by starlight
It generally is working well, with the systems tracking anywhere
from +/- 100 microseconds to +/- 500 microseconds most of the
time.
How are you measuring the difference from true time? In principle, if
ntpd can measure it, it will correct it.
Post by starlight
However once or twice a day, all the systems experience a
random, uncorrelated time shift of from one to several
milliseconds. Had an issue where a UPS voltage correction shift
In which direction is the slip? Backward-only slips against true time
(these might appear as forward slips if the real error is in the server)
are typically due to lost clock interrupts. If that is the case, it
implies you are using a tick rate other than 100 Hz. Please note that
the Linux kernel code is broken for clock frequencies other than 100 Hz,
and the use of 1000 Hz significantly increases the likelihood of a lost
interrupt.

The normal source of lost interrupts is disk drivers using programmed
transfers.
Post by starlight
and cheap power supply on the Windows X64 box appeared to be a
problem, but that was fixed by configuring the UPS to consider
110V nominal instead of 120V.
Unruh
2008-03-30 21:43:40 UTC
Permalink
Post by David Woolley
Post by starlight
The clients are a rag-tag assembly of diverse systems including
a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
How are you interpolating the 16ms ticks on the Windows system? How are
you disabling power management on the lap top?
Post by starlight
It generally is working well, with the systems tracking anywhere
from +/- 100 microseconds to +/- 500 microseconds most of the
time.
How are you measuring the difference from true time? In principle, if
ntpd can measure it, it will correct it.
I expect that he means the offsets that ntp measures. NTP does NOT
correct random offsets. I.e., if there is a noise source which makes the
offsets vary by 500 usec, ntp will not get rid of them. You will see
them in the offsets as measured by ntp. Now, the timekeeping might (or
might not) be more accurate than that, but those offsets are what I
suspect he means.
Post by David Woolley
Post by starlight
However once or twice a day, all the systems experience a
random, uncorrelated time shift of from one to several
milliseconds. Had an issue where a UPS voltage correction shift
In which direction is the slip? Backward only slips against true time
(these might appear as forward slips if the real error is in the server)
are typically due to lost clock interrupts. If that is the case it
implies you are using a tick rate of other than 100Hz. Please note that
the Linux kernel code is broken for clock frequencies other than 100Hz
and the use of 1000Hz significantly increases the likelihood of a lost
interrupt.
He claims on all the systems.
Post by David Woolley
The normal source of lost interrupts is disk drivers using programmed
transfers.
Almost all disk drives on Linux now use dma.
Post by David Woolley
Post by starlight
and cheap power supply on the Windows X64 box appeared to be a
problem, but that was fixed by configuring the UPS to consider
110V nominal instead of 120V.
David Woolley
2008-03-30 23:00:58 UTC
Permalink
Post by Unruh
I expect that he means the offsets that ntp measures. NTP does NOT correct
I suspect that too.
Post by Unruh
random offsets. I.e., if there is a noise source which makes the offsets vary
It averages them so as to reduce their effective size.
Post by Unruh
by 500usec ntp will not get rid of them. You will see them in the offsets
as measured by ntp. Now, the time keeping might (or might not) be more
accurate than that, but those offsets are what I suspect he means.
The question is about "measured errors" that significantly exceed the
random offsets. In any case the systematic error can also greatly
exceed the measured offset - that represents an error that ntpd cannot
measure.
Post by Unruh
Almost all disk drives on Linux now use dma.
They need to do both and the drivers that caused this problem were
capable of using DMA. The problem was, I believe, that certain chipsets
were unsafe with DMA, so the default, at least used to be, the
unconditional one of doing programmed transfers; you could enable DMA at
your own risk.

My impression is that there are still enough systems with lost disk
interrupts that someone reporting one-tick backward steps can reasonably
be assumed to have that problem, and it is a reasonable probability for
someone who doesn't report the direction of the step. The other common
cause of steps, which is balanced in both directions, is not applicable
here.
Hal Murray
2008-03-30 22:47:50 UTC
Permalink
Post by starlight
However once or twice a day, all the systems experience a
random, uncorrelated time shift of from one to several
milliseconds.
What does that mean?

I'm guessing that "uncorrelated" means the glitches don't happen
at the same time.

Are all clients seeing occasional problems? Do they match
cron jobs or some activity burst on the system?

Can you try another network switch? Or maybe even run without
any switches? (plug the CDMA box directly into a second ethernet
port)

Can you try another NTP server? How about setting up a PC,
letting it run for a day to establish a good drift file, and
then making it run on the local clock only. That will drift,
slowly, but there won't be any jumps.
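Hal's free-running test box would be configured roughly like this in classic ntpd (the driftfile path is illustrative; 127.127.1.0 is the standard local-clock driver address):

```
# Discipline to the undisciplined local clock only, after a drift
# file has been built up; the box then drifts smoothly with no jumps.
server 127.127.1.0              # local clock driver
fudge  127.127.1.0 stratum 10   # keep it from looking authoritative
driftfile /var/lib/ntp/drift
```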

How about adding another client that doesn't do anything?
(Turn off cron too.)
--
These are my opinions, not necessarily my employer's. I hate spam.
Bill Unruh
2008-03-31 04:36:09 UTC
Permalink
Are those on the same day?
Yes, same day. Uncorrelated to anything I can identify
or each other. Same story on all the boxes. Running
a hefty multi-system compile with heavy NFS and Samba
traffic does not produce these events, though it disturbs
the Windows boxes slightly when CPU goes to 100%.
Which "linux" and which "windows" are those graphs since you
have 2 linux and 2 windows clients.
That's the dual-core AMD 2.4GHz Athlon Tyan mobo whitebox
running Centos 4.5 SMP kernel. Similar results on the
Dell Dimension 2400 2.4GHz Intel P4 running Centos 4.5
mono-processor kernel.
Windows is a dual-core 3.4GHz Pentium D Tyan mobo whitebox
running 2003 R2 SP2 standard server.
As I said, seeing the
peerstats files would be helpful (offset and roundtrip)
Might try them later, but I can't believe a high-quality
SMC switch is causing multi-millisecond delays. Just not
possible. Pings are all about 400 microseconds, consistent
but slightly different on each system. Round trip is
800 microseconds. Attaching the output from a bulk 'ntpq -p'
'ntptrace' script I have below. Note that's 'ntptrace'
version 4.1 since the 4.2 script has useless offset info.
I have had weird latencies on some switches here.
And since all your machines are experiencing this, that switch is the only
commonality (or the ntp server). Do you have the peerstats on the server as
well to make sure that there are not some weird delays there.
Also these graphs seem to have cut off the spikes. Are the
spikes actually higher, or is that an illusion?
Higher. Sometimes 1ms, sometimes 5-6ms.
(Note the spikes are hundreds of usec, not many msec)
That would be the ~1ms example, check out the other one.
I am also really really really disturbed that you have so many servers. You
are trying to test out one specific server. The others are simply liable to
confuse everything. For example ntp could for some bizarre reason, suddenly
decide to use one of those other sites as the preferred server and give a
glitch.

And what are all those CDMA servers? Set your system up with one single
source, the one you want to test.
remote refid st t when poll reach delay offset jitter
==============================================================================
Endrun CDMA
LOCAL(0) LOCAL(0) 10 l 18 64 377 0.000 0.000 0.015
*HOPF_S(0) .CDMA. 0 l 6 16 377 0.000 0.000 0.015
Centos 32
*eachna .CDMA. 1 u 3 16 377 0.683 -0.004 0.009
-tock.usno.navy. .USNO. 1 u 452 1024 377 20.678 1.432 2.822
+navobs1.wustl.e .GPS. 1 u 479 1024 377 50.136 -1.513 0.164
+time.nist.gov .ACTS. 1 u 471 1024 377 66.528 -1.708 0.156
-tick.ucla.edu .GPS. 1 u 432 1024 377 87.372 3.296 0.085
Ultra 10
*172.29.87.3 .CDMA. 1 u 11 16 377 0.869 -0.016 0.042
172.29.87.15: stratum 2, offset -0.000007, synch distance 0.00783
172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'
Ultra 80
*172.29.87.3 .CDMA. 1 u 4 16 377 0.942 -0.012 0.012
172.29.87.17: stratum 2, offset -0.000038, synch distance 0.00685
172.29.87.3: stratum 1, offset -0.000017, synch distance 0.00038, refid 'CDMA'
44p
*172.29.87.3 .CDMA. 1 u 13 16 377 0.809 -0.001 0.016
172.29.87.13: stratum 2, offset -0.000014, synch distance 0.00627
172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'
Centos 64
*172.29.87.3 .CDMA. 1 u 12 16 377 0.664 0.003 0.487
172.29.87.19: stratum 2, offset -0.000009, synch distance 0.00720
172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'
W2K3 64
*172.29.87.3 .CDMA. 1 u 4 16 377 0.734 0.053 0.014
172.29.87.20: stratum 2, offset -0.000060, synch distance 0.00650
172.29.87.3: stratum 1, offset -0.000019, synch distance 0.00038, refid 'CDMA'
XP 32 laptop
*172.29.87.3 .CDMA. 1 u 7 16 377 0.819 0.468 0.256
172.29.87.12: stratum 2, offset -0.000173, synch distance 0.00655
172.29.87.3: stratum 1, offset -0.000017, synch distance 0.00038, refid 'CDMA'
--
William G. Unruh | Canadian Institute for| Tel: +1(604)822-3273
Physics&Astronomy | Advanced Research | Fax: +1(604)822-5324
UBC, Vancouver,BC | Program in Cosmology | unruh at physics.ubc.ca
Canada V6T 1Z1 | and Gravity | www.theory.physics.ubc.ca/
David Woolley
2008-03-31 09:29:13 UTC
Permalink
You appear to be quoting an off list reply with no indication of
permission, although it is just possible that the email gateway
forwarded it to email subscribers without forwarding it to the usenet
group proper.

Incidentally, what he's done is to run together the peers information
from many machines, so there is only one CDMA source. On the other
hand, it doesn't look like it is a CDMA appliance, or if it is, it has
been badly implemented, as I would not expect to see a local clock
driver on an appliance device.

The delays are rather large for the paragon of perfection of a network
that was described.

He probably needs to be aware that normal applications on the Windows
boxes will see times with a resolution that is rather poorer than can be
seen by ntptrace, as ntptrace takes advantage of the ntpd tick
interpolation, but normal applications will see times with a resolution
of one clock tick.
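[Editorial aside: David's point about tick resolution is easy to see empirically. A minimal sketch, not from the thread, that estimates the smallest time increment the system clock exposes to ordinary applications:]

```python
import time

def estimate_tick(samples=200_000):
    """Estimate the smallest observable increment of time.time()
    by recording the minimum non-zero difference between
    consecutive readings."""
    smallest = float("inf")
    prev = time.time()
    for _ in range(samples):
        now = time.time()
        delta = now - prev
        if 0 < delta < smallest:
            smallest = delta
        prev = now
    return smallest

if __name__ == "__main__":
    # On a 2003-era Windows box this is typically ~15.6 ms;
    # on Unix-like systems it is usually a microsecond or less.
    print(f"observed tick: {estimate_tick() * 1000:.3f} ms")
```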
Steve Kostecke
2008-03-31 11:46:11 UTC
Permalink
Post by David Woolley
You appear to be quoting an off list reply with no indication of
permission, although it is just possible that the email gateway
forwarded it to email subscribers without forwarding it to the usenet
group proper.
What you are suggesting is not possible.

The Usenet news-group is just another subscriber to the questions list.
--
Steve Kostecke <kostecke at ntp.org>
NTP Public Services Project - http://support.ntp.org/
David Woolley
2008-03-31 12:57:54 UTC
Permalink
Post by Steve Kostecke
Post by David Woolley
You appear to be quoting an off list reply with no indication of
permission, although it is just possible that the email gateway
forwarded it to email subscribers without forwarding it to the usenet
group proper.
What you are suggesting is not possible.
The Usenet news-group is just another subscriber to the questions list.
It's certainly very possible that the missing article was private email
only, although possibly by mistake. The mailing list doesn't seem to be
a simple subscriber, as an example quoted before showed no sign of
attachments in the usenet version, but the mail archive version that I
was pointed to mentioned that attachments (a PGP signature) had been
suppressed.

I assume you mean the usenet gateway is a subscriber, as usenet groups
can't subscribe to mailing lists on their own. In that case, it is at
least theoretically possible that the gateway suppresses the message on
the usenet side, but if it is an ordinary subscriber on the mailing list
side, the message will still go to other mailing list subscribers. One
obvious case in which this would happen is if there was a duplicate
message ID.

I haven't checked the mail archives, but I did check Google groups, and
it hasn't seen the missing message.
Steve Kostecke
2008-04-01 13:59:38 UTC
Permalink
On 2008-03-31, David Woolley <david at ex.djwhome.demon.co.uk.invalid>
Post by David Woolley
Post by Steve Kostecke
Post by David Woolley
You appear to be quoting an off list reply with no indication of
permission, although it is just possible that the email gateway
forwarded it to email subscribers without forwarding it to the
usenet group proper.
What you are suggesting is not possible.
The Usenet news-group is just another subscriber to the questions list.
It's certainly very possible that the missing article was private
email only, although possibly by mistake.
Private e-mail cannot be a "missing article".
Post by David Woolley
The mailing list doesn't seem to be a simple subscriber,
There is only _one_ type of list subscriber: those who receive mail from
the list.
Post by David Woolley
as an example quoted before showed no sign of attachments in the usenet
version, but the mail archive version that I was pointed to mentioned
that attachments (a PGP signature) had been suppressed.
Our mailing lists strip out all manner of MIME cruft. The gateway is a
bit more stringent to protect those of us who use real (i.e. console)
news readers.
Post by David Woolley
I assume you mean the usenet gateway is a subscriber, as usenet
groups can't subscribe to mailing lists on their own. In that case,
it is at least theoretically possible that the gateway suppresses the
message on the usenet side, but if it is an ordinary subscriber on the
mailing list side, the message will still go to other mailing list
subscribers. One obvious case in which this would happen is if there
was a duplicate message ID.
Both the mailing-list and the gateway use the original message ID to
prevent duplicate posts/articles.

Every post/article is propagated exactly _once_.

There is no suppression. There is no Cabal.
--
Steve Kostecke <kostecke at ntp.org>
NTP Public Services Project - http://support.ntp.org/
Heiko Gerstung
2008-03-31 14:53:43 UTC
Permalink
Post by David Woolley
You appear to be quoting an off list reply with no indication of
permission, although it is just possible that the email gateway
forwarded it to email subscribers without forwarding it to the usenet
group proper.
Incidentally, what he's done is to run together the peers information
from many machines, so there is only one CDMA source. On the other
hand, it doesn't look like it is a CDMA appliance, or if it is, it has
been badly implemented, as I would not expect to see a local clock
driver on an appliance device.
We have that in our NTP appliances as well. You can configure it to any stratum
level you want, and it is used as a last-resort fallback in case the receiver
loses reception and the (also configurable) so-called trust time has passed
without the signal coming back. This results in the time server replying at
stratum 12 (for example) after a while, which ensures that everybody has the same
time, although it might be wrong. If a user does not want that, they can simply
set the local clock stratum to 15 and the server will no longer be accepted.
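[Editorial aside: the fallback Heiko describes maps onto ntpd's standard Undisciplined Local Clock driver. A minimal ntp.conf sketch; the stratum values are the knob he mentions and are illustrative:]

```
# Undisciplined Local Clock (driver 127.127.1.0) as a last-resort fallback.
server 127.127.1.0
fudge  127.127.1.0 stratum 12   # serve at stratum 12 once real refclocks fail
# To have standards-compliant clients reject the fallback entirely:
# fudge 127.127.1.0 stratum 15
```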

Can you please let me know why you consider this a "bad implementation"?

Regards,
Heiko
Post by David Woolley
[...]
David Woolley
2008-03-31 18:18:49 UTC
Permalink
time has passed without the signal coming back. This results in the time
server replying with stratum 12 (for example) after a while and ensures
that everybody has the same time, although it might be wrong. If a user
does not want that, they can simply set the local clock stratum to 15
and the server will not be accepted anymore.
Can you please let me know why you consider this a "bad implementation"?
Because the protocol fails to signal the loss of the time source
properly when one has a local clock configured. As such, I believe that
enabling a local clock should always be an opt in choice. Basically,
when it falls back to the local clock, root dispersion goes to zero,
when the true situation is that root dispersion is growing without bound.

Things can go seriously wrong if there is more than one local clock
source on a network, as it becomes possible for them to outvote the real
time.
Richard B. Gilbert
2008-03-31 19:16:14 UTC
Permalink
Post by David Woolley
time has passed without the signal coming back. This results in the
time server replying with stratum 12 (for example) after a while and
ensures that everybody has the same time, although it might be wrong.
If a user does not want that, they can simply set the local clock
stratum to 15 and the server will not be accepted anymore.
Can you please let me know why you consider this a "bad implementation"?
Because the protocol fails to signal the loss of the time source
properly when one has a local clock configured. As such, I believe that
enabling a local clock should always be an opt in choice. Basically,
when it falls back to the local clock, root dispersion goes to zero,
when the true situation is that root dispersion is growing without bound.
Things can go seriously wrong if there is more than one local clock
source on a network, as it becomes possible for them to outvote the real
time.
Local clock IS an opt in choice. If you don't configure it, it doesn't
serve time. Stratum is taken into account in selecting a time source.
I can't swear to it but I'd be surprised if three stratum 10 servers
could out vote one stratum 2 server.
David Woolley
2008-03-31 20:55:35 UTC
Permalink
Post by Richard B. Gilbert
Stratum is taken into account in selecting a time source.
I can't swear to it but I'd be surprised if three stratum 10 servers
could out vote one stratum 2 server.
At least for RFC1305, stratum is not considered (except in as much as
refid is not checked for stratum 1) until after the intersection
algorithm has removed false tickers.

I believe there have been cases on the newsgroup in which people have
peered systems using the local clock and then had the system form a
clique which rejects real sources of time.

The main thing that would probably mitigate this if the local
clocks were in appliances is that the local clock gets a falsely narrow
error tolerance band. With the peering configuration, the bands
overlap, but with multiple appliances, they would have to drift at very
similar rates to stay compatible. The narrow tolerance is why local
clocks that agree make it particularly difficult for a good clock to be
accepted: the good clock has to be very close to the clique to avoid
being rejected.
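[Editorial aside: David's clique scenario can be sketched with a toy version of the intersection idea: find the point covered by the most correctness intervals. Illustrative Python, not the RFC 1305 selection code:]

```python
def midpoint_of_majority(intervals):
    """Find a point covered by the largest number of correctness
    intervals (lo, hi) -- the heart of the intersection step that
    discards falsetickers.  Toy sketch only."""
    edges = []
    for lo, hi in intervals:
        edges.append((lo, +1))   # interval opens
        edges.append((hi, -1))   # interval closes
    edges.sort(key=lambda e: (e[0], -e[1]))  # opens before closes at ties
    best, count, best_point = 0, 0, None
    for x, delta in edges:
        count += delta
        if count > best:
            best, best_point = count, x
    return best, best_point
```

With one good clock at 0.0 +/- 0.5 s and three agreeing local clocks near 5.0 with narrow bands, the three-member clique wins the vote, which is exactly the failure mode described above.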
Heiko Gerstung
2008-04-02 14:32:12 UTC
Permalink
Post by David Woolley
time has passed without the signal coming back. This results in the
time server replying with stratum 12 (for example) after a while and
ensures that everybody has the same time, although it might be wrong.
If a user does not want that, they can simply set the local clock
stratum to 15 and the server will not be accepted anymore.
Can you please let me know why you consider this a "bad implementation"?
Because the protocol fails to signal the loss of the time source
properly when one has a local clock configured. As such, I believe that
enabling a local clock should always be an opt in choice. Basically,
when it falls back to the local clock, root dispersion goes to zero,
when the true situation is that root dispersion is growing without bound.
The signal is the higher stratum level, at least for a lot of SNTP
implementations. Almost no one looks at the root dispersion value when it
comes to SNTP ...

In our web interface you can disable the use of the local clock reference
completely. I always recommend to keep it active but set its stratum to 15,
which should result in being rejected by any standards compliant client.

Running without the local clock ref means the server signals itself as being
synchronized by a stratum 0 source (e.g. GPS) and only the root dispersion value
is increasing. As I said, most embedded/SNTP-only software checks for the SYNC
status and (sometimes) stratum level.
Post by David Woolley
Things can go seriously wrong if there is more than one local clock
source on a network, as it becomes possible for them to outvote the real
time.
Yes, but I would not go so far as to say that offering the end user the choice to
enable the local clock driver in his NTP appliance is a "bad implementation". I
can, however, fully agree that there are a number of things that could go wrong
when you use it (something that applies to a number of configuration options
like tinker or restrict ...).

Cheers,
Heiko
Unruh
2008-03-31 16:45:47 UTC
Permalink
Post by David Woolley
You appear to be quoting an off list reply with no indication of
permission, although it is just possible that the email gateway
forwarded it to email subscribers without forwarding it to the usenet
group proper.
He went off list because he was banned on list for a day because of his
attempts to post graphs to the list. (He did not realise, as he said in his
post, that he could not post graphs to the list. He does now.)
Post by David Woolley
Incidentally, what he's done is to run together the peers information
from many machines, so there is only one CDMA source. On the other
hand, it doesn't look like it is a CDMA appliance, or if it is, it has
been badly implemented, as I would not expect to see a local clock
driver on an appliance device.
Ah, perhaps. Even then his list looks weird.
Post by David Woolley
The delays are rather large for the paragon of perfection of a network
that was described.
Yes, that was one reason I wanted to see his peerstats file as well.
loopstats has gone through the clock_filter and the selection algorithm and
gives a poor representation of what is actually on the net. But at 16sec
poll it is a pretty large file for one day. But he can graph it.
Post by David Woolley
He probably needs to be aware that normal applications on the Windows
boxes will see times with a resolution that is rather poorer than can be
seen by ntptrace, as ntptrace takes advantage of the ntpd tick
interpolation, but normal applications will see times with a resolution
of one clock tick.
starlight
2008-03-30 20:22:07 UTC
Permalink
Here are URLs for those two sample graphs:

http://binnacle.cx/file/ntp_hickups_linux.gif
http://binnacle.cx/file/ntp_hickups_win.gif
Post by David Woolley
Post by starlight
The clients are a rag-tag assembly of diverse systems including
a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
How are you interpolating the 16ms ticks on the Windows system?
How are you disabling power management on the lap top?
The generic version of 'ntpd' has some sophisticated code that
handles interpolation. See the source. Power management is
disabled on the laptop using the standard control panel option.
Don't really care that much about this machine anyway.
Post by David Woolley
Post by starlight
It generally is working well, with the systems tracking anywhere
from +/- 100 microseconds to +/- 500 microseconds most of the
time.
How are you measuring the difference from true time? In principle, if
ntpd can measure it, it will correct it.
Using 'ntpd' 'loopstats'. It does, check out the graphs.

Maybe I'll turn on 'peerstats' too, but I really doubt a
stand-alone good quality switch would be causing random delays.
Pings are consistently 400 microseconds and 'ntpq -p' reports 800
microsecond roundtrip delays. I've never heard of a switch
causing a 5ms delay.
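[Editorial aside: since the offsets here come from loopstats, a small sketch of how such a file can be scanned for the jump events described; the field layout is the standard ntpd loopstats record, and the threshold is arbitrary:]

```python
def parse_loopstats_line(line):
    """Parse one ntpd loopstats record.  Standard fields are:
    MJD, seconds past UTC midnight, clock offset (s), frequency
    offset (ppm), RMS jitter (s), wander (ppm), poll exponent."""
    mjd, sec, offset, freq, jitter, wander, poll = line.split()[:7]
    return {
        "mjd": int(mjd),
        "second": float(sec),
        "offset_ms": float(offset) * 1000.0,   # seconds -> milliseconds
        "freq_ppm": float(freq),
        "jitter_ms": float(jitter) * 1000.0,
        "wander_ppm": float(wander),
        "poll": int(poll),
    }

def spot_jumps(lines, threshold_ms=1.0):
    """Return records whose offset magnitude exceeds threshold_ms,
    a crude way to locate the once-a-day millisecond jumps."""
    recs = (parse_loopstats_line(l) for l in lines if l.strip())
    return [r for r in recs if abs(r["offset_ms"]) > threshold_ms]
```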
Post by David Woolley
Post by starlight
However once or twice a day, all the systems experience a
random, uncorrelated time shift of from one to several
milliseconds. Had an issue where a UPS voltage correction shift
In which direction is the slip? Backward only slips against true time
(these might appear as forward slips if the real error is in the server)
are typically due to lost clock interrupts. If that is the case it
implies you are using a tick rate of other than 100Hz. Please note that
the Linux kernel code is broken for clock frequencies other than 100Hz
and the use of 1000Hz significantly increases the likelihood of a lost
interrupt.
Perhaps that's a problem. The RHEL/Centos stock kernel seems to
have a 1000Hz clock interrupt. At least 'vmstat' shows 1000
ints/sec on an idle system.
Post by David Woolley
The normal source of lost interrupts is disk drivers using programmed
transfers.
Think it's all DMA. Remember this is a really diverse bunch
of machines and OSs. The RS/6000 is working the best.

These jumps aren't killing me. Just want to figure out if they
can be eliminated. If we needed super accurate time we'd
probably have make use of PTP (precision timing protocol).
Still très expensive.
Unruh
2008-04-01 01:09:55 UTC
Permalink
Post by starlight
http://binnacle.cx/file/ntp_hickups_linux.gif
http://binnacle.cx/file/ntp_hickups_win.gif
Post by David Woolley
Post by starlight
The clients are a rag-tag assembly of diverse systems including
a Centos 4.5 Linux i686, Linux x86_64, Sun Ultra 10, Sun Ultra 80,
IBM RS/6000 44p, Windows 2003 X64, and a Windows XP laptop.
How are you interpolating the 16ms ticks on the Windows system?
How are you disabling power management on the lap top?
The generic version of 'ntpd' has some sophisticated code that
handles interpolation. See the source. Power management is
disabled on the laptop using the standard control panel option.
Don't really care that much about this machine anyway.
Post by David Woolley
Post by starlight
It generally is working well, with the systems tracking anywhere
from +/- 100 microseconds to +/- 500 microseconds most of the
time.
How are you measuring the difference from true time? In principle, if
ntpd can measure it, it will correct it.
Using 'ntpd' 'loopstats'. It does, check out the graphs.
Maybe I'll turn on 'peerstats' too, but I really doubt a
stand-alone good quality switch would be causing random delays.
Pings are consistently 400 microseconds and 'ntpq -p' reports 800
microsecond roundtrip delays. I've never heard of a switch
causing a 5ms delay.
Post by David Woolley
Post by starlight
However once or twice a day, all the systems experience a
random, uncorrelated time shift of from one to several
milliseconds. Had an issue where a UPS voltage correction shift
In which direction is the slip? Backward only slips against true time
(these might appear as forward slips if the real error is in the server)
are typically due to lost clock interrupts. If that is the case it
implies you are using a tick rate of other than 100Hz. Please note that
the Linux kernel code is broken for clock frequencies other than 100Hz
and the use of 1000Hz significantly increases the likelihood of a lost
interrupt.
Perhaps that's a problem. The RHEL/Centos stock kernel seems to
have a 1000Hz clock interrupt. At least 'vmstat' shows 1000
ints/sec on an idle system.
Post by David Woolley
The normal source of lost interrupts is disk drivers using programmed
transfers.
Think it's all DMA. Remember this is a really diverse bunch
of machines and OSs. The RS/6000 is working the best.
These jumps aren't killing me. Just want to figure out if they
can be eliminated. If we needed super accurate time we'd
probably have make use of PTP (precision timing protocol).
No idea what that is. If you had wanted super precision you would have put
a GPS onto each machine, I hope.
From the Wikipedia entry on PTP it looks absolutely no different from ntp.
I have no idea what the idea is.

I highly doubt that you will get better time with PTP. Now with chrony, my
measurements indicate that with the typical drift wander on my machines,
chrony gives 2-3 times better variance than ntp does. But it uses exactly
the same exchange protocol as ntp and uses a different clock discipline
algorithm.
Post by starlight
Still très expensive.
Hal Murray
2008-04-01 04:51:48 UTC
Permalink
Post by Unruh
Post by starlight
probably have make use of PTP (precision timing protocol).
No idea what that is. If you had wanted super precision you would have put
a GPS onto each machine, I hope.
From the Wikipedia entry on PTP it looks absolutely no different from ntp.
I have no idea what the idea is.
The basic idea is to do the time stamping in hardware deep in
the network adapter. That avoids lots and lots of jitter.
--
These are my opinions, not necessarily my employer's. I hate spam.
Martin Burnicki
2008-04-01 09:23:34 UTC
Permalink
Post by Hal Murray
Post by Unruh
Post by starlight
probably have make use of PTP (precision timing protocol).
No idea what that is. If you had wanted super precision you would have put
a GPS onto each machine, I hope.
From the Wikipedia entry on PTP it looks absolutely no different from ntp.
I have no idea what the idea is.
The basic idea is to do the time stamping in hardware deep in
the network adapter. That avoids lots and lots of jitter.
Yes, PTP can yield an accuracy better than 100 ns if both the NICs at the
clients and the server support hardware timestamping of sent/received PTP
packets.

On the other hand, also *every* network node between the PTP endpoints has
to be PTP-aware and compensate the packet delay it introduces, so you will
probably only get full PTP accuracy in your local network where you have
control over all the equipment.

Switches can very well insert a delay in the range of milliseconds. If there
are incoming packets at different ports at the same time which shall go out
on the same port then the packets have to be queued. Unless the network is
really heavily loaded this may happen only occasionally, but it may happen.
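[Editorial aside: the queuing argument is simple arithmetic. A full 1500-byte frame takes 120 microseconds to serialize at 100 Mb/s, so a backlog of nine or so full frames already exceeds a millisecond. A sketch:]

```python
def serialization_delay_us(frame_bytes, link_mbps):
    """Time to clock one frame onto the wire, in microseconds.
    bits / (Mbit/s) comes out directly in microseconds."""
    return frame_bytes * 8 / link_mbps

def queue_delay_us(frames_queued, frame_bytes=1500, link_mbps=100):
    """Worst-case store-and-forward wait behind a queue of full frames."""
    return frames_queued * serialization_delay_us(frame_bytes, link_mbps)
```

For example, queue_delay_us(9) gives 1080 microseconds, already in the millisecond range Martin mentions, and deeper queues on a loaded switch make it worse.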

The switches included in our PTP starter kit
http://www.meinberg.de/english/ptp-starterkit/
implement PTP boundary clocks for the ports in order to eliminate the
queuing delay. Without this special handling PTP would suffer from the same
latencies as NTP.

On the other hand, NTP yields quite good results without requiring special
hardware, even over WAN connections.


Martin
--
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
Unruh
2008-04-01 16:45:12 UTC
Permalink
Post by Martin Burnicki
Post by Hal Murray
Post by Unruh
Post by starlight
probably have make use of PTP (precision timing protocol).
No idea what that is. If you had wanted super precision you would have put
a GPS onto each machine, I hope.
From the Wikipedia entry on PTP it looks absolutely no different from ntp.
I have no idea what the idea is.
The basic idea is to do the time stamping in hardware deep in
the network adapter. That avoids lots and lots of jitter.
Yes, PTP can yield an accuracy better than 100 ns if both the NICs at the
clients and the server support hardware timestamping of sent/received PTP
packets.
I am still confused. To timestamp you have to read the computer's clock.
That is a software operation: reading the counter in the CPU, translating
to time, returning the result through the kernel, etc. That has all kinds
of variable latencies, etc. I am having trouble seeing 100 ns. The same
goes for seeing the PPS from the hardware clock and its interrupts. Or are
you replacing all of the hardware and software of the system (new kernel,
new interrupt system, new NICs, etc.)?
Post by Martin Burnicki
On the other hand, also *every* network node between the PTP endpoints has
to be PTP-aware and compensate the packet delay it introduces, so you will
probably only get full PTP accuracy in your local network where you have
control over all the equipment.
Switches can very well insert a delay in the range of milliseconds. If there
are incoming packets at different ports at the same time which shall go out
on the same port then the packets have to be queued. Unless the network is
really heavily loaded this may happen only occasionally, but it may happen.
The switches included in our PTP starter kit
http://www.meinberg.de/english/ptp-starterkit/
implement PTP boundary clocks for the ports in order to eliminate the
queuing delay. Without this special handling PTP would suffer from the same
latencies as NTP.
On the other hand, NTP yields quite good results without requiring special
hardware, even over WAN connections.
Martin
--
Martin Burnicki
Meinberg Funkuhren
Bad Pyrmont
Germany
Hal Murray
2008-04-01 18:29:52 UTC
Permalink
Post by Unruh
Post by Martin Burnicki
Post by Hal Murray
The basic idea is to do the time stamping in hardware deep in
the network adapter. That avoids lots and lots of jitter.
Yes, PTP can yield an accuracy better than 100 ns if both the NICs at the
clients and the server support hardware timestamping of sent/received PTP
packets.
I am still confused. To timestamp you have to read the computer's clock.
That is a software operation-- reading the counter in the cpu, translating
to time, returning the result through the kernel, etc. That has all kinds
of variable latencies,etc. I am having trouble seeing 100ns. Also seeing
the PPS from the hardware clock and its interrupts. Or are you replacing
all of the hardware and software of the system? (new kernel, new interrupt
system, new nics, etc)
You can build a clock into the network adapter and sync it up to the
system clock.
--
These are my opinions, not necessarily my employer's. I hate spam.
Unruh
2008-04-01 19:45:46 UTC
Permalink
Post by Hal Murray
Post by Unruh
Post by Martin Burnicki
Post by Hal Murray
The basic idea is to do the time stamping in hardware deep in
the network adapter. That avoids lots and lots of jitter.
Yes, PTP can yield an accuracy better than 100 ns if both the NICs at the
clients and the server support hardware timestamping of sent/received PTP
packets.
I am still confused. To timestamp you have to read the computer's clock.
That is a software operation-- reading the counter in the cpu, translating
to time, returning the result through the kernel, etc. That has all kinds
of variable latencies,etc. I am having trouble seeing 100ns. Also seeing
the PPS from the hardware clock and its interrupts. Or are you replacing
all of the hardware and software of the system? (new kernel, new interrupt
system, new nics, etc)
You can build a clock into the network adapter and sync it up to the
system clock.
And how do you sync it up to the system clock without going through the
kernel, etc? Ie, I have a clock on my gps receiver that is good to 100ns.
It links to the system clock via interrupts and ntp. You have to do
something like that if you are going to sync the clock on your nic to the
system clock as well. Ie, I see no advantage to this procedure over putting
in a cheap gps clock on each of the computers and just using that ( or
running a buffered PPS line from one gps receiver to each of the machines
(using some of the spare lines in a Cat5e cable if need be). Sure sounds
cheaper than special nic cards with high accuracy on board clocks!
Post by Hal Murray
--
These are my opinions, not necessarily my employer's. I hate spam.
Martin Burnicki
2008-04-02 09:50:11 UTC
Permalink
Post by Unruh
Post by Hal Murray
You can build a clock into the network adapter and sync it up to the
system clock.
And how do you sync it up to the system clock without going through the
kernel, etc?
Maybe it's better to do it the other way round, i.e. sync the system clock
to the NIC's timestamp counter.
Post by Unruh
Ie, I have a clock on my gps receiver that is good to 100ns.
It links to the system clock via interrupts and ntp. You have to do
something like that if you are going to sync the clock on your nic to the
system clock as well. Ie, I see no advantage to this procedure over
putting in a cheap gps clock on each of the computers and just using that
( or running a buffered PPS line from one gps receiver to each of the
machines (using some of the spare lines in a Cat5e cable if need be). Sure
sounds cheaper than special nic cards with high accuracy on board clocks!
Just like an NTP server a PTP/IEEE1588 grandmaster can synchronize a huge
number of clients. Depending on the application this can either be the
"official" UTC time, or just the "same" time for all devices.

The target for PTP is more in industrial applications, where you have a
dedicated network environment and more and more embedded devices which can
be synchronized with high accuracy.

For example, the new LXI standard (LAN eXtensions for Instrumentation, see
http://en.wikipedia.org/wiki/LXI), which is a LAN-based successor to the old
GPIB bus, explicitly uses PTP/IEEE 1588 to time-trigger measurements
accurately in spite of network latencies.

There are now NIC chips available which support PTP timestamping, and in
many measurement instruments there is a good oscillator which can be used
both for time stamping and implementation of the system clock. Those
devices normally have special printed circuit boards and dedicated software
which supports that on-board hardware.

So this is different from using PTP to synchronize a standard PC where you
don't know which OS or version of an OS is running, and which types of
hardware (e.g. NICs) are installed, and how to synchronize the system time
to the counter chain on a PCI card or whatever.


Martin
--
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
Martin Burnicki
2008-04-02 08:53:58 UTC
Permalink
Bill,
Post by Unruh
Post by Martin Burnicki
Yes, PTP can yield an accuracy better than 100 ns if both the NICs at the
clients and the server support hardware timestamping of sent/received PTP
packets.
I am still confused. To timestamp you have to read the computer's clock.
That is a software operation-- reading the counter in the cpu, translating
to time, returning the result through the kernel, etc. That has all kinds
of variable latencies,etc. I am having trouble seeing 100ns. Also seeing
the PPS from the hardware clock and its interrupts. Or are you replacing
all of the hardware and software of the system? (new kernel, new interrupt
system, new nics, etc)
Yes, maybe I've been a little bit too unspecific here.

Those timestamps are taken from a local oscillator, e.g. on the NIC board,
and that oscillator can be disciplined with the mentioned accuracy.

Most of those devices also contain a hardware PPS output, so you can use an
oscilloscope to compare the PPS output of the PTP slave to the PPS output
of the PTP grandmaster. This is where you can see what accuracy you
can get using the PTP protocol, and you can also see that you may not get
that accuracy if you use switches which are not PTP-aware.

BTW, we've made some tests and could see that you can yield the same
accuracy with NTP and hardware timestamping.

A different story is of course how you get the accurate time from the NIC's
oscillator/counter to the kernel's system time.

This introduces the latencies you mentioned, and those latencies occur
regardless of whether you are using a NIC with timestamp counter, or GPS
PCI card, or even when evaluating an incoming PPS signal.

The latter also depends strongly on the operating system, i.e. the
resolution of the system clock (1 microsecond or better under most
Unix-like systems, about 16 milliseconds under Windows, except Vista).

Martin
--
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany
Hal Murray
2008-04-01 18:38:01 UTC
Permalink
Post by Martin Burnicki
Yes, PTP can yield an accuracy better than 100 ns if both the NICs at the
clients and the server support hardware timestamping of sent/received PTP
packets.
On the other hand, also *every* network node between the PTP endpoints has
to be PTP-aware and compensate the packet delay it introduces, so you will
probably only get full PTP accuracy in your local network where you have
control over all the equipment.
Suppose I have PTP network adapters but vanilla switches and my
network is lightly loaded.

Can I filter out the delays in the switches by sending 10 packets
and throwing out the ones with long delays? I'd expect the
timings to be a cluster around the case where there was no delay
in the switch and a tail for the ones that encountered some
delay. I think it would be easy to filter out that tail.
--
These are my opinions, not necessarily my employer's. I hate spam.
Unruh
2008-04-01 19:50:18 UTC
Permalink
Post by Hal Murray
Post by Martin Burnicki
Yes, PTP can yield an accuracy better than 100 ns if both the NICs at the
clients and the server support hardware timestamping of sent/received PTP
packets.
On the other hand, also *every* network node between the PTP endpoints has
to be PTP-aware and compensate the packet delay it introduces, so you will
probably only get full PTP accuracy in your local network where you have
control over all the equipment.
Suppose I have PTP network adapters but vanilla switches and my
network is lightly loaded.
Can I filter out the delays in the switches by sending 10 packets
and throwing out the ones with long delays? I'd expect the
timings to be a cluster around the case where there was no delay
in the switch and a tail for the ones that encountered some
delay. I think it would be easy to filter out that tail.
ntp already does. It throws away about 85% of the packets it gets, keeping only
roughly the 1/8 of them with the shortest round-trip times (a real waste
of data, I think, but it certainly solves your problem).
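[Editorial aside: the minimum-delay selection Unruh describes can be sketched as follows. This is a toy model of the clock filter idea, not ntpd's actual code:]

```python
from collections import deque

class ClockFilter:
    """Toy clock filter: keep the last 8 (offset, delay) samples
    and trust the offset measured with the smallest round-trip
    delay, since a short round trip bounds the possible
    asymmetry error in the offset estimate."""
    def __init__(self, size=8):
        self.samples = deque(maxlen=size)

    def add(self, offset, delay):
        self.samples.append((offset, delay))

    def best_offset(self):
        # The sample with minimum delay is the most trustworthy.
        offset, _ = min(self.samples, key=lambda s: s[1])
        return offset
```

A sample burst through a briefly congested switch gets a long delay and is simply never selected, which is how the tail Hal describes gets filtered out.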
Heiko Gerstung
2008-04-02 14:51:24 UTC
Permalink
Post by Hal Murray
Post by Martin Burnicki
Yes, PTP can yield an accuracy better than 100 ns if both the NICs at the
clients and the server support hardware timestamping of sent/received PTP
packets.
On the other hand, also *every* network node between the PTP endpoints has
to be PTP-aware and compensate the packet delay it introduces, so you will
probably only get full PTP accuracy in your local network where you have
control over all the equipment.
Suppose I have PTP network adapters but vanilla switches and my
network is lightly loaded.
Can I filter out the delays in the switches by sending 10 packets
and throwing out the ones with long delays? I'd expect the
timings to be a cluster around the case where there was no delay
in the switch and a tail for the ones that encountered some
delay. I think it would be easy to filter out that tail.
Yes, that is possible. The main problem with vanilla switches is the asymmetric
delays you get when the (store-n-forward) switch starts to queue packets due to
higher network load. This is a rare occurrence in lightly loaded networks.

The PTP standard (IEEE1588) describes how a client (called slave in PTP
terminology) finds out the offset between its own clock and the server's
("master") clock. It does not specify what you do with this offset, i.e. you can
step your clock or apply small corrections in order to keep things running
smoothly. There is nothing that prevents you from applying filters and
statistics on the offset values before adjusting your clock.
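The two-way offset measurement Heiko refers to can be sketched as follows (a simplified illustration assuming a symmetric path; the timestamps are invented):

```python
# Sketch of the basic IEEE 1588 offset computation.
#
#   Sync:      master sends at t1, slave receives at t2
#   Delay_Req: slave sends at t3, master receives at t4
#
# Assuming the network delay is the same in both directions:
#   offset = ((t2 - t1) - (t4 - t3)) / 2
#   delay  = ((t2 - t1) + (t4 - t3)) / 2

def ptp_offset_delay(t1, t2, t3, t4):
    ms = t2 - t1            # master-to-slave transit plus clock offset
    sm = t4 - t3            # slave-to-master transit minus clock offset
    offset = (ms - sm) / 2  # asymmetry between the two cancels the transit
    delay = (ms + sm) / 2   # and averaging cancels the offset
    return offset, delay

# Example: slave clock 5 us ahead of master, one-way delay 40 us
t1 = 1000.0
t2 = t1 + 40e-6 + 5e-6   # arrival stamped with the slave's (fast) clock
t3 = t2 + 100e-6         # slave replies 100 us later, slave time
t4 = t3 - 5e-6 + 40e-6   # arrival stamped with the master's clock

offset, delay = ptp_offset_delay(t1, t2, t3, t4)
print(offset, delay)     # ~5e-6 and ~40e-6
```

Any filtering of outlier offsets (as discussed above for vanilla switches) would happen on the stream of these offset values before the clock is adjusted.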

If you want to play around with PTP, you can get a free (software-only) version
at Sourceforge: ptpd.sf.net

The developers of this implementation state that "PTPd should be able to
coordinate the clocks of your computers within tens of microseconds", which is
around the performance of ntpd. As Martin already said, you can get NTP to be as
accurate as PTP if you add hardware timestamping and you can get PTP to be as
accurate as NTP by taking the hardware timestamping away from it.


Best Regards,
Heiko
Unruh
2008-04-01 16:39:57 UTC
Permalink
Post by Hal Murray
Post by Unruh
Post by starlight
probably have make use of PTP (precision timing protocol).
No idea what that is. If you had wanted super precision you would have put
a GPS onto each machine, I hope.
From the Wikipedia entry on PTP it looks absolutely no different from ntp.
I have no idea what the idea is.
The basic idea is to do the time stamping in hardware deep in
the network adapter. That avoids lots and lots of jitter.
It avoids some jitter. Does that mean that you have to have special
hardware (special network cards, or special network card drivers?)
It does nothing for the 300us jitter I see on my ADSL connected computer.
It might do something for the 10us jitter I see on my ethernet connected
lan-- probably take it down to 8us or something (has anyone tested where
the jitter is-- in the network cards or in the switches?)
Hal Murray
2008-04-01 18:33:57 UTC
Permalink
Post by Unruh
Post by Hal Murray
The basic idea is to do the time stamping in hardware deep in
the network adapter. That avoids lots and lots of jitter.
It avoids some jitter. Does that mean that you have to have special
hardware (special network cards ...
Yes, and so far they are all expensive.
--
These are my opinions, not necessarily my employer's. I hate spam.
David Woolley
2008-04-01 10:19:44 UTC
Permalink
Post by starlight
The generic version of 'ntpd' has some sophisticated code that
handles interpolation. See the source. Power management is
I know that. But the problem is that normal applications just get a
more accurate time for the most recent tick, but still don't see any
times between ticks.
Post by starlight
Pings are consistently 400 microseconds and 'ntpq -p' reports 800
Which is excessive for 1GHz network doing essentially nothing but NTP.
Post by starlight
probably have make use of PTP (precision timing protocol).
Still très expensive.
I assume by PTP you mean ethernet cards that extract a timestamp with a
very low latency. I doubt that this will help with lost interrupts. If
you really want extreme accuracy for applications you need to:

1) use hardware that maintains a high resolution time completely
independent of the software and is directly readable by application code
(I'm not sure if Windows supports such direct reading).

2) you will need to add code to the device drivers that actually
communicate the real-world events that you are interested in to the
software, to read from that special clock very early in their ISR
(better still devices that will read it using DMA).
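The tick-interpolation idea mentioned at the top of this post (an application seeing only tick-granularity time unless something fills in the gap) can be sketched like this (an illustration of the general technique, not ntpd's Windows interpolation code; all numbers are invented):

```python
# Sketch of tick interpolation: system time only advances at each timer
# tick, so between ticks we add the elapsed reading of a free-running
# high-resolution counter that was latched at the moment of the last tick.

class InterpolatedClock:
    def __init__(self, tick_time, counter_at_tick):
        # Latch the counter value observed when the tick fired
        self.tick_time = tick_time
        self.counter_at_tick = counter_at_tick

    def now(self, counter):
        # Current time = last tick's time + counter elapsed since then
        return self.tick_time + (counter - self.counter_at_tick)

# Last tick happened at t=100.0 s with the counter reading 7.0000 s
clk = InterpolatedClock(tick_time=100.0, counter_at_tick=7.0000)
print(clk.now(7.0042))  # a time in between ticks, ~100.0042
```

The hardware clock David describes would play the role of the counter here, read directly by application code (or very early in an ISR) instead of through the tick-granularity system time.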
Danny Mayer
2008-04-03 03:04:30 UTC
Permalink
Post by David Woolley
Post by starlight
The generic version of 'ntpd' has some sophisticated code that
handles interpolation. See the source. Power management is
I know that. But the problem is that normal applications just get a
more accurate time for the most recent tick, but still don't see any
times between ticks.
Post by starlight
Pings are consistently 400 microseconds and 'ntpq -p' reports 800
Which is excessive for 1GHz network doing essentially nothing but NTP.
Post by starlight
probably have make use of PTP (precision timing protocol).
Still très expensive.
I assume by PTP you mean ethernet cards that extract a timestamp with a
very low latency. I doubt that this will help with lost interrupts. If
1) use hardware that maintains a high resolution time completely
independent of the software and is directly readable by application code
(I'm not sure if Windows supports such direct reading).
I suspect that what is being discussed here is IEEE1588 which can
timestamp packets via the hardware. It requires device driver support
and a number of other changes to NTP to work with it.

Danny
starlight
2008-03-31 01:05:31 UTC
Permalink
Are those on the same day?
Yes, same day. Uncorrelated to anything I can identify
or each other. Same story on all the boxes. Running
a hefty multi-system compile with heavy NFS and Samba
traffic does not produce these events, though it disturbs
the Windows boxes slightly when CPU goes to 100%.
Which "linux" and which "windows" are those graphs since you
have 2 linux and 2 windows clients.
That's the dual-core AMD 2.4GHz Athlon Tyan mobo whitebox
running Centos 4.5 SMP kernel. Similar results on the
Dell Dimension 2400 2.4GHz Intel P4 running Centos 4.5
mono-processor kernel.

Windows is a dual-core 3.4GHz Pentium D Tyan mobo whitebox
running 2003 R2 SP2 standard server.
As I said, seeing the
peerstats files would be helpful (offset and roundtrip)
Might try them later, but I can't believe a high-quality
SMC switch is causing multi-millisecond delays. Just not
possible. Pings are all about 400 microseconds, consistent
but slightly different on each system. Round trip is
800 microseconds. Attaching the output from a bulk 'ntpq -p'
'ntptrace' script I have below. Note that's 'ntptrace'
version 4.1 since the 4.2 script has useless offset info.
Also these graphs seem to have cut off the spikes. Are the
spikes actually higher or is that an illusion?
Higher. Sometimes 1ms, sometimes 5-6ms.
(Note the spikes are hundreds of usec, not many msec)
That would be the ~1ms example, check out the other one.

remote refid st t when poll reach delay offset jitter
==============================================================================
Endrun CDMA
LOCAL(0) LOCAL(0) 10 l 18 64 377 0.000 0.000 0.015
*HOPF_S(0) .CDMA. 0 l 6 16 377 0.000 0.000 0.015
Centos 32
*eachna .CDMA. 1 u 3 16 377 0.683 -0.004 0.009
-tock.usno.navy. .USNO. 1 u 452 1024 377 20.678 1.432 2.822
+navobs1.wustl.e .GPS. 1 u 479 1024 377 50.136 -1.513 0.164
+time.nist.gov .ACTS. 1 u 471 1024 377 66.528 -1.708 0.156
-tick.ucla.edu .GPS. 1 u 432 1024 377 87.372 3.296 0.085
Ultra 10
*172.29.87.3 .CDMA. 1 u 11 16 377 0.869 -0.016 0.042
172.29.87.15: stratum 2, offset -0.000007, synch distance 0.00783
172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'
Ultra 80
*172.29.87.3 .CDMA. 1 u 4 16 377 0.942 -0.012 0.012
172.29.87.17: stratum 2, offset -0.000038, synch distance 0.00685
172.29.87.3: stratum 1, offset -0.000017, synch distance 0.00038, refid 'CDMA'
44p
*172.29.87.3 .CDMA. 1 u 13 16 377 0.809 -0.001 0.016
172.29.87.13: stratum 2, offset -0.000014, synch distance 0.00627
172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'
Centos 64
*172.29.87.3 .CDMA. 1 u 12 16 377 0.664 0.003 0.487
172.29.87.19: stratum 2, offset -0.000009, synch distance 0.00720
172.29.87.3: stratum 1, offset -0.000018, synch distance 0.00038, refid 'CDMA'
W2K3 64
*172.29.87.3 .CDMA. 1 u 4 16 377 0.734 0.053 0.014
172.29.87.20: stratum 2, offset -0.000060, synch distance 0.00650
172.29.87.3: stratum 1, offset -0.000019, synch distance 0.00038, refid 'CDMA'
XP 32 laptop
*172.29.87.3 .CDMA. 1 u 7 16 377 0.819 0.468 0.256
172.29.87.12: stratum 2, offset -0.000173, synch distance 0.00655
172.29.87.3: stratum 1, offset -0.000017, synch distance 0.00038, refid 'CDMA'
Ryan Malayter
2008-04-01 21:58:49 UTC
Permalink
Might try them later, but I can't believe a high-quality
SMC switch is causing multi-millisecond delays. Just not
Do you have access to a different (Cisco, Extreme, Foundry, or HP)
switch for testing? If not, try a crossover cable between the NTP
server and one of the systems. If the problem disappears, you'll know
the switch was the culprit.

We've seen lots of strange issues with less expensive switches
(NetGear, similar to SMC) that just don't happen with the more
expensive brands. You often get what you pay for.
Starlight Binnacle
2011-09-10 19:33:45 UTC
Permalink
Original author of the post here. Just stumbled upon this thread (three years on) as Google has assigned it a relatively high rank.

An update is in order.

Discovered not long after posting the question that the Endrun Praecis Cntp device has some sort of bug. On occasion the 'ntpd' daemon running on it returns an insane reply that smacks the client NTP filter logic upside the head and sends the NTP client skewing off wildly. Takes several subsequent polls for things to settle back down. The problem was apparent once a graph of statistics was plotted.

Wrote a short patch that filters out the crazy response packets and this fixes the problem. Still running perfectly to this day.
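The kind of sanity filter described here can be sketched as follows (an illustration of the idea only, not the actual patch; threshold and values are invented):

```python
# Sketch of a sanity filter for server replies: reject any reply whose
# implied offset jumps far away from the recent samples, so a single
# insane packet never reaches the clock discipline.

def sane(offset, recent, max_jump=0.005):
    """Accept offset (seconds) only if within max_jump of the recent median."""
    if not recent:
        return True            # nothing to compare against yet
    s = sorted(recent)
    median = s[len(s) // 2]
    return abs(offset - median) <= max_jump

# Recent offsets are all around +100 us
recent = [0.00010, 0.00012, 0.00009, 0.00011]
print(sane(0.00013, recent))   # normal reply, accepted
print(sane(0.85, recent))      # insane reply, dropped before it does damage
```

Dropping the bad reply means the client's filter state is never poisoned, so there is nothing to "settle back down" from.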

If anyone wants the patch just email me at < starlight at binnacle dot cx >.