TimeKeeping


Kernel Timekeeping

Internally, Linux maintains the time as seconds and nanoseconds.

The kernel time is UTC. Converting to local time is a UI issue.
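
For example, these commands all read the same kernel clock; only the presentation differs (the time zone name is just an illustration):

 date -u                    # the kernel's time, displayed as UTC
 date                       # the same instant, converted using your TZ setting
 TZ=America/New_York date   # the same instant again, in another zone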

Most PCs have 2 crystals that can be used for timekeeping.

  • The crystal that feeds the CPU clock is usually available through some sort of cycle-counting register, for example the TSC on x86 systems. When you want to know what time it is, you read that register, subtract the previous reading, scale by the CPU clock frequency to convert cycles to seconds, add the result to the time in memory, and return that.
  • There is a 32 kHz crystal that drives the TOY clock, aka the hwclock. It's battery backed, so your system knows the time when you first turn it on. That clock generates scheduler interrupts HZ times per second. (HZ is a kernel compile-time parameter.) On every scheduler interrupt, the kernel bumps the time in memory by 1/HZ seconds. That tick is fairly coarse, so most implementations use the cycle counter to interpolate between ticks.
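
As an aside, on kernels new enough to expose it through sysfs (an assumption about the kernel build, not something XO-specific) you can check which clock source is actually in use:

 cat /sys/devices/system/clocksource/clocksource0/current_clocksource
 cat /sys/devices/system/clocksource/clocksource0/available_clocksource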

That's a rough description of the two clocks. Here are some details.

  • The XO kernel is currently using the TSC. Unfortunately, there is a bug (not XO-specific) in the CPU calibration code: it doesn't get a consistent answer from boot to boot. It doesn't vary by much, but it's easy to notice if you are trying to keep good time, especially if you reboot frequently.
  • The crystals as shipped from the factory are not right on. They are typically off by a few parts per million (PPM). My XO is off by ~6 PPM; I have two PCs that are off by >100 PPM. 1 PPM works out to about 0.6 seconds per week. Most kernels have a fudge factor to correct for that, and it makes a huge difference (at least if you are a timekeeping nut). NTP calls this parameter the "drift".
    • That drift is temperature dependent. It's (very) roughly 1 PPM per degree C.
    • You can use tickadj to tweak it by hand. You can compute the value with the following recipe (there's a worked example after this list):
      • Reboot the system (to make sure things are clean)
      • Set the clock by hand.
      • Wait a week.
      • Compute the correction.
  • Missed interrupts cause problems if you are using the scheduler interrupts to bump the clock. That often happens if HZ is set to 1000; HZ=250 usually works.
  • Using the TSC for timekeeping gets complicated if the CPU speed is adjusted on the fly to save power or avoid overheating.
  • I don't know how turning off the CPU to save power is going to interact with timekeeping.
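
Here is one way to do the arithmetic for that recipe. The server, the scratch file, and the numbers are illustrations only (nothing here is XO-specific); the point is offset divided by elapsed time, scaled to PPM.

 # Right after setting the clock, record when you did it:
 date +%s > /root/clock-set-at
 # About a week later, measure how far off the clock is without touching it
 # (ntpdate -q only queries; any accurate reference will do):
 ntpdate -q 0.pool.ntp.org      # reports something like "offset -0.42 sec"
 # correction in PPM = (offset in seconds / elapsed seconds) * 1,000,000
 echo "-0.42 / ($(date +%s) - $(cat /root/clock-set-at)) * 1000000" | bc -l
 # roughly -0.7 PPM if the elapsed time was one week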

Keeping your Time Correct

Even if you start with the correct time, your system clock will drift, just like your watch drifts.

There are several general approaches to keeping your clock correct.

  • You can set the time by hand whenever you notice that your clock is off too far.
  • You can use sntp (ntpdate is deprecated) to set the clock at boot time and then repeat occasionally (hours, days, or weeks) after that via a cron job (see the example after this list). This is generally good enough for human interaction and has minimal system load.
  • The normal approach for most *nix systems is to run ntpd. It's included in most distributions. I think it's in the XO but not started at boot time. To turn it on:
 /sbin/chkconfig --add ntpd
 /sbin/chkconfig --level 35 ntpd on

but see below.
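
If you go the cron route instead, the crontab entry might look something like the line below. The schedule and server are placeholders, and the exact sntp option that sets (rather than merely queries) the clock varies between sntp versions, so check sntp(1) on your build before trusting the -s shown here.

 # Edit root's crontab with "crontab -e" and add one line, for example:
 # sync once a night at 03:17 against a pool server
 17 3 * * *  /usr/sbin/sntp -s 0.pool.ntp.org >/dev/null 2>&1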


ntpd

ntpd is a typical Unix daemon. It is normally started at boot time. It talks to several NTP servers on the network, does lots of sanity checking, and adjusts the time on your local system. It also "disciplines" the clock; that is, it adjusts the "drift" fudge factor so that your system keeps pretty good time if you turn off ntpd or ntpd can't contact any servers.
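
Once ntpd is running you can watch what it is doing with ntpq; this is standard ntp tooling, nothing XO-specific:

 # Show the servers ntpd is talking to, one line per peer.
 # "st" is the server's stratum, "poll" the current polling interval
 # in seconds, and "offset" the measured clock error in milliseconds.
 /usr/sbin/ntpq -p

The peer marked with a * in the first column is the one ntpd has actually selected to synchronize to.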

ntpd is both a client and a server.

NTP servers are organized into levels called strata. Stratum 1 servers have atomic clocks, GPS receivers, or some other magic to tell them the time. Stratum 2 servers get their time from stratum 1 servers, and so on.

ntpd adjusts the polling interval to minimize the load on the servers. It starts at 64 seconds and stretches out to 1024 seconds.

ntpd assumes that the path delays to the servers are symmetric. It works better if the servers are close. (That's close in network performance rather than kilometers.)

Normally, ntpd makes small adjustments to the local clock by slewing it: the clock rate is adjusted by up to 500 PPM, temporarily making the ticks slightly bigger or smaller so the clock runs a bit fast or slow until the error is gone. Most applications don't notice, and it avoids trouble with time going backwards. If the change is too big (more than 128 ms by default), ntpd will step the clock instead. If the correction is huge (more than about 1000 seconds), ntpd will give up and exit.
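
If the clock can start out wildly wrong (say, a dead TOY-clock battery), the usual cure is ntpd's -g option, which permits one arbitrarily large correction at startup instead of exiting. On Fedora-style systems that option normally goes in /etc/sysconfig/ntpd; this is a sketch, not the XO's shipped configuration:

 # /etc/sysconfig/ntpd (example) -- keep whatever options your distribution
 # already puts here and add -g:
 OPTIONS="-g"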

Using ntpd requires public servers. They used to be provided as a public service by various schools, companies, and government agencies.

There are major problems with zillions of users overwhelming the major public servers. (This is complicated by brain-damaged implementations; see http://en.wikipedia.org/wiki/NTP_server_misuse_and_abuse .)

There is a pool project to set up lots of secondary servers and make them available via DNS. Most Linux and Unix distributions now come with ntp.conf set up to use the pool. The main problem with using the pool is that it doesn't magically pick good nearby servers; you are likely to get one with a long network delay.
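
For reference, the pool entries in ntp.conf look like the lines below. Using a continent or country zone instead of the global pool is the usual way to get servers that are closer to you; the zone names here are examples.

 # /etc/ntp.conf (fragment): the global pool zones
 server 0.pool.ntp.org
 server 1.pool.ntp.org
 server 2.pool.ntp.org
 # or, to prefer closer servers, a continent or country zone, for example:
 # server 0.us.pool.ntp.org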

For under $100 you can get a GPS receiver and run your own stratum 1 time server.


ntpd on XO

Bugs:

  • The CPU calibration factor doesn't get initialized correctly. (See above.)
  • ntpd gets confused if there are no network interfaces up when it starts, which is the situation NetworkManager creates. (The fix is simple. See me if you need it. Or just restart ntpd by hand: /sbin/service ntpd start)
  • ntpd expects to store the "drift" in the file system. XOs come set up to put that file in RAM, where it gets deleted on reboot. This means that ntpd takes several hours to stabilize each time you boot it. (A possible workaround is sketched below.)
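
A possible workaround, sketched under the assumption that the problem is simply where /etc/ntp.conf points its driftfile: aim it at a directory that survives a reboot and that the ntp user can write. The path below is an example; check what your ntp.conf currently says.

 # /etc/ntp.conf (fragment) -- example path only, adjust for your system:
 driftfile /var/lib/ntp/ntp.drift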


Hal Murray, Nov 14, 2007