A drifting clock is one of the cruelest bugs to debug. Logs from two servers do not interleave correctly, Kerberos tickets are rejected, TLS certificates appear expired or not yet valid, distributed databases fail to reach quorum, and your monitoring dashboards lie. The fix, keeping every server within a few milliseconds of UTC, is one of the easiest reliability investments you can make.
The state of NTP in 2026
Three implementations dominate: chrony (default on the RHEL family, packaged for Ubuntu 20.04+), systemd-timesyncd (lightweight SNTP client on Debian/Ubuntu desktop), and ntpd (legacy, still widely deployed). For servers, chrony is the right default: it converges faster after a network outage and handles intermittent connectivity (laptops, VMs that suspend) gracefully.
Installing and configuring chrony
sudo apt install chrony # Debian/Ubuntu
sudo dnf install chrony # RHEL/Fedora
sudo systemctl enable --now chronyd # the unit is named "chrony" on Debian/Ubuntu
Edit /etc/chrony/chrony.conf (Debian) or /etc/chrony.conf (RHEL):
pool 2.pool.ntp.org iburst
server time.cloudflare.com iburst nts # NTS-secured upstream
makestep 1.0 3
rtcsync
leapsectz right/UTC
driftfile /var/lib/chrony/chrony.drift
logdir /var/log/chrony
iburst sends a quick burst of requests on startup so the clock converges in seconds, not minutes. makestep 1.0 3 says: if the clock is more than 1 s off, step it instead of slewing, but only during the first 3 clock updates. After that chrony only slews, so logs do not jump backward.
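To make the makestep rule concrete, here is a minimal shell sketch of the decision it describes. makestep_action is a hypothetical helper written for illustration, not part of chrony; it only models the rule as stated (threshold 1.0 s, limit 3 updates by default) and never talks to chronyd.

```shell
# makestep_action OFFSET_SECONDS UPDATE_NUMBER [THRESHOLD] [LIMIT]
# Prints "step" or "slew". Hypothetical helper: models the makestep rule only.
makestep_action() {
    awk -v off="$1" -v n="$2" -v thr="${3:-1.0}" -v lim="${4:-3}" 'BEGIN {
        if (off < 0) off = -off          # only the magnitude of the offset matters
        if (off > thr && n <= lim)       # large error, still within the first LIMIT updates
            print "step"
        else                             # small error, or past the limit: adjust gradually
            print "slew"
    }'
}
```

For example, makestep_action 2.5 1 prints step, while makestep_action 2.5 4 prints slew because the fourth update is past the limit.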
Using NTS for tamper resistance
Plain NTP is unauthenticated and trivial to spoof on a hostile network. NTS (Network Time Security, RFC 8915) adds a TLS-protected key exchange and authenticates the NTP packets that follow. Cloudflare, Netnod, and others offer free public NTS servers:
server time.cloudflare.com iburst nts
server nts.netnod.se iburst nts
Verify NTS is active with chronyc -N authdata: the "Mode" column should read NTS, and "KeyID", "KLen", and "Cook" (the number of cookies held) should be non-zero.
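This check can be scripted. The sketch below assumes the column order that chronyc prints for authdata (Name, Mode, KeyID, Type, KLen, Last, Atmp, NAK, Cook, CLen); nts_active is a hypothetical helper name, and the layout is worth confirming against your chrony version before relying on it.

```shell
# nts_active: reads `chronyc -N authdata` output on stdin.
# Succeeds only if at least one NTS source exists and every NTS source
# has a key (KeyID, KLen non-zero) and at least one cookie (Cook non-zero).
# Assumed columns: 1=Name 2=Mode 3=KeyID 4=Type 5=KLen 6=Last 7=Atmp 8=NAK 9=Cook 10=CLen
nts_active() {
    awk '
        $2 == "NTS" {                    # header and separator lines never match
            nts++
            if ($3 + 0 == 0 || $5 + 0 == 0 || $9 + 0 == 0) bad++
        }
        END {
            if (nts > 0 && bad == 0) exit 0
            exit 1
        }
    '
}
```

Typical use would be chronyc -N authdata | nts_active || echo "NTS not established", run from a cron job or health check.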
Verifying synchronization
chronyc tracking # offset, frequency, last update
chronyc sources -v # per-source state, stratum, reach
chronyc sourcestats # statistics for each source
timedatectl status # system view; "System clock synchronized: yes"
Healthy chronyc tracking output: System time within ±1 ms of NTP time, RMS offset under 0.5 ms, "Leap status: Normal." If "Reference ID" is 00000000 or "Stratum" is 0, sync is not happening.
What to monitor
Push two metrics into your time-series store:
- System time offset from reference: alert when |offset| > 100 ms for sustained periods.
- Last update age: alert if no successful sync in the last 30 minutes.
Prometheus's node_exporter exposes node_timex_offset_seconds; a chrony textfile-collector script or chrony_exporter adds per-source detail.
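If you are not running Prometheus yet, the two thresholds can be approximated with a small script. This is only a sketch: time_alerts is a hypothetical helper that takes the offset and the seconds since the last successful sync as arguments (how you obtain them is up to you), and it evaluates a single sample, so the "sustained" part still belongs in whatever scheduler or alerting system calls it.

```shell
# time_alerts OFFSET_SECONDS SECONDS_SINCE_LAST_SYNC
# Prints "OK" or one line per breached threshold; exit status 1 on any breach.
time_alerts() {
    awk -v off="$1" -v age="$2" 'BEGIN {
        if (off < 0) off = -off                       # alert on the magnitude of the offset
        status = 0
        if (off > 0.100) { print "ALERT offset"; status = 1 }   # more than 100 ms off
        if (age > 1800)  { print "ALERT stale";  status = 1 }   # no sync for 30 minutes
        if (status == 0) print "OK"
        exit status
    }'
}
```

For example, time_alerts 0.002 60 prints OK, and time_alerts -0.25 60 prints ALERT offset.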
Hardware clock and timezone
A Linux host has two clocks: the system clock (seeded from the RTC at boot, maintained by the kernel thereafter) and the hardware clock (the BIOS RTC). Set the timezone and the RTC convention explicitly:
sudo timedatectl set-timezone UTC
sudo timedatectl set-local-rtc 0 # RTC in UTC, recommended for servers
sudo hwclock --systohc # write current time to RTC
Always run servers in UTC. Display local time per user via the TZ environment variable; storing logs in UTC prevents the twice-yearly DST off-by-one-hour confusion.
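To confirm a host follows this policy, you can parse the machine-readable output of timedatectl show. A minimal sketch, assuming the KEY=VALUE lines that timedatectl show prints on reasonably recent systemd versions; clock_policy_ok is a hypothetical helper name.

```shell
# clock_policy_ok: reads `timedatectl show` output (KEY=VALUE lines) on stdin.
# Succeeds only if the timezone is UTC and the RTC is kept in UTC (LocalRTC=no).
clock_policy_ok() {
    awk -F= '
        $1 == "Timezone" { tz = $2 }
        $1 == "LocalRTC" { rtc = $2 }
        END {
            if ((tz == "UTC" || tz == "Etc/UTC") && rtc == "no") exit 0
            exit 1
        }
    '
}
```

Usage: timedatectl show | clock_policy_ok || echo "clock policy violated on $(hostname)".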
Air-gapped and isolated networks
If servers cannot reach the internet, build an internal time hierarchy:
- One or two servers reach a GPS receiver, an authoritative internal NTP, or a public NTP via a forwarding proxy.
- Configure them with local stratum 5 and allow 10.0.0.0/8 in chrony.conf.
- All other hosts use those internal servers.
# Internal authoritative server
local stratum 5
allow 10.0.0.0/8
server time.cloudflare.com iburst nts
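The client side of this hierarchy is short enough to generate from a provisioning script. The sketch below is one way to do it, not a prescribed layout: render_client_conf is a hypothetical helper, the hostnames in the usage line are placeholders, and the trailing directives simply reuse the ones from the configuration shown earlier.

```shell
# render_client_conf HOST [HOST...]
# Prints a chrony.conf for an internal client that uses only the given servers.
render_client_conf() {
    for host in "$@"; do
        printf 'server %s iburst\n' "$host"      # one server line per internal time source
    done
    printf '%s\n' \
        'makestep 1.0 3' \
        'rtcsync' \
        'driftfile /var/lib/chrony/chrony.drift'
}
```

For example, render_client_conf ntp1.internal.example ntp2.internal.example prints two server lines followed by the three shared directives; redirect the output to the chrony.conf path for your distribution.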
Container considerations
Containers share the host clock, so never run NTP inside a container. Time inside a container is exactly the host time. If you see clock drift inside a container, fix the host. Avoid --privileged just to "let the container set time."
Troubleshooting
- "sources show ?". The host cannot reach the configured peers on UDP/123 (NTS sources also need TCP/4460 for the key exchange). Check the egress firewall.
- Stratum 16, no leader. Chrony has no source it considers trustworthy yet. Wait 64 s for the first sample, or check chronyc activity.
- Clock jumped backward and broke a database. Disable makestep after initial sync (set the second number to 0) so chrony only ever slews.
- VM clock drifts after host suspend. Enable kvm-clock or hyperv-clock paravirt drivers; set kernel.timer_migration=0.
Quick checklist
- chrony installed and enabled.
- At least two pool entries plus one specific NTS-capable server.
- Hardware clock set to UTC.
- Timezone set to UTC.
- Monitoring on offset and last-sync age.
Spend 15 minutes hardening NTP on every server, push a single set of monitoring rules, and you will never again diagnose a "ghost" Kerberos failure or stare at out-of-order log lines wondering which server is lying about the timestamp.