An Active Directory forest that does not replicate is on a countdown to a real outage: every hour without replication is more drift, more inconsistency, more chance of authentication failures, ACL surprises, and the kind of bad-day scenarios that take days to recover from. The two tools you reach for are repadmin and dcdiag, and the workflow has been the same for fifteen years. This is the practical reference: the commands you actually run, the error codes you actually see, and the recovery procedures for the bad cases (lingering objects, USN rollback).
Free PDF cheat sheet at the end.
The standard triage workflow
- repadmin /replsum - quick health summary of every DC, every partition.
- repadmin /showrepl - detailed last-replication status per DC.
- dcdiag /v - comprehensive health of the local DC.
- dcdiag /test:DNS - DNS-specific health (most common root cause).
- Identify the failing partition + the error code.
- Fix the underlying cause (DNS, time, network, expired tombstone).
- repadmin /syncall /AdeP - force resync.
repadmin /showrepl
# Run on the DC you want to inspect
repadmin /showrepl
# Or from anywhere, against a remote DC
repadmin /showrepl dc01.contoso.com
Look for "Last attempt" with errors, "Last success" timestamps that are stale (anything more than the replication schedule + a buffer), and any partition listed at the top of the output.
DSA Options: IS_GC
Site Options: (none)
DSA object GUID: 1234abcd-...
DSA invocationID: 5678efgh-...
==== INBOUND NEIGHBORS ======================================
DC=contoso,DC=com
SiteB\DC02 via RPC
DSA object GUID: ...
Last attempt @ 2026-04-17 10:20:01 was successful.
"was successful" everywhere = healthy. Anything else = drill down.
repadmin /replsum
repadmin /replsum /bysrc /bydest /sort:delta
Output is two tables, one by source DC and one by destination, with delta (time since last attempt), fails/total, and the latest error. Sorted by delta, this is your highest-signal one-screen health view.
Source DSA largest delta fails/total %% error
DC01 :02:15 0 / 10 0
DC02 :08:31 12 / 18 67 (8606) Insufficient attributes
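If you want the summary as objects rather than text, a small parser is enough. The sketch below is an assumption-heavy illustration: the function name `ConvertFrom-ReplSum` is hypothetical, and the regex is modelled only on the sample rows shown above, so adjust it if your repadmin version formats the delta column differently.

```powershell
# Sketch: turn `repadmin /replsum` rows into objects.
# The pattern is an assumption based on the sample rows above.
function ConvertFrom-ReplSum {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyString()]
        [string[]]$Line   # raw output lines from repadmin /replsum
    )
    $pattern = '^\s*(?<dsa>\S+)\s+(?<delta>\S.*?)\s+(?<fails>\d+)\s*/\s*(?<total>\d+)\s+(?<pct>\d+)\s*(?<error>.*)$'
    foreach ($l in $Line) {
        # Header, banner, and blank lines do not match and are skipped
        if ($l -match $pattern) {
            [pscustomobject]@{
                Dsa     = $Matches['dsa']
                Delta   = $Matches['delta']
                Fails   = [int]$Matches['fails']
                Total   = [int]$Matches['total']
                Percent = [int]$Matches['pct']
                Error   = $Matches['error'].Trim()
            }
        }
    }
}

# Usage on a machine with the AD DS tools installed:
#   ConvertFrom-ReplSum -Line (repadmin /replsum /bysrc /bydest /sort:delta) |
#       Where-Object Fails -gt 0
```

Filtering on `Fails -gt 0` leaves only the DCs that need a closer look with repadmin /showrepl.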
dcdiag
# Comprehensive (long output)
dcdiag /v
# Just the pass/fail result lines (use "failed test" as the pattern to show failures only)
dcdiag /v | Select-String "passed|failed"
# Specific test
dcdiag /test:DNS /e
dcdiag /test:Replications /e
dcdiag /test:Advertising /e
# Run from a workstation, against a specific DC
dcdiag /s:dc01.contoso.com /test:Replications
/e means "all DCs in the enterprise". Useful for the big picture; slow on large forests.
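On a large forest the /e output runs to thousands of lines, so it can help to reduce it to a list of failures. The following is a minimal sketch, assuming English-language dcdiag output where each result line reads "<target> passed test <name>" or "<target> failed test <name>"; the function name `Get-DcdiagFailure` is hypothetical.

```powershell
# Sketch: extract "failed test" result lines from dcdiag output.
# Assumes English output; localized builds word the result lines differently.
function Get-DcdiagFailure {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyString()]
        [string[]]$Line   # raw output lines from dcdiag
    )
    foreach ($l in $Line) {
        if ($l -match '(?<target>\S+)\s+failed test\s+(?<test>\S+)') {
            [pscustomobject]@{
                Target = $Matches['target']   # DC, domain, or forest name
                Test   = $Matches['test']     # dcdiag test name
            }
        }
    }
}

# Usage:
#   Get-DcdiagFailure -Line (dcdiag /e) | Group-Object Test | Sort-Object Count -Descending
```

Grouping by test name shows quickly whether the forest has one systemic problem (for example DNS everywhere) or a single sick DC.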
Force a replication
# Pull from one specific source partner: <destDC> <sourceDC> <partition>
repadmin /replicate dc02.contoso.com dc01.contoso.com "DC=contoso,DC=com"
# Push changes from the local DC out to its partners, all partitions, across sites
repadmin /syncall /AdeP
# Pull to the local DC from its partners, all partitions, across sites
repadmin /syncall /Aed
Switches: A=all partitions, e=enterprise (cross-site), P=push (the default is pull), d=identify by DN, q=quiet.
Common error codes
| Code | Meaning | Fix |
|---|---|---|
| 1908 | Could not find a domain controller | DNS: check SRV records and IP routing |
| 1722 | RPC server unavailable | Firewall / network / dynamic RPC ports |
| 1727 | RPC call failed and did not execute | Firewall or network device dropping RPC; check TCP/135 and the RPCSS service |
| 1753 | No more endpoints from endpoint mapper | RPC dynamic port range blocked |
| 5 | Access is denied | Time skew >5 min, or trust broken |
| 8418 | Schema mismatch between the servers involved | Get the Schema partition replicating first, then retry |
| 8453 | Replication access was denied | Check replication permissions on the partition; remove dead DC metadata; reset trust |
| 8606 | Insufficient attributes (lingering object) | repadmin /removelingeringobjects |
| 8614 | Tombstone lifetime exceeded | Reinstall failing DC from clean state |
| 8456 | Source server is rejecting replication requests | Check repadmin /options on the source for DISABLE_OUTBOUND_REPL (often set after a USN rollback) |
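When failures arrive as bare numbers in a script, a lookup keeps the first check next to the code. This is only a sketch: the `$ReplErrorHint` table and the `Get-ReplErrorHint` function are hypothetical names, and the hints cover just a subset of the codes in the table above.

```powershell
# Sketch: map a replication error code to the first thing to check.
# Hints are copied from a subset of the table above; extend as needed.
$ReplErrorHint = @{
    5    = 'Check time skew (over 5 min) and the trust'
    1722 = 'Check firewall, network path, and dynamic RPC ports'
    1753 = 'Check that the RPC dynamic port range is not blocked'
    1908 = 'Check DNS: SRV records and IP routing'
    8606 = 'Lingering objects: repadmin /removelingeringobjects'
    8614 = 'Tombstone lifetime exceeded: rebuild the failing DC'
}

function Get-ReplErrorHint {
    param([Parameter(Mandatory)][int]$Code)
    if ($ReplErrorHint.ContainsKey($Code)) {
        $ReplErrorHint[$Code]
    }
    else {
        # net helpmsg prints the Windows message text for a numeric error code
        "No hint recorded for $Code; try: net helpmsg $Code"
    }
}

# Usage:
#   Get-ReplErrorHint -Code 8606
```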
Lingering objects
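A lingering object is an object that was deleted elsewhere in the forest while one DC was out of replication for longer than the tombstone lifetime. That DC never received the deletion, still holds a live copy, and now tries to replicate it to peers that have long since garbage-collected it. Symptom: error 8606 in repadmin /showrepl. Fix: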
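# 1. Find them - reference DC must be a known-good GC
repadmin /removelingeringobjects <destDC> <referenceDCguid> "DC=contoso,DC=com" /advisory_mode
# 2. After confirming, run without /advisory_mode to actually remove
repadmin /removelingeringobjects <destDC> <referenceDCguid> "DC=contoso,DC=com"
destDC is the DC suspected of holding lingering objects. referenceDCguid is the DSA object GUID from repadmin /showrepl on the healthy DC. In advisory mode nothing is deleted; the findings are written to the Directory Service event log on destDC.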
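The command takes three positional arguments (the DC to clean, the reference DC's DSA object GUID, and the partition), and it is easy to drop /advisory_mode by accident. A hedged sketch of a wrapper follows: `Get-DsaObjectGuid` and `New-LingeringObjectArgs` are hypothetical helper names, the first needs the ActiveDirectory module, and the second only builds the argument list so that advisory mode stays on unless you explicitly ask for removal.

```powershell
# Sketch: resolve the DSA object GUID of the reference DC.
# The DSA object is the DC's NTDS Settings object in the configuration partition.
function Get-DsaObjectGuid {
    param([Parameter(Mandatory)][string]$Server)
    $dc = Get-ADDomainController -Identity $Server
    (Get-ADObject -Identity $dc.NTDSSettingsObjectDN).ObjectGUID
}

# Sketch: build the repadmin argument list; advisory mode is the default.
function New-LingeringObjectArgs {
    param(
        [Parameter(Mandatory)][string]$SuspectDc,     # DC that may hold lingering objects
        [Parameter(Mandatory)][guid]$ReferenceGuid,   # DSA object GUID of the healthy DC
        [Parameter(Mandatory)][string]$Partition,     # e.g. DC=contoso,DC=com
        [switch]$Remove                               # omit to stay in advisory mode
    )
    $list = @('/removelingeringobjects', $SuspectDc, $ReferenceGuid.ToString(), $Partition)
    if (-not $Remove) { $list += '/advisory_mode' }
    $list
}

# Usage (dc01 = healthy reference, dc02 = suspect; both names are placeholders):
#   $guid    = Get-DsaObjectGuid -Server dc01
#   $repArgs = New-LingeringObjectArgs -SuspectDc dc02.contoso.com -ReferenceGuid $guid -Partition 'DC=contoso,DC=com'
#   & repadmin @repArgs
```

Repeat the call for every partition the suspect DC hosts; lingering objects in one partition say nothing about the others.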
USN rollback
Most often happens when someone restores a DC from a snapshot. Replication stops. Symptom: event 2095 on the offending DC, replication paused. The fix is unfortunately blunt:
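- Demote the rolled-back DC (force demote with dcpromo /forceremoval if needed; on Server 2012 and later the equivalent is Uninstall-ADDSDomainController -ForceRemoval).
- Run ntdsutil metadata cleanup on a healthy DC to remove the rolled-back DC's leftover objects.
- Reinstall Windows on the rolled-back box and promote it again as a fresh DC.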
Prevention: never restore a DC from a snapshot. Use AD-aware backup tools that handle the InvocationID correctly.
PowerShell wrappers
The ActiveDirectory module has cleaner alternatives for some of these:
# Health summary in one cmdlet
Get-ADReplicationFailure -Target contoso.com -Scope Domain |
    Format-Table Server, Partner, FailureCount, FirstFailureTime, LastError
# Site / partner topology
Get-ADReplicationSiteLink -Filter * | Select-Object Name, ReplicationFrequencyInMinutes, Cost
Get-ADReplicationConnection -Filter *
# Force replication (PowerShell flavour)
Sync-ADObject -Object "CN=jsmith,OU=Users,DC=contoso,DC=com" -Source dc01 -Destination dc02
Cheat sheet
The full triage flow + every error code + recovery on a single PDF: AD Replication Cheat Sheet.
FAQ
What is the tombstone lifetime in 2026?
180 days for forests created on Server 2003 SP1 or later (which is virtually everyone). Replication failures longer than the tombstone lifetime require reinstalling the DC from scratch.
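To see what a given forest actually uses, and whether a DC has been silent for too long, something like the following may help. It is a sketch under assumptions: `Get-TombstoneLifetimeDays` and `Test-PastTombstone` are hypothetical names, the first needs the ActiveDirectory module, and the 60-day fallback reflects the legacy default that applies when the attribute is not set.

```powershell
# Sketch: read the configured tombstone lifetime from the configuration partition.
function Get-TombstoneLifetimeDays {
    $configNc = (Get-ADRootDSE).configurationNamingContext
    $ds = Get-ADObject -Identity "CN=Directory Service,CN=Windows NT,CN=Services,$configNc" `
                       -Properties tombstoneLifetime
    # An unset attribute means the legacy 60-day default applies
    if ($ds.tombstoneLifetime) { [int]$ds.tombstoneLifetime } else { 60 }
}

# Sketch: has the last successful replication aged past the tombstone lifetime?
function Test-PastTombstone {
    param(
        [Parameter(Mandatory)][datetime]$LastSuccess,
        [int]$TombstoneDays = 180,
        [datetime]$Now = (Get-Date)
    )
    ($Now - $LastSuccess).TotalDays -gt $TombstoneDays
}

# Usage:
#   Test-PastTombstone -LastSuccess '2025-09-01' -TombstoneDays (Get-TombstoneLifetimeDays)
```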
How can I monitor replication automatically?
Schedule repadmin /replsum via a daily PowerShell script that emails or pages on any non-zero failure count. Or use the AD module and alert when this returns true: @(Get-ADReplicationFailure -Target contoso.com -Scope Domain | Where-Object FailureCount -gt 0).Count -gt 0.
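The body of such a scheduled script could look roughly like this. Treat it as a sketch: `Get-ReplAlert` is a hypothetical helper, contoso.com is a placeholder target, and how you deliver the alert (mail, pager, ticket) is left to whatever your environment already uses.

```powershell
# Sketch: turn replication failure records into one alert line each.
# Accepts the objects returned by Get-ADReplicationFailure.
function Get-ReplAlert {
    param(
        [AllowNull()]
        [object[]]$Failure
    )
    # Ignore empty input and records whose failure count is zero
    $bad = @($Failure | Where-Object { $_ -and $_.FailureCount -gt 0 })
    foreach ($f in $bad) {
        '{0} <- {1}: {2} failure(s), last error {3}' -f $f.Server, $f.Partner, $f.FailureCount, $f.LastError
    }
}

# Scheduled-task body (contoso.com is a placeholder):
#   $alerts = @(Get-ReplAlert -Failure (Get-ADReplicationFailure -Target contoso.com -Scope Domain))
#   if ($alerts.Count -gt 0) { $alerts | Write-Warning; exit 1 }
```

Exiting non-zero lets the task scheduler or monitoring agent treat the run itself as the alarm signal.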
Is it safe to run dcdiag during business hours?
Yes, these are read-only operations. The bigger concern is the load it generates on remote DCs when run with /e on a large forest.
What does "DSA object GUID" actually identify?
Each DC has a DSA (Directory System Agent) object inside the configuration partition. The GUID identifies that specific DSA across the forest. It is what other DCs use to address replication partners.
repadmin says replication is fine but users complain group memberships are stale β what now?
Replication may be working at the partition level but the specific change has not propagated yet. Sync-ADObject on the affected user/group forces immediate replication to a target DC. Check Get-ADReplicationUpToDatenessVectorTable for the per-DC USN status.
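One way to read the up-to-dateness vector is to compare, for each originating DC, the USN that every server has caught up to. The sketch below assumes the Server, Partner, Partition, and UsnFilter properties that the cmdlet returns; `Get-UsnLag` is a hypothetical helper, and a small lag is normal between replication cycles.

```powershell
# Sketch: for each partition and originating DC, report how far the
# most-behind server lags the most-current one.
function Get-UsnLag {
    param(
        [Parameter(Mandatory)]
        [object[]]$Vector   # rows from Get-ADReplicationUpToDatenessVectorTable
    )
    $Vector | Group-Object Partition, Partner | ForEach-Object {
        $range  = $_.Group | Measure-Object UsnFilter -Minimum -Maximum
        $behind = $_.Group | Sort-Object UsnFilter | Select-Object -First 1
        [pscustomobject]@{
            Partition  = $behind.Partition
            Partner    = $behind.Partner     # DC whose changes are being tracked
            Lag        = [long]($range.Maximum - $range.Minimum)
            MostBehind = $behind.Server      # server with the lowest USN for that partner
        }
    } | Where-Object Lag -gt 0
}

# Usage (dc01 and dc02 are placeholders):
#   Get-UsnLag -Vector (Get-ADReplicationUpToDatenessVectorTable -Target dc01, dc02) |
#       Sort-Object Lag -Descending
```

A lag that keeps growing for one server points at that server's inbound replication, even when the summary view still looks clean.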
Can I delete a failed DC without bringing it back online?
Yes: run ntdsutil metadata cleanup from a healthy DC, follow the prompts to select the dead DC, and confirm. Always verify that all FSMO roles have been moved off the dead DC first (seize them if it cannot be brought back).
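The FSMO check can be scripted before you start the cleanup. A minimal sketch, assuming the role-holder properties exposed by Get-ADDomain and Get-ADForest; `Get-RoleStillOnDc` is a hypothetical helper that compares host names only.

```powershell
# Sketch: list FSMO roles still held by a given DC.
function Get-RoleStillOnDc {
    param(
        [Parameter(Mandatory)][hashtable]$RoleHolder,   # role name -> holder (FQDN or short name)
        [Parameter(Mandatory)][string]$DeadDc
    )
    $deadHost = ($DeadDc -split '\.')[0]
    $RoleHolder.GetEnumerator() |
        Where-Object { ($_.Value -split '\.')[0] -eq $deadHost } |
        ForEach-Object { $_.Key } |
        Sort-Object
}

# Building the map on a machine with the ActiveDirectory module:
#   $d = Get-ADDomain; $f = Get-ADForest
#   $roles = @{
#       PDCEmulator          = $d.PDCEmulator
#       RIDMaster            = $d.RIDMaster
#       InfrastructureMaster = $d.InfrastructureMaster
#       SchemaMaster         = $f.SchemaMaster
#       DomainNamingMaster   = $f.DomainNamingMaster
#   }
#   Get-RoleStillOnDc -RoleHolder $roles -DeadDc dc03
```

An empty result means no role points at the dead DC and the metadata cleanup can proceed.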
Why is my new site not replicating at all?
Either no site link exists between sites, or the new site has no DC assigned, or the site link cost/schedule blocks transmission. Check Get-ADReplicationSiteLink and the Sites snap-in.
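The "site has no DC" case is the quickest to rule out in a script. The sketch below assumes the Name property of Get-ADReplicationSite and the Site property of Get-ADDomainController; `Get-SiteWithoutDc` is a hypothetical helper.

```powershell
# Sketch: return the sites that have no domain controller assigned.
function Get-SiteWithoutDc {
    param(
        [Parameter(Mandatory)][string[]]$SiteName,          # every site in the forest
        [AllowEmptyCollection()][string[]]$DcSite = @()     # site of each DC (duplicates are fine)
    )
    $SiteName | Where-Object { $DcSite -notcontains $_ }
}

# Usage on a machine with the ActiveDirectory module:
#   $sites  = (Get-ADReplicationSite -Filter *).Name
#   $dcSite = (Get-ADDomainController -Filter *).Site
#   Get-SiteWithoutDc -SiteName $sites -DcSite $dcSite
#
# Then confirm the new site is a member of at least one site link:
#   Get-ADReplicationSiteLink -Filter * | Select-Object Name, SitesIncluded, Cost, ReplicationFrequencyInMinutes
```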