"The server feels slow." Three words that lead to hours of guesswork unless you know exactly which Windows performance counters to look at and how to capture them. Task Manager shows a moment in time. Resource Monitor is better but still ad-hoc. The proper toolkit is Get-Counter (PowerShell), PerfMon Data Collector Sets (long-running captures), and a focused list of about 20 counters that matter for 95% of real-world performance problems.
This guide is the practical reference: which counters to look at for CPU, memory, disk, and network bottlenecks; how to capture them on a schedule; how to chart and interpret the results. Free PDF cheat sheet at the bottom.
The three tools
- Get-Counter: PowerShell, scriptable, real-time. Use for ad-hoc checks and short-window scripted captures.
- PerfMon (perfmon.exe): GUI for live charts and Data Collector Sets (DCS). Use for longer captures (hours or days) into a binary log file.
- Resource Monitor (resmon.exe): live drill-down into per-process disk/network/handle activity. Best for "what is using all my disk right now?" questions.
CPU bottlenecks
Counters that matter:
| Counter | What it tells you |
|---|---|
| \Processor(_Total)\% Processor Time | Overall CPU usage. Sustained >80% = bottleneck. |
| \System\Processor Queue Length | Threads waiting for CPU. Sustained >2 per logical core = saturated. |
| \System\Context Switches/sec | >30,000/sec on a 4-core box = heavy thread thrashing. |
| \Process(*)\% Processor Time | Per-process usage; identifies the culprit. |
# Quick CPU snapshot
Get-Counter '\Processor(_Total)\% Processor Time' -SampleInterval 1 -MaxSamples 10
# Find the top CPU process right now
# (CookedValue is percent of one core; it can exceed 100 on multi-core boxes)
Get-Counter '\Process(*)\% Processor Time' |
    Select-Object -ExpandProperty CounterSamples |
    Where-Object { $_.InstanceName -notmatch '^(_total|idle|system)$' } |
    Sort-Object CookedValue -Descending |
    Select-Object -First 5 InstanceName, CookedValue
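The >2-per-core queue threshold is easier to eyeball if you normalize the raw counter by core count. A minimal sketch using only standard counters:

```powershell
# Processor queue length per logical core; sustained >2 per core = saturated
$cores = [Environment]::ProcessorCount
$queue = (Get-Counter '\System\Processor Queue Length').CounterSamples[0].CookedValue
[pscustomobject]@{
    QueueLength  = $queue
    LogicalCores = $cores
    PerCore      = [math]::Round($queue / $cores, 2)
}
```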
Memory pressure
| Counter | Threshold |
|---|---|
| \Memory\Available MBytes | Should stay above 10% of total RAM |
| \Memory\Pages/sec | Sustained >1,000 = paging (disk thrash) |
| \Memory\Committed Bytes vs \Memory\Commit Limit | Approaching the limit = out-of-memory conditions coming |
| \Process(*)\Working Set - Private | Per-process true RAM usage |
Get-Counter @(
'\Memory\Available MBytes',
'\Memory\Pages/sec',
'\Memory\Committed Bytes',
'\Memory\Commit Limit'
) -SampleInterval 5 -MaxSamples 12
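To turn the two commit counters into a single "how close to out-of-memory" number, a sketch (the 90% warning level is an illustrative choice, not an official threshold):

```powershell
# Percentage of the system commit limit currently in use
$samples   = (Get-Counter '\Memory\Committed Bytes', '\Memory\Commit Limit').CounterSamples
$committed = ($samples | Where-Object Path -like '*committed bytes').CookedValue
$limit     = ($samples | Where-Object Path -like '*commit limit').CookedValue
$pct       = 100 * $committed / $limit
'{0:N1}% of commit limit in use' -f $pct
if ($pct -gt 90) { Write-Warning 'Commit charge approaching limit - OOM risk' }
```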
Disk I/O
| Counter | What it tells you |
|---|---|
| \PhysicalDisk(_Total)\% Disk Time | Aggregate busy time; misleading on multi-disk boxes |
| \PhysicalDisk(N C:)\Avg. Disk sec/Read | Per-I/O read latency; HDD >20 ms or SSD >5 ms = slow |
| \PhysicalDisk(N C:)\Avg. Disk Queue Length | Sustained >2 per spindle = saturated |
| \PhysicalDisk(*)\Disk Bytes/sec | Throughput in bytes per second |
Get-Counter @(
'\PhysicalDisk(*)\Avg. Disk sec/Read',
'\PhysicalDisk(*)\Avg. Disk sec/Write',
'\PhysicalDisk(*)\Disk Bytes/sec',
'\PhysicalDisk(*)\Avg. Disk Queue Length'
) -SampleInterval 1 -MaxSamples 30
Convert Avg. Disk sec/Read to milliseconds by multiplying by 1,000. The 20 ms threshold is for spinning disks; modern NVMe SSDs should stay under 1 ms.
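The same conversion can be done inline so the output reads in milliseconds directly:

```powershell
# Per-disk read latency in milliseconds (CookedValue is reported in seconds)
Get-Counter '\PhysicalDisk(*)\Avg. Disk sec/Read' |
    Select-Object -ExpandProperty CounterSamples |
    Select-Object InstanceName,
        @{n='Read_ms';e={[math]::Round($_.CookedValue * 1000, 2)}}
```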
Network
| Counter | What it tells you |
|---|---|
| \Network Interface(*)\Bytes Total/sec | Throughput per NIC |
| \Network Interface(*)\Output Queue Length | Sustained >2 = NIC saturation |
| \TCPv4\Connections Established | Active TCP sessions |
| \Network Interface(*)\Packets Outbound Errors | Should always be 0 |
Get-Counter @(
'\Network Interface(*)\Bytes Total/sec',
'\Network Interface(*)\Output Queue Length',
'\TCPv4\Connections Established'
) -SampleInterval 1 -MaxSamples 10
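Raw Bytes Total/sec is hard to judge without knowing the link speed. Pairing it with the Current Bandwidth counter (which reports bits per second) gives a rough utilization percentage, as a sketch:

```powershell
# NIC utilization: Bytes Total/sec as a percentage of link bandwidth
$bytes = (Get-Counter '\Network Interface(*)\Bytes Total/sec').CounterSamples
$bw    = (Get-Counter '\Network Interface(*)\Current Bandwidth').CounterSamples
foreach ($b in $bytes) {
    $cap = ($bw | Where-Object InstanceName -eq $b.InstanceName).CookedValue
    if ($cap -gt 0) {
        # Bytes/sec * 8 = bits/sec; divide by link capacity in bits/sec
        '{0}: {1:N1}% utilized' -f $b.InstanceName, (100 * $b.CookedValue * 8 / $cap)
    }
}
```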
Per-process counters
# Top 10 by working set (true RAM)
Get-Process | Sort-Object WS -Descending | Select-Object -First 10 Name, Id, @{n='WS_MB';e={[int]($_.WS/1MB)}}, CPU
# Top 10 by handle count (handle leak detection)
Get-Process | Sort-Object HandleCount -Descending | Select-Object -First 10 Name, Id, HandleCount
# Process I/O counters (Get-Process does not expose I/O bytes; use Win32_Process)
Get-CimInstance Win32_Process |
    Sort-Object ReadTransferCount -Descending |
    Select-Object -First 5 Name, ProcessId,
        @{n='Read_MB';e={[int]($_.ReadTransferCount/1MB)}},
        @{n='Write_MB';e={[int]($_.WriteTransferCount/1MB)}}
Long-running captures (Data Collector Set)
For "the server slows down every night at 2 AM" investigations, you need a long-running capture into a binary log (.blg):
# Create + start a one-hour capture, 1 sample/sec
$cs = "PerfTriage-$(Get-Date -Format yyyyMMdd-HHmm)"
logman create counter $cs `
-c "\Processor(_Total)\% Processor Time" `
"\Memory\Available MBytes" `
"\PhysicalDisk(*)\Avg. Disk sec/Read" `
"\PhysicalDisk(*)\Avg. Disk Queue Length" `
"\Network Interface(*)\Bytes Total/sec" `
-si 1 -f bin -o "C:\PerfLogs\$cs.blg" -max 500
logman start $cs
# ... wait one hour ...
logman stop $cs
logman delete $cs
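For the nightly case you do not have to start and stop the collector by hand: logman can schedule a repeating window with -b, -e, and -r (the date and counter list below are illustrative):

```powershell
# Capture 01:30-03:30 every night; -r repeats the begin/end window daily
logman create counter NightlyPerf `
    -c "\Processor(_Total)\% Processor Time" `
       "\Memory\Available MBytes" `
       "\PhysicalDisk(*)\Avg. Disk sec/Read" `
    -si 15 -f bin -o "C:\PerfLogs\NightlyPerf.blg" -max 500 `
    -b 6/1/2026 1:30:00AM -e 6/1/2026 3:30:00AM -r
```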
Analyzing the .blg file
# Convert binary log to CSV for Excel / scripting
relog "C:\PerfLogs\PerfTriage-20260417-1400.blg" -f CSV -o triage.csv
# Or open in PerfMon GUI
perfmon.exe /sys "C:\PerfLogs\PerfTriage-20260417-1400.blg"
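Once relog has produced the CSV, you can summarize it straight from PowerShell instead of Excel. Column headers follow relog's \\machine\counter convention; the wildcard below picks the CPU column and assumes you captured that counter:

```powershell
# Load the relog CSV and summarize one counter column
$data = Import-Csv triage.csv
# Headers look like "\\SERVER\Processor(_Total)\% Processor Time"
$col  = $data[0].PSObject.Properties.Name |
        Where-Object { $_ -like '*% Processor Time*' } |
        Select-Object -First 1
$data | Where-Object { $_.$col -match '^\d' } |
    ForEach-Object { [double]$_.$col } |
    Measure-Object -Average -Maximum
```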
In PerfMon, the magic shortcut is highlight: select a counter and press Ctrl+H to make its line stand out from the rest.
Cheat sheet
Every counter, every threshold on a single PDF: Windows Performance Cheat Sheet.
FAQ
Is Get-Counter slower than perfmon?
Slightly, because PowerShell wraps the same PDH (Performance Data Helper) API that PerfMon uses. For long captures, use logman; for ad-hoc checks, Get-Counter is fine.
Why is my Avg. Disk Queue Length always >1?
On modern SSDs and especially NVMe, queue length is a less useful metric: those drives are designed for high concurrency. Look at Avg. Disk sec/Read latency instead.
What does "% Processor Time = 100%" mean on a hyper-threaded CPU?
It means all logical cores (HT included) are saturated. On most workloads, hyper-threading gives a 20-30% boost, so a "100% busy" 4-core/8-thread CPU is genuinely out of CPU.
Resource Monitor shows process X using 90% disk, but Task Manager doesn't. Which is right?
Resource Monitor counts in/out bytes per process and is more accurate for I/O. Task Manager's "Disk %" is a derived metric that smooths over short bursts.
How long should I run a Data Collector Set?
Long enough to capture the bad behaviour. For an hourly slowdown, capture two hours. For a nightly batch, capture the whole night. -max 500 caps the file at 500 MB; raise it for very long runs.
Can I run a DCS remotely?
Yes: logman create counter ... -s remote-server. Requires administrative rights on the remote machine.
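A fuller sketch of a remote capture (the server name and counter list are illustrative; each logman verb takes the same -s switch):

```powershell
# Create, start, and later stop a collector on REMOTE01 (needs admin there)
logman create counter RemoteTriage -s REMOTE01 `
    -c "\Processor(_Total)\% Processor Time" "\Memory\Available MBytes" `
    -si 5 -f bin -o "C:\PerfLogs\RemoteTriage.blg"
logman start RemoteTriage -s REMOTE01
# ...later...
logman stop RemoteTriage -s REMOTE01
logman delete RemoteTriage -s REMOTE01
```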
What about ETW for deeper traces?
Event Tracing for Windows (ETW) is the layer below performance counters and is what tools like PerfView use. For most "is the server slow" questions, counters are enough; reach for ETW only when you need per-function CPU profiling.