If you have ever watched a PowerShell script crawl through a list of 200 servers one at a time, you already know there has to be a better way. Since PowerShell 7.0, there is. The -Parallel parameter on ForEach-Object turns a serial loop into a fan-out across multiple runspaces, and on the right workload it cuts runtime from minutes down to seconds.
This guide is the practical, production-focused take on that feature. We will cover what -Parallel actually does under the hood, the $using: scope rule that trips up everyone the first time, throttling, error handling, when to use it (and when not to), and a benchmark that shows real-world numbers. By the end you will be able to drop -Parallel into your scripts with confidence, and you will have a free PDF cheat sheet to keep next to your monitor.
What ForEach-Object -Parallel actually does
When you add -Parallel to ForEach-Object, PowerShell does not give you classic threading. Instead, it spins up multiple PowerShell runspaces (independent execution environments inside the same process) and dispatches each input object to one of them. Each runspace has its own session state, its own variable scope, and its own copy of the modules you import inside the script block.
That is a critical mental model. A runspace is not a thread you can pass mutable state into; it is closer to a small, isolated PowerShell session. Since PowerShell 7.1, runspaces are drawn from a pool and reused between iterations by default (the -UseNewRunspace switch opts out; 7.0 created a fresh runspace for every item), so the creation cost is amortized rather than paid per item.
The practical consequence: -Parallel is excellent for I/O-bound workloads (HTTP requests, remote PowerShell calls, file copies, DNS lookups, AD queries) and weak for CPU-bound work that already pegs a single core. We will see why in the benchmarks section.
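To make the runspace model concrete, here is a minimal sketch (assuming PowerShell 7.0 or later) that records where each iteration ran. The property names are illustrative only; [runspace]::DefaultRunspace is the runspace executing the current script block.

```powershell
# Each iteration reports the process, runspace, and thread it ran on.
$where = 1..6 | ForEach-Object -Parallel {
    [PSCustomObject]@{
        Item       = $_
        ProcessId  = $PID
        RunspaceId = [runspace]::DefaultRunspace.Id
        ThreadId   = [System.Threading.Thread]::CurrentThread.ManagedThreadId
    }
} -ThrottleLimit 3

$where | Sort-Object Item | Format-Table
```

Every row should show the same ProcessId, while RunspaceId and ThreadId typically vary between rows. That is the "many small sessions inside one process" picture described above.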
Basic syntax
The smallest useful example: ping 50 hosts in parallel and collect the round-trip times.
$hosts = 1..50 | ForEach-Object { "192.168.1.$_" }
$results = $hosts | ForEach-Object -Parallel {
    $h = $_
    $ping = Test-Connection -ComputerName $h -Count 1 -Quiet -TimeoutSeconds 1
    [PSCustomObject]@{
        Host = $h
        Up   = $ping
    }
} -ThrottleLimit 20
$results | Where-Object Up -eq $true
Three things to notice. First, the script block looks identical to a normal ForEach-Object: same $_, same emit-objects-to-the-pipeline pattern. Second, -ThrottleLimit 20 caps how many runspaces run simultaneously (default is 5). Third, the script block has its own scope: variables you defined outside it are not available inside unless you opt in.
The $using: scope rule
This is the rule that breaks every first script. Inside a -Parallel script block, outer-scope variables are not visible. You have to reference them with the $using: prefix:
$apiKey = "secret-token"
$baseUrl = "https://api.example.com"
$ids = 1..100
$ids | ForEach-Object -Parallel {
    $id = $_
    # WRONG: $apiKey and $baseUrl are $null here
    # RIGHT: prefix with $using:
    $headers = @{ Authorization = "Bearer $($using:apiKey)" }
    $url = "$($using:baseUrl)/items/$id"
    Invoke-RestMethod -Uri $url -Headers $headers
} -ThrottleLimit 10
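The $using: prefix captures the outer variable's value for use inside each runspace. Treat it as read-only: you cannot assign to a $using: expression, and reassigning a local copy changes neither the outer variable nor what other iterations see. One caveat: for reference types such as collections, every runspace receives a reference to the same object, so calling methods on that object does touch shared state, which is only safe with thread-safe types.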
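The scope rule is easy to verify for yourself. The following is a small sketch, assuming PowerShell 7.0+; the variable and property names ($prefix, Local, VisibleWithoutUsing) are made up for illustration.

```powershell
$prefix = 'srv'

$rows = 1..3 | ForEach-Object -Parallel {
    $local = $using:prefix                       # value captured from the caller
    $visibleWithoutUsing = $null -ne $prefix     # plain $prefix is not defined in this runspace
    $local = "$local-changed"                    # only changes this runspace's local variable
    [PSCustomObject]@{
        Item                = $_
        Local               = $local
        VisibleWithoutUsing = $visibleWithoutUsing
    }
}

$rows | Sort-Object Item | Format-Table
$prefix   # still 'srv'
```

VisibleWithoutUsing comes back False for every row, and the caller's $prefix is untouched after the loop.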
Throttle limit and CPU sizing
The default -ThrottleLimit is 5. That is a safe default if you accidentally point the cmdlet at a million-row dataset, but it leaves a lot of throughput on the table for I/O-heavy workloads. A few rules of thumb:
- HTTP / REST API calls: 10 to 50 is usually safe. Watch the upstream rate limit, not your CPU.
- Remote PowerShell (Invoke-Command): 20 to 100 across many hosts. Each session uses kernel handles on the target, so be polite.
- Local file I/O on SSD: 4 to 8. More than that and the disk queue thrashes.
- CPU-bound work: match your physical core count, not logical. Hyper-threading helps less than people think for PowerShell.
Read the actual core count programmatically and pick a sensible default:
$cores = (Get-CimInstance Win32_Processor | Measure-Object -Property NumberOfCores -Sum).Sum
$throttle = [Math]::Max(4, $cores)
$items | ForEach-Object -Parallel { ... } -ThrottleLimit $throttle
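Get-CimInstance is only available on Windows, so here is one possible cross-platform variant of the same idea. It is a sketch under stated assumptions: Get-SuggestedThrottle is a hypothetical helper, the fixed numbers are simply picked from the ranges in the list above, and halving [Environment]::ProcessorCount (which reports logical processors) is only a rough stand-in for the physical core count.

```powershell
function Get-SuggestedThrottle {
    param(
        [Parameter(Mandatory)]
        [ValidateSet('Http', 'Remoting', 'LocalDisk', 'Cpu')]
        [string]$Workload
    )

    # Logical processor count; works on Windows, Linux, and macOS.
    $logical = [Environment]::ProcessorCount

    switch ($Workload) {
        'Http'      { 20 }   # within the 10 to 50 range above
        'Remoting'  { 30 }   # within the 20 to 100 range above
        'LocalDisk' { 8 }    # upper end of the 4 to 8 range above
        # Assumption: two logical processors per physical core.
        'Cpu'       { [Math]::Max(1, [int][Math]::Floor($logical / 2.0)) }
    }
}

$throttle = Get-SuggestedThrottle -Workload Cpu
"Suggested throttle for CPU-bound work: $throttle"
```

Treat the result as a starting point and adjust it after measuring your own workload.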
Error handling and -ErrorAction
By default, errors inside a parallel block do not stop the loop. Each runspace catches its own terminating errors and surfaces them as non-terminating errors on the outer pipeline. That means your script keeps running and a failure on item 47 will not abort items 48 to 100.
You usually want one of two patterns:
# Pattern A: collect successes and failures separately
$results = $items | ForEach-Object -Parallel {
    $item = $_   # capture the input first: inside catch, $_ is the error record
    try {
        $r = Invoke-RestMethod $item.Url -ErrorAction Stop
        [PSCustomObject]@{ Item = $item; Status = 'ok'; Data = $r }
    }
    catch {
        [PSCustomObject]@{ Item = $item; Status = 'err'; Error = $_.Exception.Message }
    }
} -ThrottleLimit 20
$ok = $results | Where-Object Status -eq 'ok'
$bad = $results | Where-Object Status -eq 'err'
# Pattern B: hard fail the whole job on the first error
$items | ForEach-Object -Parallel {
    Invoke-RestMethod $_.Url
} -ThrottleLimit 20 -ErrorAction Stop
Pattern A is almost always what you want in production: you keep the run going, then sort out the failures at the end.
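If you want to try Pattern A without a real API, the sketch below simulates failures with throw. Everything in it is illustrative: the "every fourth item fails" rule is invented, and the property names just mirror the pattern above.

```powershell
$items = 1..10

$results = $items | ForEach-Object -Parallel {
    $item = $_   # keep the input; $_ becomes the error record inside catch
    try {
        if ($item % 4 -eq 0) { throw "simulated failure for item $item" }
        [PSCustomObject]@{ Item = $item; Status = 'ok'; Error = $null }
    }
    catch {
        [PSCustomObject]@{ Item = $item; Status = 'err'; Error = $_.Exception.Message }
    }
} -ThrottleLimit 5

$ok  = @($results | Where-Object Status -eq 'ok')
$bad = @($results | Where-Object Status -eq 'err')

"{0} succeeded, {1} failed" -f $ok.Count, $bad.Count
$bad | Sort-Object Item | Format-Table Item, Error
```

Items 4 and 8 fail, the other eight succeed, and the loop never stops early.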
Sharing state between runspaces
Plain variables do not work across runspaces. Concurrent collections do. The standard tool is a System.Collections.Concurrent.ConcurrentDictionary or ConcurrentBag:
$bag = [System.Collections.Concurrent.ConcurrentBag[psobject]]::new()
$items | ForEach-Object -Parallel {
    $local = $using:bag
    $r = Get-SomeData $_
    $local.Add($r)
} -ThrottleLimit 16
# After the loop, $bag has every result
$bag | Format-Table
Plain $results = $items | ForEach-Object -Parallel { ... } already collects the pipeline output for you, so you only need a concurrent collection when you want to mutate shared state from inside (counters, deduplication sets, caches).
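As an illustration of the deduplication case, here is a minimal sketch that uses ConcurrentDictionary.TryAdd as a thread-safe "have I seen this already?" check. The server names are invented sample data.

```powershell
# Keys act as the set; the bool value is a placeholder.
$seen = [System.Collections.Concurrent.ConcurrentDictionary[string,bool]]::new()

$inputs = 'web01', 'web02', 'web01', 'db01', 'web02', 'db01', 'web01'

$firstSeen = $inputs | ForEach-Object -Parallel {
    $set = $using:seen
    # TryAdd returns $true only for the first caller that adds a given key.
    if ($set.TryAdd($_, $true)) { $_ }
} -ThrottleLimit 4

$firstSeen | Sort-Object
```

Each distinct name is emitted exactly once, no matter which runspace got to it first.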
Per-iteration timeouts
The -TimeoutSeconds parameter (available since PowerShell 7.0) caps the entire parallel job, not each iteration. If you want a per-iteration timeout, build it into the script block with a job:
$items | ForEach-Object -Parallel {
    $job = Start-Job -ScriptBlock {
        param($x) Invoke-RestMethod "https://slow-api/$x"
    } -ArgumentList $_
    if (Wait-Job $job -Timeout 30) {
        Receive-Job $job
    } else {
        Stop-Job $job
        Write-Warning "Item $_ timed out"
    }
    Remove-Job $job -Force
} -ThrottleLimit 10
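Start-Job launches a separate process per item, which is heavy. A lighter variant of the same pattern can be sketched with Start-ThreadJob, assuming the ThreadJob module is available (it ships with PowerShell 7). The workload here is simulated with Start-Sleep, and the 3-second limit and delay values are arbitrary.

```powershell
$delays = 1, 10   # seconds of simulated work per item (hypothetical workload)

$outcome = $delays | ForEach-Object -Parallel {
    $delay = $_
    $job = Start-ThreadJob -ScriptBlock {
        param($seconds)
        Start-Sleep -Seconds $seconds
        "finished after $seconds s"
    } -ArgumentList $delay

    # Wait-Job returns the job only if it finished within the timeout.
    if (Wait-Job $job -Timeout 3) {
        [PSCustomObject]@{ Delay = $delay; TimedOut = $false; Output = Receive-Job $job }
    } else {
        Stop-Job $job
        [PSCustomObject]@{ Delay = $delay; TimedOut = $true; Output = $null }
    }
    Remove-Job $job -Force
} -ThrottleLimit 2

$outcome | Sort-Object Delay | Format-Table
```

The 1-second item completes normally, while the 10-second item is stopped once its 3-second budget runs out.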
Benchmarks: serial vs parallel
Numbers always beat opinions. Here is the runtime for the same 100 HTTPS GET requests, against a public API with ~200ms latency, on a 6-core laptop:
Measure-Command {
    1..100 | ForEach-Object {
        Invoke-RestMethod "https://httpbin.org/delay/0.2" | Out-Null
    }
}
# 100 requests, serial: ~22 seconds
Measure-Command {
    1..100 | ForEach-Object -Parallel {
        Invoke-RestMethod "https://httpbin.org/delay/0.2" | Out-Null
    } -ThrottleLimit 20
}
# 100 requests, ThrottleLimit 20: ~1.4 seconds
# ~16x speedup
For pure CPU work the picture is different. A function that does a million math operations in a tight loop barely benefits past the physical core count, and the overhead of marshalling input objects across runspaces can even make it slower for very small per-item workloads.
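If you would like to reproduce the shape of the I/O-bound result without depending on a public API, the sketch below stands in for network latency with Start-Sleep. It is a simulation, not the benchmark above: the 200 ms delay and the item count are arbitrary, and your timings will differ.

```powershell
# Ten items, each "waiting on I/O" for 200 ms.
$serial = Measure-Command {
    1..10 | ForEach-Object { Start-Sleep -Milliseconds 200 }
}

$parallel = Measure-Command {
    1..10 | ForEach-Object -Parallel { Start-Sleep -Milliseconds 200 } -ThrottleLimit 10
}

"Serial:   {0:N2} s" -f $serial.TotalSeconds
"Parallel: {0:N2} s" -f $parallel.TotalSeconds
```

The serial loop cannot finish in under two seconds (10 x 200 ms), while the parallel run overlaps the waits and only pays the runspace overhead on top of a single delay.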
When NOT to use -Parallel
- Tiny per-item work. If each iteration takes < 5 ms, runspace overhead dominates and serial is faster.
- Order matters. Output order from -Parallel is not guaranteed. Sort afterwards if needed.
- Mutating outer-scope state without a concurrent collection. You will lose updates and race against yourself.
- Workload that hammers a fragile downstream. Twenty parallel database INSERTs against a tiny SQLite file is a great way to run into lock errors, or worse.
- You are still on Windows PowerShell 5.1. -Parallel requires PowerShell 7.0 or later. On 5.1 you have to fall back to Start-Job, runspace pools, or third-party modules like PoshRSJob.
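For the ordering point in the list above, one simple approach is to carry the input position with each result and sort at the end. This is a sketch; the random sleep only exists to shuffle completion order, and the property names are illustrative.

```powershell
$ordered = 1..8 | ForEach-Object -Parallel {
    # Random delay so iterations finish out of order.
    Start-Sleep -Milliseconds (Get-Random -Minimum 10 -Maximum 100)
    [PSCustomObject]@{ Index = $_; Square = $_ * $_ }
} -ThrottleLimit 4 | Sort-Object Index

$ordered | Format-Table
```

Without the final Sort-Object the rows arrive in whatever order the runspaces happen to finish.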
Common patterns
Fan-out remoting against many servers
$servers = Get-Content .\servers.txt
$servers | ForEach-Object -Parallel {
    $s = $_
    try {
        $os = Invoke-Command -ComputerName $s -ScriptBlock { (Get-CimInstance Win32_OperatingSystem).Caption } -ErrorAction Stop
        # Same property set as the failure case: Export-Csv takes its columns from the first object it receives
        [PSCustomObject]@{ Server = $s; OS = $os; Ok = $true; Err = $null }
    } catch {
        [PSCustomObject]@{ Server = $s; OS = $null; Ok = $false; Err = $_.Exception.Message }
    }
} -ThrottleLimit 30 | Export-Csv .\inventory.csv -NoTypeInformation
Bulk file hashing
Get-ChildItem C:\Data -Recurse -File |
    ForEach-Object -Parallel {
        [PSCustomObject]@{
            Path = $_.FullName
            Hash = (Get-FileHash $_.FullName -Algorithm SHA256).Hash
        }
    } -ThrottleLimit 8 | Export-Csv .\hashes.csv -NoTypeInformation
Concurrent REST API enrichment
$users = Import-Csv .\users.csv
$enriched = $users | ForEach-Object -Parallel {
    $u = $_
    $extra = Invoke-RestMethod "https://api.example.com/users/$($u.Id)"
    $u | Add-Member NoteProperty Department $extra.dept -PassThru
} -ThrottleLimit 25
Free cheat sheet
Grab the PowerShell Parallel Cheat Sheet (PDF): syntax, $using: rules, throttle sizing, error patterns, and the benchmarks above on one printable reference.
FAQ
Does -Parallel work in Windows PowerShell 5.1?
No. The -Parallel parameter is a PowerShell 7.0+ feature. On 5.1 you need Start-Job (heavy), Start-ThreadJob from the ThreadJob module (lighter), or a third-party module like PoshRSJob.
Can I use Write-Host inside a parallel block?
Yes, but the output ordering is non-deterministic. For status output, prefer emitting objects to the pipeline and post-processing.
How do I share a counter across runspaces?
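Use a thread-safe primitive. The most dependable option from script is a ConcurrentDictionary passed in with $using: and updated through its atomic methods, such as AddOrUpdate. [System.Threading.Interlocked]::Increment([ref]$counter) is harder to get right: a plain integer passed with $using: is a per-runspace copy, and [ref] arguments in PowerShell appear to be copied in and out rather than updated in place, so increments from different runspaces can be lost.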
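A minimal sketch of the ConcurrentDictionary approach, assuming PowerShell 7.0+. The key name 'processed' and the item count are arbitrary; AddOrUpdate takes the key, the value to store if the key is missing, and a delegate that computes the new value from the current one.

```powershell
$counters = [System.Collections.Concurrent.ConcurrentDictionary[string,int]]::new()

1..200 | ForEach-Object -Parallel {
    $c = $using:counters
    # Stores 1 on first use, otherwise replaces the current value with current + 1.
    $null = $c.AddOrUpdate(
        'processed',
        1,
        [Func[string,int,int]]{ param($key, $current) $current + 1 }
    )
} -ThrottleLimit 8

"Processed: $($counters['processed'])"
```

The final count should match the number of input items, because the dictionary retries the update whenever two runspaces collide.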
Why are my modules missing inside the script block?
Each runspace starts with a clean session state. Either Import-Module at the top of the script block, or pre-import the module into the runspace pool (advanced; rarely needed).
Is -Parallel the same as parallel pipelines in Bash with xargs -P?
Conceptually similar: both fan out work to a worker pool with a throttle limit. The main differences: PowerShell runspaces are heavier than Unix processes but live inside one .NET process, share GC, and let you marshal real objects between iterations rather than text streams.
Is there a memory cost?
Yes. Each active runspace holds its own session state, loaded modules, and any captured $using: variables. With -ThrottleLimit 50 and a heavy module loaded, peak memory can climb fast. Profile if you push throttle high.
Can I cancel a parallel job mid-flight?
Use -AsJob to get a real job back, then Stop-Job. Without -AsJob, Ctrl+C from the host stops the pipeline but in-flight runspaces may take a moment to wind down.
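A short sketch of the -AsJob route, using Start-Sleep as a stand-in for slow work. The timings are arbitrary, and how many items finish before the stop depends on your machine.

```powershell
# 20 one-second items, two at a time: a full run would take about 10 seconds.
$job = 1..20 | ForEach-Object -Parallel {
    Start-Sleep -Seconds 1
    $_
} -ThrottleLimit 2 -AsJob

Start-Sleep -Seconds 2      # let a few items finish
Stop-Job $job               # cancel the rest

$partial = @(Receive-Job $job)   # whatever completed before the stop
"Job state: $($job.State); items completed: $($partial.Count)"

Remove-Job $job -Force
```

Only the items that finished before Stop-Job come back from Receive-Job; the remaining input is never processed.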