Microsoft Defender for Endpoint (MDE) ships with a query interface that most analysts never open. The Advanced Hunting blade is a Kusto Query Language (KQL) console over 30 days of process, network, file, registry and identity telemetry from every onboarded device. The data is genuinely good - parent process, command line (decoded), file hash, signer, network destination - and the query language is fast enough that "search every process exec across the fleet for the last 30 days" returns in seconds. The reason adoption is low is that the query examples in the official docs are short and abstract; a working hunter wants a recipe book. This guide is that recipe book - twelve KQL queries that catch real adversary behaviour on Windows endpoints, the structure for a saved-query library that survives turnover, and a free PDF cheat sheet of the exact KQL.
Table of Contents
- Why KQL hunting is the highest-leverage use of MDE
- The tables you actually need
- The query shape that scales
- Living-off-the-land binaries
- Suspicious parent-child relationships
- Lateral movement signals
- Persistence
- Joining with Entra ID and identity tables
- Building a saved-query library
- Common pitfalls
- Audit checklist
- FAQ
Why KQL hunting is the highest-leverage use of MDE
MDE detection rules are good but conservative - Microsoft tunes them for low false-positive rates across millions of customers. Real attacks against your environment have specifics: the threat actor known to target your sector, the tools your team has historically struggled to harden, the behaviours your apps legitimately exhibit (so you can exclude them from a hunt). KQL hunting closes that gap. A SOC analyst who can write fifteen lines of KQL can find behaviour the built-in detections miss, validate or dismiss alerts in seconds, and build saved queries that turn into the next generation of custom detections. The license already pays for the data; the only cost is the queries.
The tables you actually need
MDE exposes around 30 tables; in practice you spend 80% of your time in five:
DeviceProcessEvents- every process exec across every device. Parent, child, command line, hash, signer, account.DeviceFileEvents- file create/modify/delete with the responsible process.DeviceNetworkEvents- outbound connections with process, destination IP, port, URL.DeviceRegistryEvents- registry writes with key, value, data, responsible process.DeviceLogonEvents- logon and logoff with type (interactive, network, RDP), source, account.
Three more come up regularly: DeviceImageLoadEvents for DLL injection hunting, IdentityLogonEvents for the Defender for Identity overlay, and EmailEvents if Defender for O365 is licensed. Learn the schema of the five core tables and you can express most hunts.
The query shape that scales
A maintainable hunting query has a predictable shape: filter to time and tenant scope, project the columns you care about, summarise or join to add context, sort, limit. KQL pipes (|) chain operators left-to-right; the pattern that works:
DeviceProcessEvents
| where Timestamp > ago(24h)
| where DeviceName matches regex @"^prod-" // narrow scope first
| where ProcessVersionInfoCompanyName != "Microsoft Corporation"
| project Timestamp, DeviceName, AccountName, ProcessCommandLine, FolderPath, FileName, InitiatingProcessFileName
| sort by Timestamp desc
| take 100
Always filter on Timestamp first - it is the partition key, every other filter is cheap once the time window is narrowed. Always project early - hunting queries are fast when they carry few columns, slow when they carry the schema's full breadth.
Living-off-the-land binaries
The single most useful hunt: signed Microsoft binaries doing things that signed Microsoft binaries should not do. The LOLBin list (lolbas-project.github.io) gives you the population.
// certutil downloading from the internet
DeviceProcessEvents
| where Timestamp > ago(7d)
| where FileName =~ "certutil.exe"
| where ProcessCommandLine has_any ("urlcache", "verifyctl", "split")
| where ProcessCommandLine has_any ("http://", "https://")
| project Timestamp, DeviceName, AccountName, ProcessCommandLine, InitiatingProcessFileName
| sort by Timestamp desc
// rundll32 with suspicious args
DeviceProcessEvents
| where Timestamp > ago(7d)
| where FileName =~ "rundll32.exe"
| where ProcessCommandLine has_any ("javascript:", "vbscript:", ",DllRegisterServer", "shell32.dll,Control_RunDLL")
| where ProcessCommandLine !has_any ("\\Program Files\\", "\\Program Files (x86)\\")
| project Timestamp, DeviceName, AccountName, ProcessCommandLine, InitiatingProcessFileName, FolderPath
// mshta launched outside expected paths
DeviceProcessEvents
| where Timestamp > ago(7d)
| where FileName =~ "mshta.exe"
| where InitiatingProcessFileName !in~ ("explorer.exe", "winlogon.exe")
| project Timestamp, DeviceName, AccountName, ProcessCommandLine, InitiatingProcessFileName, FolderPath
Suspicious parent-child relationships
Many initial-access patterns leave the same parent-child fingerprint - an Office app spawning a script interpreter, a browser spawning an installer, a service host spawning a shell.
// Office apps spawning script interpreters
DeviceProcessEvents
| where Timestamp > ago(7d)
| where InitiatingProcessFileName in~ ("winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe")
| where FileName in~ ("powershell.exe", "wscript.exe", "cscript.exe", "cmd.exe", "rundll32.exe", "regsvr32.exe", "mshta.exe")
| project Timestamp, DeviceName, AccountName, InitiatingProcessFileName, FileName, ProcessCommandLine
| sort by Timestamp desc
// Browser spawning installers / scripts
DeviceProcessEvents
| where Timestamp > ago(7d)
| where InitiatingProcessFileName in~ ("chrome.exe", "msedge.exe", "firefox.exe")
| where FileName in~ ("powershell.exe", "wscript.exe", "cscript.exe", "msiexec.exe", "installer.exe", "setup.exe")
| project Timestamp, DeviceName, AccountName, InitiatingProcessFileName, FileName, ProcessCommandLine, FolderPath
Lateral movement signals
Three queries cover the bulk of post-compromise lateral movement: WMI/PSExec service creation, scheduled task creation on a remote host, and unusual outbound SMB.
// Service created by remote PSExec-style activity
DeviceProcessEvents
| where Timestamp > ago(7d)
| where FileName =~ "services.exe"
| where InitiatingProcessFileName =~ "services.exe"
| where ProcessCommandLine has_any ("PSEXESVC", "PAExec", "remcom")
| project Timestamp, DeviceName, AccountName, ProcessCommandLine, InitiatingProcessFileName
// Scheduled task created remotely (schtasks /create /s)
DeviceProcessEvents
| where Timestamp > ago(7d)
| where FileName =~ "schtasks.exe"
| where ProcessCommandLine has_any ("/create", "/s ", "/u ")
| project Timestamp, DeviceName, AccountName, ProcessCommandLine, InitiatingProcessFileName, FolderPath
| sort by Timestamp desc
// Outbound SMB to internal hosts from non-File-Explorer
DeviceNetworkEvents
| where Timestamp > ago(24h)
| where RemotePort == 445
| where InitiatingProcessFileName !in~ ("explorer.exe", "system", "lsass.exe", "svchost.exe")
| where RemoteIP matches regex @"^10\.|^192\.168\.|^172\.(1[6-9]|2[0-9]|3[0-1])\."
| summarize Connections=count(), Hosts=dcount(RemoteIP) by DeviceName, InitiatingProcessFileName, AccountName
| where Hosts > 5
| sort by Hosts desc
Persistence
// Run/RunOnce key writes
DeviceRegistryEvents
| where Timestamp > ago(7d)
| where ActionType == "RegistryValueSet"
| where RegistryKey has_any (
@"\Software\Microsoft\Windows\CurrentVersion\Run",
@"\Software\Microsoft\Windows\CurrentVersion\RunOnce",
@"\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders")
| where InitiatingProcessFileName !in~ ("setup.exe", "msiexec.exe", "explorer.exe", "trustedinstaller.exe")
| project Timestamp, DeviceName, AccountName, RegistryKey, RegistryValueName, RegistryValueData, InitiatingProcessFileName
// New service install via sc.exe
DeviceProcessEvents
| where Timestamp > ago(7d)
| where FileName =~ "sc.exe"
| where ProcessCommandLine has "create"
| where ProcessCommandLine has_any ("binPath", "binpath")
| project Timestamp, DeviceName, AccountName, ProcessCommandLine, InitiatingProcessFileName
Joining with Entra ID and identity tables
The real power shows up when you join endpoint telemetry with identity telemetry:
// Process exec on the device of an account that just signed in from an unusual country
let suspicious = AADSignInEventsBeta
| where Timestamp > ago(2h)
| where Country !in ("HU", "AT", "DE")
| where ResultType == 0
| project AccountUpn, SignInTime=Timestamp, Country, IPAddress;
DeviceProcessEvents
| where Timestamp > ago(2h)
| join kind=inner suspicious on $left.AccountName == $right.AccountUpn
| project Timestamp, DeviceName, AccountName, ProcessCommandLine, SignInTime, Country, IPAddress
This pattern - identity event narrows the scope, endpoint event provides the action - is how you turn a "who logged in from where" question into "who did what after logging in from where".
Building a saved-query library
Saved queries are MDE's way of preserving institutional knowledge. The structure that survives turnover:
- One folder per ATT&CK tactic. Initial Access, Execution, Persistence, Privilege Escalation, Lateral Movement, Exfiltration. Easy to mentally map.
- One query per technique. "T1059 Command and Scripting Interpreter - PowerShell with download cradle" rather than "powershell hunt".
- Standard header on every query. Author, ATT&CK ID, last reviewed date, expected false positives, related custom detection.
- Promote validated queries to custom detections. A query that has fired correctly five times and never wrong gets converted to a custom detection rule that opens an alert; the query stays in the library too.
Common pitfalls
- Querying without a time bound. The default 30-day retention is large; an unbounded query returns slowly and burns quota.
- Using
containswhenhaswould do.hasuses the column index and is much faster thancontainson large tables. - Joining on
DeviceIdwhenDeviceNameis meant. Both exist; mismatched joins return empty. - Forgetting the case-insensitive operators.
=~for equality,has_anyfor substring lists - case mismatches will silently miss matches. - Saved queries with no review date. A two-year-old query whose author left and whose context drifted is a false-positive factory. Add a review date, run it through.
Audit checklist
- SOC has a saved-query library organised by ATT&CK tactic (1 pt)
- Every saved query carries author + last reviewed date (1 pt)
- At least the five LOLBin / parent-child queries above run weekly (1 pt)
- Identity-table joins used for at least two routine hunts (1 pt)
- Validated queries promoted to custom detection rules (1 pt)
5/5 = PASS, 3-4 = WARN, <3 = FAIL.
FAQ
Do I need M365 E5?
Advanced hunting requires Defender for Endpoint Plan 2 (included in M365 E5 / Defender for Endpoint P2 standalone). Plan 1 does not include the data lake.
How long is data retained?
30 days by default. You can export to a Sentinel workspace for longer retention; the same KQL works there.
Can I run KQL from PowerShell?
Yes - via the Microsoft Graph API security/runHuntingQuery endpoint. Useful for scheduled checks and Power Automate flows.
What is the rate limit?
15 calls per minute / 1500 calls per day per tenant for the API; UI hunting has its own quotas. Plan accordingly when scripting.
Do these queries work on Linux MDE?
Most do - the schema is unified and Linux endpoints onboard into the same tables. Some Windows-specific queries (registry, scheduled tasks) obviously do not.