For two decades the runtime-security pattern on Linux was the same: ship audit logs, syslog, and an endpoint agent's events to a central place, then write detection rules that fire minutes or hours after the fact. eBPF changed the contract. The kernel can now run small, verified programs at syscall, network and tracepoint hooks, decide in microseconds whether the activity is suspicious, and either emit an event or block the action outright. Two open-source projects dominate the space: Falco, the CNCF-graduated detection engine, and Tetragon, the Cilium-led observability and enforcement project. They are not interchangeable, and the choice has real operational consequences. This guide compares them honestly, shows the deployment patterns that survive contact with production, and ships a free PDF cheat sheet of the rules and commands you will reuse weekly.
Why eBPF changed runtime security
The traditional Linux security stack - auditd, kernel modules, LSM hooks - is either too coarse, too fragile, or too slow to evolve. auditd writes everything to disk and forces a parser to make sense of it later. Out-of-tree kernel modules need a maintainer per kernel version and break on every distribution upgrade. eBPF threads the needle: programs are loaded, verified for safety, and attached to hooks at runtime; they execute in kernel context with bounded time and memory; and they can both observe and (with the right hook type) modify or deny the action that triggered them. The result is a detection layer that runs at syscall speed, scales across thousands of hosts, and ships rule updates without a kernel reboot or a daemon restart.
For a security team, the practical promise is three things. First, visibility into what processes actually do - the exec, the open, the connect, the privilege change - rather than what they are configured to do. Second, kernel-time enforcement on the hooks that support it - kill a process the moment it tries to read /etc/shadow from a webserver workload, before the read returns. Third, portability across Linux distributions via CO-RE (Compile Once, Run Everywhere), which removes the per-kernel-version build matrix that made earlier eBPF tooling painful to operate.
Falco - what it is good at
Falco is a detection engine. Its job is to ingest kernel events through an eBPF or kernel-module driver, evaluate them against a rule library, and emit alerts. The rules are written in YAML, the engine ships hundreds of community rules out of the box (CIS-aligned, MITRE-tagged, Kubernetes-aware), and the output plugs into every common SIEM via stdout, gRPC, syslog or HTTP.
What Falco does very well: signal generation. The default ruleset catches the textbook attack patterns - shells spawned in containers, sensitive file reads, unexpected outbound connections - with a low false-positive rate after a week of baseline tuning. The macro/list system lets you express "any process that is not on the allowlist" or "any port that is not in the documented set" without restating the same condition fifty times. The JSON output is flat and predictable, so it lands in Elasticsearch, Splunk or Loki with only a thin field mapping.
What Falco does not do: enforcement. It alerts, it does not block. The official guidance is to pair Falco with a response system - falco-talon, a SOAR runbook, or a Kubernetes admission controller - that consumes Falco events and takes action. That separation is intentional and architecturally clean; it is also one more moving piece to operate.
Tetragon - what it is good at
Tetragon, from the Cilium project, is built around tracing policies: declarative descriptions of which kernel events to watch, what context to attach (process tree, capabilities, namespaces, file paths), and what action to take. The actions include the usual log-and-emit, but Tetragon also supports SIGKILL and Override at the kernel hook itself - the syscall returns an error before the action completes. That in-kernel enforcement is the headline differentiator.
What Tetragon does very well: deep process and network observability with minimal user-space overhead. Events ship as structured protobufs over gRPC, the JSON encoding is verbose but easy to query, and Hubble integration gives you a Kubernetes-aware view of every connection and exec across the cluster. The same engine that does observability does enforcement, so you do not need a second tool to take action on a finding.
What Tetragon does not do: ship a deep rule library out of the box. Tetragon's authoring model is closer to "build the policies you need" than "filter the noise from a generic rule pack". For teams that want fewer pieces and tighter Cilium integration that trade-off is fine; for teams that want a lot of detections from day one, Falco's library is a better starting point.
Side-by-side comparison
The two projects overlap in eBPF heritage and diverge in everything operational. The honest comparison:
- Detection model: Falco - YAML rules, large community library. Tetragon - tracing policies, smaller starter set, deeper context.
- Enforcement: Falco - alert only, response via integrations. Tetragon - in-kernel kill / override on supported hooks.
- Output: Falco - stdout/gRPC/syslog/HTTP, ECS-friendly JSON. Tetragon - gRPC + JSON, Hubble UI in Kubernetes.
- Kubernetes posture: Falco - works on any cluster, no CNI dependency. Tetragon - works best alongside Cilium, also runs standalone.
- Operational maturity: Falco - CNCF graduated, large operator community, plenty of public detection content. Tetragon - a sub-project of Cilium (itself CNCF graduated), fast-moving, growing rule and policy library.
- Performance: Both are sub-percent CPU at typical event rates; Tetragon's in-kernel filtering is slightly leaner under heavy load because fewer events cross to user space.
The decision tree is short. If the requirement is "detect and alert across a heterogeneous Linux fleet, hand events to the SIEM, run on any kernel and any cluster" - Falco. If the requirement is "detect, enforce, and live inside a Cilium-based Kubernetes platform" - Tetragon. Many mature platforms run both: Falco for the breadth of detections, Tetragon for the targeted blocking policies.
Production deployment patterns
For Falco the canonical deployment is a DaemonSet (Kubernetes) or a systemd unit (bare metal/VMs) with the modern eBPF driver enabled and the kernel-module fallback disabled. Pull the chart from falcosecurity/falco, set driver.kind=modern_ebpf, mount the host's /proc, /var/run and /etc read-only, and ship events via the HTTP output to a Falcosidekick instance that fans out to the SIEM. Run two Falcosidekick instances behind a Service so that it does not become a single point of event loss.
For Tetragon the pattern is similar - a DaemonSet, the daemon attached to kernel hooks, gRPC output to a collector. The Cilium chart (cilium/tetragon) is the easiest path. Start with the bundled tracing-policy library (file-monitoring, credential-access, network-egress) and add custom policies as the threat model demands. Hubble Relay aggregates events across the cluster and is what your analysts will actually use for ad-hoc queries.
On bare-metal or VM fleets without Kubernetes the picture is similar but ops-heavier: package the agent, ship via your configuration management (Ansible, Puppet, Chef), centralise logs to a SIEM. Both projects have well-tested DEB and RPM packages; both honour the same eBPF capability requirements (CAP_BPF, CAP_PERFMON on modern kernels, CAP_SYS_ADMIN on older ones).
Writing detections that survive a noisy fleet
The hardest part of any runtime-security deployment is not the install - it is the first month of tuning. The default rule set will fire on legitimate ops activity (configuration management, package installs, scheduled jobs) and quickly exhaust analyst attention if not curated.
The discipline that works: ship the agent in silent mode for one week (events go to a tuning channel, not the SIEM). Mine the events for the top noise sources - Ansible, Puppet, the package manager, your monitoring agent. For each, write a precise allowlist exception (by binary path + parent process, not by name only - attackers rename binaries). After a week the noise drops by an order of magnitude and the remaining events are worth analyst time.
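The "mine the events for the top noise sources" step can be sketched in a few lines of Python. This is a minimal illustration, not a supported tool: it assumes events arrive as one JSON object per line in the shape of Falco's JSON output (a top-level rule key and an output_fields object carrying proc.exepath and proc.pname), and the function name top_noise_sources is hypothetical.

```python
import json
from collections import Counter


def top_noise_sources(lines, limit=5):
    """Count events per (rule, executable path, parent process name).

    Assumes each line is a JSON event with a "rule" key and an
    "output_fields" object; missing fields fall back to placeholders.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the tuning channel dump
        event = json.loads(line)
        fields = event.get("output_fields", {})
        key = (
            event.get("rule", "<unknown rule>"),
            fields.get("proc.exepath", "<unknown path>"),
            fields.get("proc.pname", "<unknown parent>"),
        )
        counts[key] += 1
    return counts.most_common(limit)


if __name__ == "__main__":
    # Hypothetical sample events, for illustration only.
    sample = [
        json.dumps({"rule": "Write below etc",
                    "output_fields": {"proc.exepath": "/usr/bin/python3",
                                      "proc.pname": "ansible-playbook"}}),
        json.dumps({"rule": "Write below etc",
                    "output_fields": {"proc.exepath": "/usr/bin/python3",
                                      "proc.pname": "ansible-playbook"}}),
        json.dumps({"rule": "Terminal shell in container",
                    "output_fields": {"proc.exepath": "/bin/bash",
                                      "proc.pname": "runc"}}),
    ]
    for key, count in top_noise_sources(sample):
        print(count, key)
```

Grouping on path plus parent, rather than on process name alone, means the output already has the shape of the exceptions you will write next.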
For Falco the exception primitive is the exceptions field on a rule plus the condition macros. For Tetragon it is the selectors on a tracing policy, typically a matchBinaries selector with the NotIn operator pointed at your allowlist, so that matchActions only applies to everything else. Both are easy to misuse; both have the same rule of thumb: prefer to exclude specific binaries with specific parent processes from a specific rule rather than deleting the rule outright. Deleting rules to silence noise is how detection coverage decays invisibly.
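The rule of thumb is easier to see in code than in prose. The sketch below is plain Python, not Falco's exceptions syntax or a Tetragon selector; it only models the matching logic. The RuleException type, the is_excepted function and the sample values are hypothetical, and the event shape (rule plus output_fields with proc.exepath and proc.pname) is assumed to follow Falco's JSON output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleException:
    """One narrowly scoped exception: a binary, its parent, and one rule."""
    rule: str      # the single rule this exception applies to
    exepath: str   # full path of the binary, not just its name
    parent: str    # name of the expected parent process
    reason: str    # why the exception exists, for later review


def is_excepted(event, exceptions):
    """Return True only if rule, binary path and parent all match."""
    fields = event.get("output_fields", {})
    for exc in exceptions:
        if (
            event.get("rule") == exc.rule
            and fields.get("proc.exepath") == exc.exepath
            and fields.get("proc.pname") == exc.parent
        ):
            return True
    return False


if __name__ == "__main__":
    allow = [
        RuleException(
            rule="Write below etc",
            exepath="/usr/bin/python3",
            parent="ansible-playbook",
            reason="configuration management run",
        )
    ]
    legit = {"rule": "Write below etc",
             "output_fields": {"proc.exepath": "/usr/bin/python3",
                               "proc.pname": "ansible-playbook"}}
    # Same binary name, but dropped in /tmp and started from a shell.
    renamed = {"rule": "Write below etc",
               "output_fields": {"proc.exepath": "/tmp/python3",
                                 "proc.pname": "bash"}}
    print(is_excepted(legit, allow), is_excepted(renamed, allow))
```

Because the exception is keyed on the rule as well, the same binary still triggers every other rule, which is the point of excluding narrowly instead of deleting.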
Kubernetes specifics
In Kubernetes both projects light up because container metadata enriches every event - pod name, namespace, labels, image hash, node. That context is the difference between "process X opened /etc/shadow" and "the api-gateway pod in production-payments opened /etc/shadow". Make sure the agent has access to the kubelet API or the container runtime socket; without it you lose the enrichment and most of the value.
Two Kubernetes-specific patterns are worth copying. First, policy per workload class rather than per pod - a single policy "no shells in any production pod" written once is more maintainable than per-deployment exceptions. Second, admission controller alignment - if Falco fires on a behaviour, OPA/Gatekeeper or Kyverno should refuse to admit a pod that requires that behaviour. Detection without prevention drifts; prevention without detection is brittle. Pair them.
Performance and overhead
Real-world numbers from production Falco and Tetragon deployments on c5.large and c6i.xlarge instances under typical web/application workloads: sub-1% steady-state CPU, 50-200 MB resident memory per agent, no measurable application latency impact. Heavy event-rate workloads (build farms, batch ETL) push CPU into the 1-3% range; storage-heavy workloads (database hosts) sometimes hit 5% on the open/openat hook. If overhead matters, prefer Tetragon's in-kernel filtering: fewer events cross to user space, less protobuf serialisation, lower CPU.
The most common performance mistake is leaving the noisy default rules on after the tuning window. Rules that fire ten thousand times per second do not get processed any faster by adding CPU - the engine still serialises every event. Curate the rule set first, then measure overhead, then add hardware if needed.
Common pitfalls
- Running the kernel-module driver on a modern kernel. The eBPF driver is faster, more portable and is now the default. Switch off the kernel module unless you are stuck on an EOL kernel.
- Logging at the agent without forwarding. A local /var/log/falco.log is useful for development and useless for response. Ship events off-host immediately.
- Allowlisting by binary name only. Attackers rename binaries; exceptions should match on path plus parent process plus namespace where possible.
- Tetragon enforcement without a deny audit period. Run any new enforcement policy in Audit mode for at least a week before flipping to Enforce. Otherwise the first false-positive kills production traffic.
- Skipping the SIEM-side de-duplication. Both engines can emit the same event class many times for one root cause. Deduplicate by process tree on the SIEM side, not by raising the rule threshold.
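De-duplicating by process tree is worth a concrete illustration. The following is a minimal sketch under stated assumptions: each event is a dict with rule, pid and ppid keys (a simplified, hypothetical shape, not either engine's native schema), the whole batch comes from one host, and PIDs are not reused within the batch. Events for the same rule that share a topmost in-batch ancestor collapse into one group with a count.

```python
def root_pid(pid, parent_of):
    """Walk up to the highest ancestor that also appears in the batch."""
    seen = set()
    while parent_of.get(pid) in parent_of and pid not in seen:
        seen.add(pid)  # guard against accidental cycles in bad data
        pid = parent_of[pid]
    return pid


def dedupe_by_process_tree(events):
    """Group events by (rule, root ancestor); keep the first and a count."""
    parent_of = {e["pid"]: e["ppid"] for e in events}
    groups = {}
    for e in events:
        key = (e["rule"], root_pid(e["pid"], parent_of))
        if key not in groups:
            groups[key] = {"first": e, "count": 0}
        groups[key]["count"] += 1
    return list(groups.values())


if __name__ == "__main__":
    # Hypothetical batch: one shell spawning two children, plus an
    # unrelated process that trips the same rule.
    batch = [
        {"rule": "Sensitive file read", "pid": 100, "ppid": 1},
        {"rule": "Sensitive file read", "pid": 101, "ppid": 100},
        {"rule": "Sensitive file read", "pid": 102, "ppid": 101},
        {"rule": "Sensitive file read", "pid": 500, "ppid": 1},
    ]
    for group in dedupe_by_process_tree(batch):
        print(group["first"]["pid"], group["count"])
```

The analyst sees one alert per root cause with a count attached, and the rule itself keeps firing on every event, so no coverage is lost.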
Audit checklist
- eBPF driver in use (modern_ebpf for Falco, default for Tetragon) (1 pt)
- Events forwarded off-host within 60 seconds at p95 (1 pt)
- Tuning exceptions live in source control with author + reason (1 pt)
- At least the CIS / MITRE-tagged baseline rules enabled (1 pt)
- Tetragon enforcement policies passed an audit period before going live (or N/A) (1 pt)
5/5 = PASS, 3-4 = WARN, <3 = FAIL.
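The scoring rule above is simple enough to automate in a CI job or a compliance script. A minimal sketch follows; the check names are hypothetical labels for the five items, and an item that is N/A (the Tetragon one, on a Falco-only fleet) is assumed to be passed in as True.

```python
def grade(checks):
    """Apply the checklist thresholds: 5/5 PASS, 3-4 WARN, below 3 FAIL."""
    if len(checks) != 5:
        raise ValueError("the checklist has exactly five items")
    score = sum(1 for passed in checks.values() if passed)
    if score == 5:
        verdict = "PASS"
    elif score >= 3:
        verdict = "WARN"
    else:
        verdict = "FAIL"
    return score, verdict


if __name__ == "__main__":
    # Hypothetical result for one host group.
    result = grade({
        "ebpf_driver_in_use": True,
        "forwarded_within_60s_p95": True,
        "exceptions_in_source_control": False,
        "baseline_rules_enabled": True,
        "enforcement_audited_or_na": True,  # N/A counts as a pass
    })
    print(result)
```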
FAQ
Can I run Falco and Tetragon on the same host?
Yes. The kernel lets several eBPF programs attach to the same hook, so the two agents do not conflict with each other. Expect 1-2% additional CPU; the engineering value is having both detection breadth (Falco) and kernel-time enforcement (Tetragon) without picking one.
Do I still need auditd?
For most threat models, no. eBPF-based monitoring covers the same syscalls with richer context, lower overhead and better forwarding. Keep auditd if a compliance regime explicitly requires it; otherwise retire it after a parallel-run period.
What kernel version do I need?
5.4+ is the practical floor; 5.10+ is comfortable; 5.15+ gives the smoothest CO-RE experience. Falco's modern_ebpf driver needs roughly 5.8 or newer, because it depends on BTF and the BPF ring buffer. RHEL 9 / Ubuntu 22.04 / Debian 12 are all fine.
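For a fleet inventory, the version tiers above can be turned into a quick classifier over the output of uname -r. This is a rough sketch: the function name and tier labels are hypothetical, and it looks only at the major.minor prefix of the release string.

```python
import re


def kernel_tier(release):
    """Classify a `uname -r` style string against the tiers in the text."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    if version >= (5, 15):
        return "5.15+"
    if version >= (5, 10):
        return "comfortable"
    if version >= (5, 4):
        return "practical floor"
    return "below floor"


if __name__ == "__main__":
    # Illustrative release strings, not taken from any specific host.
    for release in ("4.18.0-513.el8.x86_64",
                    "5.4.0-150-generic",
                    "5.15.0-91-generic",
                    "6.1.0-18-amd64"):
        print(release, "->", kernel_tier(release))
```

Treat the result as a first-pass filter only. Enterprise distributions backport eBPF features into older version numbers, so a kernel that lands in a lower tier by number may still behave like a newer one.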
How do I integrate with the SIEM?
Falco -> Falcosidekick -> HTTP/gRPC/syslog to your SIEM. Tetragon -> gRPC -> collector -> SIEM. Both produce JSON that maps cleanly to ECS or your in-house schema.
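To make "maps cleanly to ECS" concrete, here is one possible mapping for a Falco event, written as a small Python function. It is an illustration, not an official mapping: the input shape follows Falco's JSON output (time, rule, hostname, output, output_fields), the choice of target fields is an assumption, and your SIEM pipeline may prefer different ones.

```python
def falco_to_ecs(event):
    """Map a Falco JSON event (as a dict) to an ECS-style nested document.

    Fields absent from the event come through as None; a real pipeline
    would likely drop them instead.
    """
    fields = event.get("output_fields", {})
    return {
        "@timestamp": event.get("time"),
        "message": event.get("output"),
        "event": {"kind": "alert", "provider": "falco"},
        "rule": {"name": event.get("rule")},
        "host": {"name": event.get("hostname")},
        "process": {
            "name": fields.get("proc.name"),
            "executable": fields.get("proc.exepath"),
            "parent": {"name": fields.get("proc.pname")},
        },
        "container": {"id": fields.get("container.id")},
    }


if __name__ == "__main__":
    # Hypothetical event, for illustration only.
    doc = falco_to_ecs({
        "time": "2024-01-01T00:00:00.000000000Z",
        "rule": "Terminal shell in container",
        "hostname": "node-1",
        "output": "A shell was spawned in a container",
        "output_fields": {
            "proc.name": "bash",
            "proc.exepath": "/bin/bash",
            "proc.pname": "runc",
            "container.id": "abc123",
        },
    })
    print(doc["rule"]["name"], doc["process"]["parent"]["name"])
```

Keeping the mapping in one small, versioned function makes it easy to extend with Kubernetes fields later without touching the rules themselves.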
Will it block legitimate ops work?
Only if you put enforcement policies into Enforce mode without auditing. The recommended ramp is two weeks of Audit per policy, mining the events for false positives, then promoting to Enforce.