The Linux kernel exposes hundreds of tunables under /proc/sys, and the defaults are conservative compromises designed to boot on a 1998 laptop. Production servers can, and should, depart from those defaults. This guide focuses on the sysctl settings that pay off on every Linux server: TCP stack hardening, network buffer sizing, virtual-memory behavior, and kernel-level exploit mitigation.
How sysctl actually applies settings
Three layers exist: runtime (sysctl -w), persistent (/etc/sysctl.conf and /etc/sysctl.d/*.conf), and boot-time (kernel command line). Always prefer drop-in files over editing the main file:
sudo install -m 644 /dev/null /etc/sysctl.d/99-hardening.conf
sudo $EDITOR /etc/sysctl.d/99-hardening.conf
sudo sysctl --system # reload everything in order
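Files are loaded in lexical order by file name, and a later assignment overrides an earlier one, so a 99- prefix wins over distribution defaults that use lower numbers. Note that sysctl --system from procps reads /etc/sysctl.conf itself last. Confirm a value with sysctl net.ipv4.tcp_syncookies.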
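To see which file wins for a given key, a small helper can walk one directory in load order and print every assignment it finds. This is a minimal sketch: the function name sysctl_setters is hypothetical, and it assumes a single directory, whereas the real loader also merges other locations such as /run/sysctl.d and /usr/lib/sysctl.d.

```shell
# Hypothetical helper: list every file in one directory that sets KEY,
# in the order the shell glob sorts them. The last line printed is the
# assignment that takes effect within that directory.
# Assumption: only a single directory is scanned.
sysctl_setters() (
    key=$1
    dir=${2:-/etc/sysctl.d}
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue
        awk -v k="$key" -v f="$f" '
            /^[ \t]*[#;]/ { next }              # skip comment lines
            {
                eq = index($0, "=")
                if (eq == 0) next               # not an assignment
                name = substr($0, 1, eq - 1)
                gsub(/[ \t]/, "", name)
                sub(/^-/, "", name)             # "-key" means "ignore errors"
                if (name == k) {
                    val = substr($0, eq + 1)
                    sub(/^[ \t]+/, "", val)
                    sub(/[ \t]+$/, "", val)
                    print f ": " val
                }
            }
        ' "$f"
    done
)

# Usage: sysctl_setters net.ipv4.tcp_syncookies /etc/sysctl.d
```

The function body runs in a subshell, so its variables do not leak into the calling shell.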
Network hardening (every server)
# /etc/sysctl.d/99-hardening.conf
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.all.accept_source_route = 0
kernel.kptr_restrict = 2
kernel.dmesg_restrict = 1
kernel.unprivileged_bpf_disabled = 1
net.core.bpf_jit_harden = 2
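Each setting addresses a specific class of attack: SYN floods (tcp_syncookies), spoofed source addresses (rp_filter), source-routed packets, ICMP redirects that re-route your traffic, kernel pointer disclosure via /proc and dmesg, and unprivileged eBPF abuse. log_martians only logs suspicious packets. Strict rp_filter=1 can drop legitimate traffic on hosts with asymmetric routing; use 2 (loose mode) there.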
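After applying a hardening file it is worth confirming that the running kernel actually holds those values. The sketch below compares each assignment in a drop-in file with the live value under /proc/sys. The function name sysctl_verify and its second argument (an alternative root directory, useful for testing) are hypothetical, and it assumes comment lines start in the first column and that no key contains an interface name with dots.

```shell
# Hypothetical helper: compare each "key = value" line in a drop-in file
# with the live value under /proc/sys and report differences.
# Assumption: every "." in a key maps to "/" (no names like eth0.100).
sysctl_verify() (
    file=$1
    root=${2:-/proc/sys}      # overridable so the check can run on test data
    status=0
    while IFS= read -r line; do
        case $line in
            ''|'#'*|';'*) continue ;;   # blank line or comment
            *=*) ;;                      # an assignment, handled below
            *) continue ;;
        esac
        key=$(printf '%s' "${line%%=*}" | tr -d ' \t')
        want=$(printf '%s' "${line#*=}" | tr '\t' ' ' | tr -s ' ' | sed 's/^ //; s/ $//')
        path=$root/$(printf '%s' "$key" | tr . /)
        if [ ! -r "$path" ]; then
            echo "MISSING $key"
            status=1
            continue
        fi
        # /proc separates multi-value entries with tabs; normalize to spaces.
        have=$(tr '\t' ' ' < "$path" | tr -s ' ' | sed 's/^ //; s/ $//')
        if [ "$have" != "$want" ]; then
            echo "MISMATCH $key: want '$want', have '$have'"
            status=1
        fi
    done < "$file"
    exit "$status"            # leaves only the function's subshell
)

# Usage: sysctl_verify /etc/sysctl.d/99-hardening.conf
```

A non-zero return status makes the function easy to call from a monitoring check or a configuration-management post-step.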
TCP performance for high-throughput hosts
The default buffer ceilings are tuned for much slower links. On a 10 Gbit NIC with round-trip times of 100 ms or more to remote regions, the bandwidth-delay product exceeds 100 MB, and the default limits cap a single connection far below line rate:
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_congestion_control = bbr
net.core.default_qdisc = fq
net.ipv4.tcp_mtu_probing = 1
net.ipv4.tcp_fastopen = 3
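BBR + fq is the modern recommended pair (BBR requires Linux 4.9+). For hosts that open many short-lived outgoing connections, such as application servers talking to a database, also consider net.ipv4.tcp_tw_reuse=1 (it affects outgoing connections only) and widen net.ipv4.ip_local_port_range.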
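Buffer ceilings should be derived from the bandwidth-delay product (BDP) of the actual path rather than copied. A minimal sketch of that arithmetic follows; the function name bdp_bytes is hypothetical and the link speed and RTT are illustrative numbers, not measurements.

```shell
# Bandwidth-delay product in bytes:
#   (Mbit/s * 1,000,000 / 8) * (RTT in ms / 1000)
# which simplifies to Mbit/s * RTT_ms * 125.
bdp_bytes() {
    echo $(( $1 * $2 * 125 ))    # $1 = link speed in Mbit/s, $2 = RTT in ms
}

bdp_bytes 10000 100   # 10 Gbit/s link, 100 ms RTT (illustrative)
```

Compare the result with the third field of net.ipv4.tcp_rmem and net.ipv4.tcp_wmem: a single flow cannot keep more data in flight than that ceiling allows, so a path whose BDP is larger will not reach line rate with one connection.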
Virtual memory and dirty pages
Default vm.swappiness=60 is too aggressive for servers with SSDs or for database hosts that should never swap. For a database server with abundant RAM:
vm.swappiness = 10
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.dirty_expire_centisecs = 3000
vm.overcommit_memory = 2
vm.overcommit_ratio = 80
Lower dirty ratios reduce write-stall amplitude under bursty I/O. overcommit_memory=2 makes it far less likely that the OOM killer arrives in the middle of a transaction, because allocations fail early once the commit limit is reached. That limit is swap plus overcommit_ratio percent of RAM, so tune overcommit_ratio based on the RAM + swap headroom you can afford.
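With vm.overcommit_memory=2 the kernel refuses allocations beyond a commit limit computed from RAM, swap, and vm.overcommit_ratio. The sketch below reproduces that calculation so a ratio can be checked before it is deployed. The function name commit_limit_kb is hypothetical, the sizes are illustrative, and huge pages (which the kernel subtracts from RAM first) are ignored.

```shell
# Commit limit in kB under vm.overcommit_memory=2, ignoring huge pages:
#   RAM * overcommit_ratio / 100 + swap
commit_limit_kb() {
    echo $(( $1 * $3 / 100 + $2 ))   # $1 = RAM kB, $2 = swap kB, $3 = ratio
}

# Illustrative host: 64 GiB RAM, 8 GiB swap, overcommit_ratio = 80.
commit_limit_kb 67108864 8388608 80
```

On a live system the CommitLimit and Committed_AS lines in /proc/meminfo show the limit in effect and how much of it is already committed.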
File descriptors and connection limits
A busy reverse proxy or message broker quickly hits the system-wide file-descriptor cap. Raise it before the first complaint, not after:
fs.file-max = 2097152
fs.nr_open = 1048576
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.netfilter.nf_conntrack_max = 1048576
Do not forget the per-process limit (/etc/security/limits.d/) and the systemd unit limit (LimitNOFILE=); sysctl alone is not enough. Note also that net.netfilter.nf_conntrack_max only exists once the nf_conntrack module is loaded.
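To raise the cap before the first complaint, it helps to know how close the system is to it. The kernel reports allocated handles and the maximum in /proc/sys/fs/file-nr; the sketch below turns that into a percentage. The function name fd_usage_percent is hypothetical.

```shell
# Hypothetical helper: percentage of the system-wide file-handle cap in use.
# The argument holds the three fields of /proc/sys/fs/file-nr:
# allocated handles, unused handles, and the maximum (fs.file-max).
fd_usage_percent() {
    set -- $1                     # split the fields into $1 $2 $3
    echo $(( $1 * 100 / $3 ))
}

# On a live system:
#   fd_usage_percent "$(cat /proc/sys/fs/file-nr)"
# The per-process soft limit of the current shell is shown by: ulimit -n
```

Integer division rounds down, so a result of 0 simply means usage is below one percent.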
Validating and reverting
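Take a baseline before any change, then apply settings to the running kernel without a reboot, compare against the baseline, and re-run a known-good benchmark:
sudo sysctl -a > /tmp/sysctl-baseline.txt # before changing anything
sudo sysctl --system # apply every file in order
sudo sysctl -p /etc/sysctl.d/99-hardening.conf # or apply a single file
sudo sysctl -a | diff /tmp/sysctl-baseline.txt - # review what changed
If a value breaks something, deleting the drop-in file and re-running --system is not enough: the running kernel keeps the last value written until the next reboot. Set the value back explicitly with sysctl -w, using the number recorded in the baseline, and then remove the entry from the drop-in file so it does not return at boot.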
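Because a running kernel keeps whatever value was last written to it, a baseline dump can be turned into explicit revert commands. The sketch below compares two dumps in sysctl -a format and prints one sysctl -w command per changed key. The function name sysctl_revert_cmds and the current-dump path in the usage line are hypothetical, and the output is meant to be reviewed before it is run, not piped straight into a shell.

```shell
# Hypothetical helper: print "sysctl -w" commands that would restore every
# key whose value differs between a baseline dump and a current dump.
# Both files must be in `sysctl -a` format ("key = value").
sysctl_revert_cmds() {
    awk '
        {
            sep = index($0, " = ")
            if (sep == 0) next                  # not a "key = value" line
            key = substr($0, 1, sep - 1)
            val = substr($0, sep + 3)
        }
        NR == FNR { base[key] = val; next }     # first file: remember baseline
        (key in base) && base[key] != val {     # second file: report changes
            printf "sysctl -w \047%s=%s\047\n", key, base[key]
        }
    ' "$1" "$2"
}

# Usage:
#   sudo sysctl -a > /tmp/sysctl-current.txt
#   sysctl_revert_cmds /tmp/sysctl-baseline.txt /tmp/sysctl-current.txt
```

Some keys change on their own between dumps (counters and statistics, for example), so expect a few lines that have nothing to do with your drop-in file.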
Compliance presets
If you operate under CIS, PCI-DSS, or STIG, treat the published benchmark as the floor, not the ceiling. The CIS Linux Benchmark documents every recommended sysctl with rationale. Use oscap with the SCAP Security Guide to score and remediate automatically.
Common pitfalls
- Setting vm.swappiness=0 to disable swapping entirely: modern kernels treat 0 as "OOM kill before swap"; use 1 instead if you really mean it.
- Cargo-culting a TCP buffer size from a blog without measuring BDP for your actual link.
- Forgetting that containers share the host kernel but not every setting: namespaced sysctls (mostly net.*) need an explicit --sysctl at container start.
- Leaving stale entries in both /etc/sysctl.conf and a drop-in file: whichever file is loaded last wins, and the other entry misleads the next admin.
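Stale duplicates are easy to find mechanically. The sketch below prints every key that is assigned in more than one of the files passed to it, along with those files. The function name sysctl_duplicate_keys is hypothetical, and it only looks at the files you name.

```shell
# Hypothetical helper: print each key that is assigned in more than one of
# the given files, followed by the files that set it.
sysctl_duplicate_keys() {
    awk '
        /^[ \t]*[#;]/ { next }                  # skip comment lines
        {
            eq = index($0, "=")
            if (eq == 0) next                   # not an assignment
            key = substr($0, 1, eq - 1)
            gsub(/[ \t]/, "", key)
            sub(/^-/, "", key)                  # "-key" means "ignore errors"
            if (!((key, FILENAME) in seen)) {   # count each file once per key
                seen[key, FILENAME] = 1
                count[key]++
                files[key] = files[key] " " FILENAME
            }
        }
        END {
            for (key in count)
                if (count[key] > 1) print key ":" files[key]
        }
    ' "$@"
}

# Usage: sysctl_duplicate_keys /etc/sysctl.conf /etc/sysctl.d/*.conf
```

On distributions where a file in /etc/sysctl.d is a symlink to /etc/sysctl.conf, every key in it will be reported as a duplicate of itself, which is harmless.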
A 30-line sysctl drop-in file, deployed via configuration management, hardens every Linux box you operate. Audit, baseline, document, and treat kernel parameters with the same rigor as application config: it is the cheapest security and performance improvement you will ever make.