If you're running servers without monitoring, you're flying blind. Prometheus collects metrics, Grafana visualizes them, and Alertmanager notifies you when things go wrong. Here's how to set up all three.
Docker Compose Deployment
version: '3.8'
services:
prometheus:
image: prom/prometheus:latest
ports:
- "9090:9090"
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./alert.rules.yml:/etc/prometheus/alert.rules.yml
- prometheus_data:/prometheus
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.retention.time=30d'
restart: unless-stopped
grafana:
image: grafana/grafana:latest
ports:
- "3000:3000"
environment:
- GF_SECURITY_ADMIN_PASSWORD=your-secure-password
- GF_INSTALL_PLUGINS=grafana-clock-panel
volumes:
- grafana_data:/var/lib/grafana
restart: unless-stopped
node-exporter:
image: prom/node-exporter:latest
ports:
- "9100:9100"
volumes:
- /proc:/host/proc:ro
- /sys:/host/sys:ro
- /:/rootfs:ro
command:
- '--path.procfs=/host/proc'
- '--path.sysfs=/host/sys'
restart: unless-stopped
alertmanager:
image: prom/alertmanager:latest
ports:
- "9093:9093"
volumes:
- ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
restart: unless-stopped
volumes:
prometheus_data:
grafana_data:
Essential PromQL Queries
# CPU usage percentage
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Memory usage percentage
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100
# Disk usage percentage
(1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100
# Network received bytes/sec
rate(node_network_receive_bytes_total{device="eth0"}[5m])
# HTTP request rate
rate(http_requests_total[5m])
# HTTP error rate (5xx)
rate(http_requests_total{status=~"5.."}[5m]) / rate(http_requests_total[5m]) * 100
# 95th percentile response time
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
Alert Rules
# alert.rules.yml
groups:
- name: server_alerts
rules:
- alert: HighCPUUsage
expr: 100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
for: 5m
labels:
severity: warning
annotations:
summary: "High CPU usage detected"
description: "CPU usage is above 80% for 5 minutes"
- alert: HighMemoryUsage
expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
for: 5m
labels:
severity: critical
- alert: DiskSpaceLow
expr: (1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 > 85
for: 10m
labels:
severity: warning
📘 Monitor Like a Pro
Our DevOps monitoring eBooks cover Prometheus, Grafana, ELK Stack, and production alerting strategies.
Browse Monitoring Books →A properly configured monitoring stack pays for itself the first time it catches an issue before your users do. Start with the basics — CPU, memory, disk — then add application-specific metrics as your system grows.