Prometheus + Grafana Monitoring Stack: Complete Setup Guide

Dargslan Team | April 12, 2026 | Updated: April 20, 2026

If you're running servers without monitoring, you're flying blind. Prometheus collects metrics, Grafana visualizes them, and Alertmanager notifies you when things go wrong. Here's how to set up all three.

Docker Compose Deployment

version: '3.8'
services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert.rules.yml:/etc/prometheus/alert.rules.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=30d'
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=your-secure-password
      - GF_INSTALL_PLUGINS=grafana-clock-panel
    volumes:
      - grafana_data:/var/lib/grafana
    restart: unless-stopped

  node-exporter:
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      # Point the filesystem collector at the host root mounted above
      - '--path.rootfs=/rootfs'
    restart: unless-stopped

  alertmanager:
    image: prom/alertmanager:latest
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:
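
The Compose file mounts ./prometheus.yml but the guide does not show its contents. The following is a minimal sketch, assuming the service names from the Compose file above resolve on the default Compose network. The 15s intervals and the job names are illustrative assumptions, not values from the original setup.

```yaml
# prometheus.yml (sketch; intervals and job names are assumptions)
global:
  scrape_interval: 15s      # how often targets are scraped
  evaluation_interval: 15s  # how often alert rules are evaluated

# Load the rules file mounted by the Compose file
rule_files:
  - /etc/prometheus/alert.rules.yml

# Send firing alerts to the alertmanager service
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

scrape_configs:
  # Prometheus scraping its own metrics
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']

  # Host metrics from the node-exporter service
  - job_name: node
    static_configs:
      - targets: ['node-exporter:9100']
```

Without a rule_files entry the mounted alert rules are never loaded, and without the alerting block Prometheus evaluates alerts but has nowhere to send them.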

Essential PromQL Queries

# CPU usage percentage (per host)
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory usage percentage
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100

# Disk usage percentage
(1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100

# Network received bytes/sec
rate(node_network_receive_bytes_total{device="eth0"}[5m])

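# HTTP request rate (assumes your application exports http_requests_total)
rate(http_requests_total[5m])

# HTTP error rate (5xx) as a percentage of all requests
# Both sides are summed so the division is not matched per status label
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100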

# 95th percentile response time (aggregated across all series, keeping the le bucket label)
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
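
To chart these queries, Grafana needs a Prometheus data source. One option is a provisioning file, sketched below. The file name datasources.yml is a hypothetical choice, and the sketch assumes you add a volume to the grafana service that mounts it under /etc/grafana/provisioning/datasources/, which the Compose file above does not do.

```yaml
# datasources.yml (hypothetical file name; mount it under
# /etc/grafana/provisioning/datasources/ in the grafana container)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana's backend makes the requests
    url: http://prometheus:9090   # Compose service name and port from above
    isDefault: true
```

The same data source can also be added by hand in the Grafana UI at port 3000. Provisioning only saves the manual step when the container is recreated.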

Alert Rules

# alert.rules.yml
groups:
- name: server_alerts
  rules:
  - alert: HighCPUUsage
    expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High CPU usage detected"
      description: "CPU usage is above 80% for 5 minutes"

  - alert: HighMemoryUsage
    expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
    for: 5m
    labels:
      severity: critical

  - alert: DiskSpaceLow
    expr: (1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 > 85
    for: 10m
    labels:
      severity: warning
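
The Compose file also mounts ./alertmanager.yml, which decides where these alerts are delivered. Below is a minimal sketch, assuming a generic webhook receiver. The receiver name, the timing values, and the URL are placeholders, not part of the original setup. The nested route uses the severity label set by the rules above.

```yaml
# alertmanager.yml (sketch; receiver name, timings, and URL are assumptions)
route:
  receiver: default
  group_by: ['alertname']
  group_wait: 30s        # wait before sending the first notification for a group
  group_interval: 5m     # wait before notifying about new alerts in a group
  repeat_interval: 4h    # resend interval for alerts that are still firing
  routes:
    # Repeat critical alerts more often than warnings
    - matchers:
        - severity="critical"
      receiver: default
      repeat_interval: 1h

receivers:
  - name: default
    webhook_configs:
      - url: http://example.internal:5001/alerts  # hypothetical endpoint
```

Replace the webhook with whichever receiver type you actually use, such as email or a chat integration. The routing tree stays the same.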

A properly configured monitoring stack pays for itself the first time it catches an issue before your users do. Start with the basics — CPU, memory, disk — then add application-specific metrics as your system grows.
