Monitor Systemd Services on Linux: Detect Failed Units with Python (2026)

Dargslan Team | April 12, 2026

systemd manages services on most modern Linux distributions, from SSH and Nginx to Docker and custom application daemons. When a critical service fails, the impact ranges from degraded performance to complete outages. Knowing the status of your services at all times is fundamental to reliable server operations.

This guide covers systemd service monitoring using dargslan-service-monitor, a free Python tool that detects failed units, checks critical services, and gives you a complete overview of your server's service health.

Why Service Monitoring Matters

Services fail silently more often than you think. A unit without a Restart= policy (the systemd default is Restart=no) stays down after a crash. A dependency change after an update can prevent a service from starting at boot. A memory leak can trigger OOM kills that nobody notices until users complain.

Installing dargslan-service-monitor

pip install dargslan-service-monitor

# Or install the complete toolkit
pip install dargslan-toolkit

CLI Usage

# Full service health report
dargslan-svc report

# List failed services
dargslan-svc failed

# List running services
dargslan-svc running

# Check specific service
dargslan-svc status nginx

# Check critical services
dargslan-svc check sshd nginx mysql docker

# List boot-enabled services
dargslan-svc enabled

# JSON output
dargslan-svc json

Python API

from dargslan_service_monitor import ServiceMonitor

sm = ServiceMonitor()

# Full report
sm.print_report()

# Detect failed services (most important check)
failed = sm.get_failed()
if failed:
    print(f"ALERT: {len(failed)} failed services!")
    for f in failed:
        print(f"  FAILED: {f['name']} - {f['description']}")

# Check critical services are running
critical = sm.check_critical_services(["sshd", "nginx", "postgresql"])
for svc in critical:
    status = "OK" if svc['running'] else "DOWN"
    print(f"  [{status}] {svc['name']}")

# Service count statistics
counts = sm.service_count()
print(f"Total: {counts['total']}, Active: {counts.get('active', 0)}, Failed: {counts.get('failed', 0)}")
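
If you are curious what a failed-unit check involves underneath, the following is a minimal sketch that calls systemctl directly and parses its plain-text output. It is not how dargslan-service-monitor is implemented (the source of that tool is not shown here); the function names parse_failed_units and list_failed_units are hypothetical and exist only for this illustration.

```python
import subprocess


def parse_failed_units(output: str) -> list[dict]:
    """Parse `systemctl list-units --plain --no-legend` output into dicts."""
    units = []
    for line in output.splitlines():
        # Columns: UNIT LOAD ACTIVE SUB DESCRIPTION (description may contain spaces).
        # Without --plain, failed units carry a leading bullet, so strip it defensively.
        parts = line.lstrip("●* ").split(None, 4)
        if len(parts) < 4:
            continue  # blank or unexpected line
        units.append({
            "name": parts[0],
            "load": parts[1],
            "active": parts[2],
            "sub": parts[3],
            "description": parts[4] if len(parts) > 4 else "",
        })
    return units


def list_failed_units() -> list[dict]:
    """Ask systemd for failed service units (requires systemctl on PATH)."""
    result = subprocess.run(
        ["systemctl", "list-units", "--type=service", "--state=failed",
         "--no-legend", "--plain", "--no-pager"],
        capture_output=True, text=True, check=False,
    )
    return parse_failed_units(result.stdout)
```

Calling list_failed_units() on a systemd host returns one dictionary per failed service. Keeping the parser separate from the subprocess call makes it possible to test the parsing on captured output without a running systemd.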

Essential systemctl Commands

# Service lifecycle
systemctl start nginx
systemctl stop nginx
systemctl restart nginx
systemctl reload nginx       # Graceful config reload
systemctl status nginx       # Detailed status

# Enable/disable at boot
systemctl enable nginx
systemctl disable nginx
systemctl is-enabled nginx

# List services by state
systemctl list-units --type=service --state=failed
systemctl list-units --type=service --state=running
systemctl list-unit-files --type=service --state=enabled

# Reload systemd after unit file changes
systemctl daemon-reload

# Mask service (prevent starting entirely)
systemctl mask service-name
systemctl unmask service-name
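
The same commands can be scripted. The sketch below combines systemctl is-active and systemctl is-enabled into one summary per unit. The names run_systemctl and unit_summary are hypothetical, and the needs_attention flag is only a heuristic: an enabled oneshot unit can legitimately be inactive after it finishes.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_systemctl(args: Sequence[str]) -> str:
    """Run systemctl and return stripped stdout.

    is-active and is-enabled exit non-zero for anything other than
    active/enabled, so a non-zero exit code is not treated as an error.
    """
    result = subprocess.run(
        ["systemctl", *args], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


def unit_summary(unit: str, runner: Runner = run_systemctl) -> dict:
    """Combine the runtime state and the boot-time state of one unit."""
    active = runner(["is-active", unit]) or "unknown"
    enabled = runner(["is-enabled", unit]) or "unknown"
    return {
        "unit": unit,
        "active": active,
        "enabled": enabled,
        # Heuristic only: enabled at boot but not currently active.
        "needs_attention": enabled == "enabled" and active != "active",
    }
```

For example, unit_summary("nginx") returns a dictionary with the unit's active and enabled states. The runner parameter is there so the function can be exercised with a stub instead of a real systemctl call.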

Understanding Service States

active (running): Service has a running main process
active (exited): Service ran successfully and exited, and systemd still counts it as active (typically Type=oneshot with RemainAfterExit=yes)
inactive (dead): Service is not running
failed: Service failed to start, exited with an error, crashed, or timed out
activating: Service is starting up
deactivating: Service is shutting down
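
These labels are the ActiveState and SubState properties that systemctl show reports for a unit. As a rough sketch, the output of systemctl show nginx --property=ActiveState,SubState can be parsed and reduced to a simple verdict. The function names and the verdict strings ("ok", "transitional", "failed", "down", "unknown") are assumptions made for this example, not systemd terminology.

```python
def parse_show_output(output: str) -> dict[str, str]:
    """Parse KEY=VALUE lines as printed by `systemctl show`."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '='
            props[key] = value
    return props


def classify(props: dict[str, str]) -> tuple[str, str]:
    """Return a display label such as 'active (running)' and a coarse verdict."""
    active = props.get("ActiveState", "unknown")
    sub = props.get("SubState", "unknown")
    label = f"{active} ({sub})"
    if active == "active":
        verdict = "ok"
    elif active in ("activating", "deactivating", "reloading"):
        verdict = "transitional"
    elif active == "failed":
        verdict = "failed"
    elif active == "inactive":
        verdict = "down"
    else:
        verdict = "unknown"
    return label, verdict
```

A monitoring script would usually alert on "failed" immediately and on "transitional" only if the state persists across several checks.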

Creating Custom Service Files

# /etc/systemd/system/myapp.service
[Unit]
Description=My Application Server
After=network.target postgresql.service
Wants=postgresql.service

[Service]
Type=simple
User=appuser
Group=appgroup
WorkingDirectory=/opt/myapp
ExecStart=/opt/myapp/bin/server --config /etc/myapp/config.yml
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=5
StandardOutput=journal
StandardError=journal
SyslogIdentifier=myapp

# Resource limits
MemoryMax=512M
CPUQuota=50%

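# Security hardening
# ProtectSystem=strict mounts the file system read-only for this service;
# add ReadWritePaths= for any directory the application must write to.
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
PrivateTmp=yes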

[Install]
WantedBy=multi-user.target
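
A unit file like this can be checked before deployment. The sketch below uses Python's configparser to flag a missing restart policy and missing hardening directives. It rests on several assumptions: the names load_unit, lint_unit, and RECOMMENDED are hypothetical, the list of recommended directives simply mirrors the example above, and configparser only approximates systemd's syntax (it does not handle backslash line continuations, and repeated keys overwrite each other instead of accumulating).

```python
import configparser

# Hardening directives taken from the example unit above (an assumption, not a standard).
RECOMMENDED = ("NoNewPrivileges", "ProtectSystem", "ProtectHome", "PrivateTmp")


def load_unit(text: str) -> configparser.ConfigParser:
    """Parse unit file text with settings that suit systemd's format."""
    parser = configparser.ConfigParser(
        interpolation=None,   # values such as CPUQuota=50% contain a literal '%'
        strict=False,         # systemd allows repeated keys
        delimiters=("=",),
    )
    parser.optionxform = str  # keep directive names case-sensitive
    parser.read_string(text)
    return parser


def lint_unit(text: str) -> list[str]:
    """Return human-readable warnings for a service unit."""
    parser = load_unit(text)
    if not parser.has_section("Service"):
        return ["no [Service] section"]
    service = parser["Service"]
    warnings = []
    if service.get("Restart", "no") == "no":
        warnings.append("Restart= is unset or 'no': a crashed service stays down")
    for key in RECOMMENDED:
        if key not in service:
            warnings.append(f"{key}= is not set")
    return warnings
```

Running lint_unit on the contents of /etc/systemd/system/myapp.service from above would return an empty list. For an authoritative syntax check on a real host, systemd-analyze verify is the better tool; this sketch only covers policy-style checks.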

Accessing Service Logs

# Follow service logs in real-time
journalctl -u nginx -f

# Logs since last boot
journalctl -u nginx -b

# Logs from today
journalctl -u nginx --since today

# Errors only
journalctl -u nginx -p err

# Last 100 lines
journalctl -u nginx -n 100

# Time range
journalctl -u nginx --since "2026-04-01 00:00" --until "2026-04-02 00:00"

# Disk usage of journal
journalctl --disk-usage

# Cleanup: remove the oldest archived journal files until usage drops below 100M
journalctl --vacuum-size=100M
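
For scripted log checks, journalctl can emit one JSON object per line with -o json. The following sketch fetches recent error-level entries for a unit and converts them into plain dictionaries. The function names parse_journal_json and recent_errors are hypothetical; the field names MESSAGE, PRIORITY, and __REALTIME_TIMESTAMP are standard journal fields.

```python
import json
import subprocess
from datetime import datetime, timezone


def parse_journal_json(lines: str) -> list[dict]:
    """Convert `journalctl -o json` output (one JSON object per line) to dicts."""
    entries = []
    for line in lines.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        message = record.get("MESSAGE") or ""
        if isinstance(message, list):
            # Non-UTF-8 payloads are emitted as a list of byte values.
            message = bytes(message).decode("utf-8", errors="replace")
        # Timestamps are microseconds since the epoch, encoded as strings.
        micros = int(record.get("__REALTIME_TIMESTAMP", 0))
        entries.append({
            "time": datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc),
            "priority": int(record.get("PRIORITY", 6)),
            "message": str(message),
        })
    return entries


def recent_errors(unit: str, count: int = 100) -> list[dict]:
    """Return up to `count` recent entries at priority err or worse."""
    result = subprocess.run(
        ["journalctl", "-u", unit, "-p", "err", "-n", str(count),
         "-o", "json", "--no-pager"],
        capture_output=True, text=True, check=False,
    )
    return parse_journal_json(result.stdout)
```

recent_errors("nginx") mirrors the journalctl -u nginx -p err and -n 100 commands above, and returns structured entries that a script can count or forward.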

Automated Service Monitoring

#!/usr/bin/env python3
# /opt/scripts/service-check.py
import sys

from dargslan_service_monitor import ServiceMonitor

# Unit names vary by distribution (for example, SSH is ssh on Debian/Ubuntu)
CRITICAL_SERVICES = ["sshd", "nginx", "postgresql", "docker"]

sm = ServiceMonitor()
problems = False

# Check for any failed services
failed = sm.get_failed()
if failed:
    problems = True
    print(f"CRITICAL: {len(failed)} services failed!")
    for f in failed:
        print(f"  FAILED: {f['name']}")

# Check critical services
for svc in sm.check_critical_services(CRITICAL_SERVICES):
    if not svc['running']:
        problems = True
        print(f"CRITICAL: {svc['name']} is DOWN!")

# Exit non-zero so cron, a systemd timer, or a wrapper script can react
sys.exit(1 if problems else 0)

Service monitoring is non-negotiable for production servers. dargslan-service-monitor gives you quick visibility into systemd service health (failed units, running services, and critical service checks) without installing heavy monitoring agents.

Install now with pip install dargslan-service-monitor, or get all 15 tools with pip install dargslan-toolkit.

Download our free Systemd Service Monitor Cheat Sheet for quick reference.
