Linux Cgroup Monitoring with Python: Track CPU, Memory, and I/O Limits for Containers (Free CLI Tool)

Linux control groups (cgroups) are the backbone of container resource management. Whether you are running Docker containers, Kubernetes pods, or systemd services, cgroups enforce CPU, memory, and I/O limits. But monitoring these limits and detecting when resources are exhausted requires dedicated tooling.

In this guide, we will install and use dargslan-cgroup-monitor, a free, zero-dependency Python CLI tool that monitors cgroup resource usage across both cgroups v1 and v2. Install it with a single command and get instant visibility into your container and service resource consumption.

What Are Linux Cgroups?

Control groups (cgroups) are a Linux kernel feature that organizes processes into hierarchical groups and applies resource limits. They are used by Docker, Podman, LXC, and systemd to isolate workloads. Cgroups v1 uses separate hierarchies per controller (cpu, memory, blkio), while cgroups v2 unifies everything under a single hierarchy at /sys/fs/cgroup.
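The difference between the two layouts can be checked from Python. The following is a minimal sketch, not the tool's own detection logic: it assumes the hierarchy is mounted at /sys/fs/cgroup and relies on the fact that the cgroup.controllers file exists only at the root of a v2 hierarchy. The function name detect_cgroup_version is hypothetical.

```python
from pathlib import Path


def detect_cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 for a unified (v2) hierarchy, 1 for legacy (v1), None otherwise."""
    base = Path(root)
    # cgroup.controllers is present only at the root of a cgroup v2 mount.
    if (base / "cgroup.controllers").is_file():
        return 2
    # v1 mounts one directory per controller, for example memory/ and cpu/.
    if (base / "memory").is_dir() or (base / "cpu").is_dir():
        return 1
    # Nothing recognizable: not Linux, or cgroups are not mounted here.
    return None
```

Calling detect_cgroup_version() with no arguments inspects the live system. On hosts running in hybrid mode (v1 controllers plus a separate v2 mount), this sketch reports 1, because the root directory still holds the per-controller layout.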

Quick Start: Install dargslan-cgroup-monitor

pip install dargslan-cgroup-monitor

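After installation, the dargslan-cgroup command is available on your PATH (system-wide if you installed with the system pip, or inside the active virtual environment otherwise):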

dargslan-cgroup report        # Full resource report
dargslan-cgroup list          # List all active cgroups
dargslan-cgroup slices        # System slices and services
dargslan-cgroup containers    # Container cgroups only
dargslan-cgroup issues        # Resource limit issues
dargslan-cgroup json          # JSON output for scripting

Understanding Cgroup Resource Limits

Each cgroup can have several resource controllers applied:

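  • CPU: Limits CPU time via cpu.max (v2) or cpu.cfs_quota_us together with cpu.cfs_period_us (v1). In cpu.max the two numbers are the quota and the period in microseconds, so "100000 100000" means 100% of one core and "max 100000" means no limit.
  • Memory: Hard limit via memory.max (v2) or memory.limit_in_bytes (v1). When usage reaches the limit and the kernel cannot reclaim enough memory, the OOM killer terminates a process in the cgroup.
  • I/O: Bandwidth and IOPS limits via io.max (v2) or blkio.throttle.* (v1). Controls read/write rates per block device.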
  • PIDs: Maximum process count via pids.max. Prevents fork bombs within a cgroup.
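To make the CPU quota arithmetic concrete, here is a small sketch that converts the raw values into a number of cores. The helper names parse_cpu_max and parse_cfs_quota are hypothetical and are not part of dargslan-cgroup-monitor.

```python
def parse_cpu_max(text):
    """Convert the contents of a v2 cpu.max file into a core count.

    The file holds "<quota> <period>" in microseconds; a quota of "max"
    means the cgroup is not limited, which is reported here as None.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def parse_cfs_quota(quota_us, period_us):
    """Same calculation for v1, where a negative quota means no limit."""
    if quota_us < 0:
        return None
    return quota_us / period_us


print(parse_cpu_max("100000 100000"))  # 1.0
print(parse_cpu_max("50000 100000"))   # 0.5
print(parse_cpu_max("max 100000"))     # None
print(parse_cfs_quota(200000, 100000)) # 2.0
```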

Monitoring Cgroups v2 (Modern Systems)

On systems running cgroups v2 (the default since Fedora 31, Ubuntu 21.10, and RHEL 9), the unified hierarchy lives at /sys/fs/cgroup. Our tool reads key files from each cgroup directory:

# Memory usage and limits
/sys/fs/cgroup/system.slice/docker.service/memory.current
/sys/fs/cgroup/system.slice/docker.service/memory.max

# CPU statistics
/sys/fs/cgroup/system.slice/docker.service/cpu.stat

# Process list
/sys/fs/cgroup/system.slice/docker.service/cgroup.procs
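Reading these files needs nothing beyond the standard library. The sketch below shows one way to turn memory.current and memory.max into a usage percentage. It illustrates the file format only and is not the tool's implementation; read_v2_memory is a hypothetical name.

```python
from pathlib import Path


def read_v2_memory(cgroup_dir):
    """Return (current_bytes, limit_bytes, percent) for one v2 cgroup directory.

    limit_bytes and percent are None when memory.max contains "max",
    which is how v2 reports that no limit is set.
    """
    base = Path(cgroup_dir)
    current = int((base / "memory.current").read_text())
    raw_limit = (base / "memory.max").read_text().strip()
    if raw_limit == "max":
        return current, None, None
    limit = int(raw_limit)
    # Guard against a zero limit to avoid dividing by zero.
    percent = round(100 * current / limit, 1) if limit else None
    return current, limit, percent
```

For example, read_v2_memory("/sys/fs/cgroup/system.slice/docker.service") would return the figures for the Docker service. The root cgroup itself has no memory.current file, so the function is meant for child cgroups.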

Monitoring Cgroups v1 (Legacy Systems)

Legacy systems use separate controller hierarchies. The tool walks each controller directory:

/sys/fs/cgroup/memory/docker/container-id/memory.usage_in_bytes
/sys/fs/cgroup/memory/docker/container-id/memory.limit_in_bytes
/sys/fs/cgroup/cpu/docker/container-id/cpu.cfs_quota_us
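The v1 memory files can be read the same way, with one difference: v1 has no "max" keyword. As a sketch under that assumption, the code below treats any very large limit as unlimited, because the kernel typically reports 9223372036854771712 when no limit is configured. The threshold and the name read_v1_memory are illustrative choices, not something the tool documents.

```python
from pathlib import Path

# Assumption: an unset v1 limit shows up as a huge page-aligned number
# (commonly 9223372036854771712), so anything at or above 2**60 bytes
# is treated as "no limit".
V1_UNLIMITED_THRESHOLD = 1 << 60


def read_v1_memory(cgroup_dir):
    """Return (usage_bytes, limit_bytes, percent) for a v1 memory cgroup."""
    base = Path(cgroup_dir)
    usage = int((base / "memory.usage_in_bytes").read_text())
    limit = int((base / "memory.limit_in_bytes").read_text())
    if limit >= V1_UNLIMITED_THRESHOLD:
        return usage, None, None
    # Guard against a zero limit to avoid dividing by zero.
    percent = round(100 * usage / limit, 1) if limit else None
    return usage, limit, percent
```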

Using the Python API

from dargslan_cgroup_monitor import CgroupMonitor

cm = CgroupMonitor()
print(f"Cgroup version: {cm.version}")

# List all cgroups with resource usage
for cg in cm.list_cgroups():
    print(f"{cg['path']}: {cg.get('memory_human', 'N/A')} "
          f"({cg.get('memory_percent', 'N/A')}%)")

# Container-specific cgroups
containers = cm.get_container_cgroups()
for c in containers:
    print(f"Container: {c['path']} using {c.get('memory_human')}")

# Audit for issues
issues = cm.audit()
for issue in issues:
    print(f"[{issue['severity']}] {issue['message']}")

Detecting Container Memory Exhaustion

One of the most critical monitoring tasks is detecting when containers approach their memory limits. When a container reaches its cgroup memory limit and the kernel cannot reclaim enough memory, the OOM killer terminates processes inside the container. The audit() method flags the following conditions:

  • Critical: Memory usage above 90% of limit
  • Warning: Memory usage above 75% of limit
  • Info: Container running without a memory limit set
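The thresholds above map onto a simple classification function. This is a sketch of the rule as described in the list, not the source of audit(); the function name, the "ok" return value, and the keyword parameters are assumptions made for illustration.

```python
def classify_memory(current_bytes, limit_bytes, warn_pct=75.0, crit_pct=90.0):
    """Map memory usage to a severity label using the thresholds above.

    limit_bytes is None when the cgroup has no memory limit.
    """
    if limit_bytes is None:
        return "info"  # running without a memory limit
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive or None")
    percent = 100 * current_bytes / limit_bytes
    if percent > crit_pct:
        return "critical"
    if percent > warn_pct:
        return "warning"
    return "ok"


print(classify_memory(950, 1000))   # critical
print(classify_memory(800, 1000))   # warning
print(classify_memory(100, 1000))   # ok
print(classify_memory(100, None))   # info
```

Keeping the thresholds as parameters makes it easy to alert earlier or later than the defaults without changing the logic.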

Automating Cgroup Monitoring with Cron

# Run every 5 minutes, save to log
*/5 * * * * dargslan-cgroup issues >> /var/log/cgroup-issues.log 2>&1

# JSON output for monitoring stack integration
*/5 * * * * dargslan-cgroup json > /tmp/cgroup-status.json

Integration with Monitoring Stacks

The JSON output mode makes it easy to feed cgroup data into monitoring systems such as Prometheus (via the node_exporter textfile collector) or the ELK stack, and from there into Grafana dashboards. Write a simple wrapper that runs dargslan-cgroup json and pushes metrics to your preferred backend.
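As an illustration of such a wrapper, the sketch below converts a list of cgroup records into the Prometheus text format and writes the result atomically, so the textfile collector never reads a half-written file. Several things here are assumptions: the exact JSON schema is not documented in this article, so the records are assumed to be dictionaries with the "path" and "memory_percent" keys seen in the Python API example, and the metric name cgroup_memory_usage_percent is made up for this example.

```python
import os
import tempfile


def to_prometheus(records):
    """Render records as Prometheus text exposition format.

    Assumption: each record is a dict with "path" and, when a limit is
    set, a numeric "memory_percent". Records without a number are skipped.
    """
    lines = ["# TYPE cgroup_memory_usage_percent gauge"]
    for rec in records:
        pct = rec.get("memory_percent")
        if not isinstance(pct, (int, float)):
            continue
        # Escape backslash, double quote, and newline for a valid label value.
        label = (str(rec["path"]).replace("\\", "\\\\")
                 .replace('"', '\\"').replace("\n", "\\n"))
        lines.append(f'cgroup_memory_usage_percent{{path="{label}"}} {float(pct)}')
    return "\n".join(lines) + "\n"


def write_atomically(text, dest):
    """Write to a temp file in the same directory, then rename over dest."""
    directory = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        fh.write(text)
    # mkstemp creates the file as 0600; relax it so the collector can read it.
    os.chmod(tmp, 0o644)
    os.replace(tmp, dest)
```

A cron job could load the output of dargslan-cgroup json with json.loads, pass the resulting list to to_prometheus, and write the text to a .prom file in the directory configured for the textfile collector.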

Systemd Slice Monitoring

Systemd organizes services into slices (user.slice, system.slice, machine.slice). Each slice is a cgroup that can have resource limits. Use dargslan-cgroup slices to see which system services are consuming the most resources and whether any slices are approaching their limits.
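Because each slice is just a directory in the cgroup tree, a rough equivalent of the slice view can be sketched directly against the filesystem. The example assumes a cgroups v2 host with the memory controller enabled for top-level slices; slice_memory is a hypothetical helper, not the command's implementation.

```python
from pathlib import Path


def slice_memory(root="/sys/fs/cgroup"):
    """Return [(slice_name, bytes_used)] for top-level slices, largest first."""
    base = Path(root)
    results = []
    if not base.is_dir():
        return results
    for entry in base.glob("*.slice"):
        current = entry / "memory.current"
        # memory.current is missing when the memory controller is not
        # enabled for this level of the hierarchy, so skip such slices.
        if current.is_file():
            results.append((entry.name, int(current.read_text())))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

On a typical systemd host, slice_memory() would list entries such as system.slice and user.slice ordered by current memory use.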

Best Practices for Cgroup Resource Management

  1. Always set memory limits on production containers to prevent runaway processes from consuming all host memory
  2. Monitor CPU throttling: containers hitting their CPU quota will have increased latency
  3. Use the audit feature regularly to catch containers running without limits
  4. Set up alerting when any cgroup exceeds 80% of its memory limit
  5. Review cgroup hierarchy after system updates, as kernel upgrades may change cgroup behavior
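For the CPU throttling recommendation, the kernel exposes the needed counters in each cgroup's cpu.stat file as nr_periods and nr_throttled. A minimal sketch of computing the throttled fraction follows; the helper names are hypothetical, and what counts as an alarming ratio is left to you.

```python
def parse_cpu_stat(text):
    """Parse the "key value" lines of a cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if value.strip():
            stats[key] = int(value)
    return stats


def throttle_ratio(stats):
    """Fraction of enforcement periods in which the cgroup was throttled.

    Returns None when no periods were recorded, for example when the
    cgroup has no CPU quota configured.
    """
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return None
    return stats.get("nr_throttled", 0) / periods


sample = "usage_usec 1000\nnr_periods 200\nnr_throttled 50\nthrottled_usec 12345\n"
print(throttle_ratio(parse_cpu_stat(sample)))  # 0.25
```

The counters are cumulative since the cgroup was created, so for alerting it is usually more useful to compare two readings taken some minutes apart than to look at the lifetime ratio.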

Conclusion

Monitoring cgroup resource usage is essential for maintaining healthy containerized environments. The dargslan-cgroup-monitor tool gives you instant visibility into CPU, memory, and I/O limits across all cgroups on your system. Install it today and start catching resource exhaustion before it causes outages.

For more Linux system administration tools, visit dargslan.com and explore our collection of eBooks and free cheat sheets.

Dargslan Editorial Team (Dargslan)