NFS Health Monitoring: Free Python CLI for Stale…

NFS (Network File System) is one of the most widely used network storage protocols in Linux environments. From shared home directories to application data stores, NFS mounts are critical infrastructure. But NFS problems such as stale mounts, high latency, and misconfigured exports can be notoriously difficult to diagnose.

dargslan-nfs-health is a free Python CLI tool that checks NFS mount health, detects stale mounts before they cause application hangs, measures I/O latency, and audits export configurations.

Quick Start

pip install dargslan-nfs-health

dargslan-nfs report           # Full NFS health report
dargslan-nfs mounts           # List all NFS mounts with status
dargslan-nfs exports          # Show NFS exports
dargslan-nfs stats            # NFS client statistics
dargslan-nfs throughput -m /mnt/share  # I/O throughput test
dargslan-nfs issues           # Detected issues

The Stale Mount Problem

A stale NFS mount occurs when the NFS server becomes unreachable but the mount point remains. On a hard mount, any process that touches the mount point blocks until the server responds again; on a soft mount, the call fails with an error once the client has exhausted its retries. The tool uses os.statvfs() with timeout detection to identify stale mounts without hanging itself.
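To make the idea concrete, here is a minimal sketch of a statvfs probe with a timeout. It is not the tool's actual implementation: the function name probe_mount, the 3 second default, and the returned keys are assumptions chosen for illustration. The call runs in a daemon thread so that the caller can give up waiting even if the system call never returns.

```python
import errno
import os
import threading
import time


def probe_mount(path, timeout_s=3.0):
    """Call os.statvfs(path) in a worker thread and stop waiting after timeout_s.

    Hypothetical helper for illustration; not part of dargslan-nfs-health.
    """
    outcome = {}

    def worker():
        start = time.monotonic()
        try:
            os.statvfs(path)
            outcome["error"] = None
        except OSError as exc:
            outcome["error"] = exc
        outcome["latency_ms"] = (time.monotonic() - start) * 1000.0

    # A daemon thread does not keep the interpreter alive while it is blocked.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    if thread.is_alive():
        # statvfs has not returned in time: treat the mount as stale.
        return {"path": path, "accessible": False, "stale": True, "latency_ms": None}

    error = outcome["error"]
    return {
        "path": path,
        "accessible": error is None,
        # ESTALE is the "stale file handle" error reported by the kernel.
        "stale": error is not None and error.errno == errno.ESTALE,
        "latency_ms": round(outcome["latency_ms"], 2),
    }
```

One caveat of this design: the worker thread stays blocked in the background until the kernel gives up on the call, so a long-running monitor would probably want to run the probe in a separate process instead.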

Latency Measurement

NFS latency directly impacts application performance. The health checker measures the time taken for a statvfs call on each mount point. Latency above 100ms is flagged as a warning, since it usually points to network issues or an overloaded NFS server.
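The measurement described above can be sketched roughly as follows. The function names, the use of a median over several samples, and the sample count are assumptions made for this example; only the statvfs call and the 100ms threshold come from the text.

```python
import os
import statistics
import time

WARN_THRESHOLD_MS = 100.0  # warning threshold mentioned in the article


def statvfs_latency_ms(path, samples=5):
    """Return the median wall-clock time of os.statvfs(path) in milliseconds.

    Hypothetical helper; taking the median of a few samples is an assumption.
    """
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        os.statvfs(path)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def classify_latency(latency_ms, warn_ms=WARN_THRESHOLD_MS):
    """Flag latency strictly above the threshold as a warning."""
    return "warning" if latency_ms > warn_ms else "ok"
```

Note that this sketch has no timeout of its own, so on a stale hard mount it would block. In practice it would need to be combined with some form of timeout detection.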

Mount Option Auditing

NFS mount options significantly affect reliability and performance. The tool checks for potentially risky configurations such as soft mounts (which, according to the nfs(5) man page, can cause silent data corruption in certain cases) and missing attribute caching settings.
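As an illustration of what such an audit might look like, the sketch below scans mount-table text in the /proc/mounts format and reports NFS entries that use soft or noac. The function name, the severity labels, and the choice to flag noac as the attribute caching check are assumptions for this example, not the tool's documented rules.

```python
NFS_TYPES = {"nfs", "nfs4"}


def audit_mount_options(mounts_text):
    """Return (mountpoint, severity, message) tuples for risky NFS options.

    Expects text in /proc/mounts format:
    source mountpoint fstype options dump pass
    Hypothetical helper; the rules below are illustrative assumptions.
    """
    findings = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in NFS_TYPES:
            continue  # skip malformed lines and non-NFS filesystems
        mountpoint = fields[1]
        options = set(fields[3].split(","))
        if "soft" in options:
            findings.append(
                (mountpoint, "warning", "soft mount: requests can fail after retries run out")
            )
        if "noac" in options:
            findings.append(
                (mountpoint, "info", "noac: attribute caching is disabled")
            )
    return findings
```

On a Linux system, the input can be obtained by reading /proc/mounts and passing its contents to audit_mount_options().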

Throughput Testing

The throughput test writes and reads a test file on the NFS mount to measure actual I/O performance. This is more meaningful than just checking latency, as it reflects real-world file operation speeds including network overhead and server processing time.
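A write-then-read test of this kind can be approximated with the standard library alone. The sketch below is an assumption about how such a test could work, not the tool's code: the function name rough_throughput, the file size, and the result keys (modeled on the Python API example in this article) are illustrative.

```python
import os
import tempfile
import time


def rough_throughput(directory, size_mb=16, chunk_mb=1):
    """Write and read back a temporary file in `directory`, returning MB/s.

    Hypothetical helper for illustration only.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = size_mb // chunk_mb
    fd, path = tempfile.mkstemp(prefix=".nfs-throughput-", dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as fh:
            for _ in range(chunks):
                fh.write(chunk)
            fh.flush()
            os.fsync(fh.fileno())  # ask the OS to push the data to the server
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        total = 0
        with open(path, "rb") as fh:
            while True:
                data = fh.read(len(chunk))
                if not data:
                    break
                total += len(data)
        read_s = time.perf_counter() - start
    finally:
        os.unlink(path)  # always remove the test file

    megabytes = total / (1024 * 1024)
    return {
        "bytes": total,
        "write_mbps": round(megabytes / max(write_s, 1e-9), 1),
        "read_mbps": round(megabytes / max(read_s, 1e-9), 1),
    }
```

The read figure from a sketch like this is likely to be optimistic, because the data that was just written may be served from the client's page cache rather than from the server.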

Python API

from dargslan_nfs_health import NFSHealth

nh = NFSHealth()

# Check all NFS mounts
for mount in nh.check_all_mounts():
    status = "OK" if mount['accessible'] else "STALE" if mount['stale'] else "FAIL"
    print(f"[{status}] {mount['source']} -> {mount['mountpoint']}")
    # Compare against None so that a measured latency of 0 ms is still printed
    if mount.get('latency_ms') is not None:
        print(f"  Latency: {mount['latency_ms']}ms")

# Run throughput test
result = nh.measure_throughput("/mnt/nfs_share")
print(f"Write: {result['write_mbps']} MB/s, Read: {result['read_mbps']} MB/s")

# Full audit
for issue in nh.audit():
    print(f"[{issue['severity']}] {issue['message']}")

Common NFS Issues and Solutions

Stale file handle: Remount the share, or restart the NFS client services (the nfs-common service on Debian-based systems)
High latency: Check network connectivity and NFS server load, and consider using NFSv4.1 with session trunking
Permission denied: Verify export options, check root_squash settings, and ensure UID/GID mapping is correct
Timeouts: Increase the timeo mount option, check firewall rules, and verify that rpcbind is running (required for NFSv3; NFSv4 does not depend on it)
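When scripting around these problems, the error number raised by a failed file operation is often enough to pick the right entry from the list above. The mapping below is a small sketch under that assumption; the function names and hint wording are made up for this example, and real failures may surface with other error codes.

```python
import errno
import os

# Hypothetical mapping from errno values to the advice in the list above.
HINTS = {
    errno.ESTALE: "stale file handle: remount the share or restart the NFS client",
    errno.EACCES: "permission denied: review export options, root_squash, and UID/GID mapping",
    errno.ETIMEDOUT: "timeout: review the timeo option, firewall rules, and rpcbind",
}


def suggest_fix(exc):
    """Return a troubleshooting hint for an OSError, based on its errno."""
    return HINTS.get(exc.errno, f"unclassified error (errno {exc.errno})")


def diagnose(path):
    """Try to list `path`; return None on success or a hint on failure."""
    try:
        os.listdir(path)
    except OSError as exc:
        return suggest_fix(exc)
    return None
```

As with any direct access to an NFS path, diagnose() can block on a stale hard mount, so it is best called only after the mount has passed a timeout-protected check.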

Automation

# Monitor NFS health every 5 minutes
*/5 * * * * dargslan-nfs issues >> /var/log/nfs-health.log 2>&1
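The cron entry above appends raw output with no timestamps, and a check that hangs could overlap with the next run. One possible refinement is a small wrapper that enforces a timeout and prefixes each log line with the time of the run. The wrapper below is a sketch: the function name run_check, the 120 second default, and the exit codes 124 and 127 are assumptions, and it makes no claim about the exit codes that dargslan-nfs itself returns.

```python
import datetime
import subprocess


def run_check(cmd, log_path, timeout_s=120):
    """Run `cmd`, append its output to `log_path` with timestamps, return an exit code.

    Hypothetical wrapper for illustration; 124 and 127 follow common shell conventions.
    """
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        lines = (proc.stdout + proc.stderr).splitlines()
        code = proc.returncode
    except subprocess.TimeoutExpired:
        lines = [f"check timed out after {timeout_s}s"]
        code = 124
    except FileNotFoundError:
        lines = [f"command not found: {cmd[0]}"]
        code = 127

    with open(log_path, "a", encoding="utf-8") as log:
        for line in lines:
            log.write(f"{stamp} {line}\n")
    return code
```

A script called from cron could then invoke run_check(["dargslan-nfs", "issues"], "/var/log/nfs-health.log") and pass the result to sys.exit().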

Best Practices

Use hard mounts for data integrity. The intr option is deprecated and ignored on Linux kernels after 2.6.25, where only SIGKILL can interrupt a pending NFS operation
Set appropriate timeout values based on your network reliability
Monitor NFS latency trends to catch degradation early
Keep NFS client and server versions matched for best compatibility
Use NFSv4 when possible for better security and performance

Conclusion

NFS mount problems can cause cascading failures across your infrastructure. dargslan-nfs-health gives you proactive monitoring for stale mounts, latency issues, and misconfigured exports. Install it on every system that uses NFS storage.

For more infrastructure tools and Linux administration guides, visit dargslan.com.

NFS Mount Health Monitoring with Python: Detect Stale Mounts, Measure Latency, and Audit Exports (Free CLI Tool)

Dargslan Editorial Team (Dargslan)
