NFS (Network File System) is one of the most widely used network storage protocols in Linux environments. From shared home directories to application data stores, NFS mounts are critical infrastructure. But NFS problems (stale mounts, high latency, and misconfigured exports) can be notoriously difficult to diagnose.
dargslan-nfs-health is a free Python CLI tool that checks NFS mount health, detects stale mounts before they cause application hangs, measures I/O latency, and audits export configurations.
Quick Start
pip install dargslan-nfs-health
dargslan-nfs report # Full NFS health report
dargslan-nfs mounts # List all NFS mounts with status
dargslan-nfs exports # Show NFS exports
dargslan-nfs stats # NFS client statistics
dargslan-nfs throughput -m /mnt/share # I/O throughput test
dargslan-nfs issues # Detected issues
The Stale Mount Problem
A stale NFS mount occurs when the NFS server becomes unreachable but the mount point remains. Any process that tries to access a stale mount will hang indefinitely (with hard mounts) or get errors (with soft mounts). The tool uses os.statvfs() with timeout detection to identify stale mounts without hanging itself.
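The idea of probing a mount without hanging the checker can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the function name probe_mount, the 2-second default timeout, and the "ok" / "stale" / "error" labels are assumptions made for the example. The statvfs call runs in a daemon thread, and the caller gives up waiting once the timeout expires.

```python
import os
import threading


def probe_mount(path, timeout=2.0, probe=os.statvfs):
    """Run probe(path) in a daemon thread and wait at most `timeout` seconds.

    Returns "ok", "stale" (no answer in time) or "error" (probe raised OSError).
    The name, the labels and the default timeout are illustrative assumptions.
    """
    outcome = {}

    def worker():
        try:
            probe(path)
            outcome["status"] = "ok"
        except OSError:
            outcome["status"] = "error"

    # daemon=True so a thread stuck on an unresponsive mount
    # does not keep the interpreter alive at exit.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        return "stale"
    return outcome.get("status", "error")


if __name__ == "__main__":
    print(probe_mount(os.getcwd()))
```

The blocked thread is abandoned rather than cancelled, since Python cannot interrupt a thread that is waiting inside a system call. Running the probe in a separate process that can be killed is a common alternative.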
Latency Measurement
NFS latency directly impacts application performance. The health checker measures the time taken for a statvfs call on each mount point. Latency above 100 ms is flagged as a warning, because it usually points to network issues or an overloaded NFS server.
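A rough sketch of that measurement is shown below. The function names and the "ok" / "warning" labels are assumptions for illustration; only the statvfs timing and the 100 ms threshold come from the description above. The call here is unguarded and would hang on a stale hard mount, so a real checker would wrap it in a timeout.

```python
import os
import time

WARN_THRESHOLD_MS = 100.0  # warning threshold stated in the text


def statvfs_latency_ms(path):
    """Return the wall-clock time of one os.statvfs call, in milliseconds."""
    start = time.perf_counter()
    os.statvfs(path)
    return (time.perf_counter() - start) * 1000.0


def classify_latency(latency_ms, warn_ms=WARN_THRESHOLD_MS):
    """Label a latency sample: "warning" above the threshold, otherwise "ok"."""
    return "warning" if latency_ms > warn_ms else "ok"


if __name__ == "__main__":
    sample = statvfs_latency_ms(os.getcwd())
    print(f"{sample:.2f} ms -> {classify_latency(sample)}")
```

A single sample is noisy; averaging several calls, or tracking the value over time, gives a steadier signal.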
Mount Option Auditing
NFS mount options significantly affect reliability and performance. The tool checks for potentially dangerous configurations such as soft mounts (which can cause silent data corruption when a request times out) and missing attribute caching settings.
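One way such an audit could work is to read the option list of each NFS entry in /proc/mounts and flag known risky options. The sketch below is an assumption about the approach, not the tool's code: the function names are hypothetical, and "missing attribute caching" is interpreted here as the noac or actimeo=0 options being set.

```python
NFS_TYPES = {"nfs", "nfs4"}


def parse_nfs_mounts(mounts_text):
    """Yield (source, mountpoint, options) for NFS entries in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mountpoint, fstype, options, dump, pass
        if len(fields) >= 4 and fields[2] in NFS_TYPES:
            yield fields[0], fields[1], fields[3].split(",")


def audit_options(options):
    """Return a list of warning strings for one mount's option list."""
    warnings = []
    if "soft" in options:
        warnings.append("soft mount: a timed-out request may corrupt data silently")
    # Assumption: "missing attribute caching" means caching was switched off.
    if "noac" in options or "actimeo=0" in options:
        warnings.append("attribute caching disabled: expect extra GETATTR traffic")
    return warnings


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for source, mountpoint, options in parse_nfs_mounts(f.read()):
            for warning in audit_options(options):
                print(f"{source} on {mountpoint}: {warning}")
```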
Throughput Testing
The throughput test writes and reads a test file on the NFS mount to measure actual I/O performance. This is more meaningful than just checking latency, as it reflects real-world file operation speeds including network overhead and server processing time.
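A write-then-read test of this kind can be sketched as below. The function is a standalone illustration rather than the library's implementation; the 8 MB file size, the 1 MB chunk size, and the temporary file prefix are assumptions, and the result keys simply mirror the write_mbps and read_mbps names used in the Python API example.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=8, chunk_mb=1):
    """Write then read a temporary file in `directory`; return MB/s for each phase."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = size_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory, prefix=".nfs-throughput-")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # push data to the server before stopping the clock
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(len(chunk)):
                pass
        read_s = time.perf_counter() - start
    finally:
        os.unlink(path)  # always remove the test file
    total_mb = chunks * chunk_mb
    return {
        "write_mbps": total_mb / max(write_s, 1e-9),
        "read_mbps": total_mb / max(read_s, 1e-9),
    }


if __name__ == "__main__":
    print(measure_throughput(tempfile.gettempdir()))
```

Because the file was just written, the read phase is likely served from the client's page cache, so the read figure from a sketch like this is optimistic. A more careful test would drop caches or read a file written by another host.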
Python API
from dargslan_nfs_health import NFSHealth

nh = NFSHealth()

# Check all NFS mounts
for mount in nh.check_all_mounts():
    status = "OK" if mount['accessible'] else "STALE" if mount['stale'] else "FAIL"
    print(f"[{status}] {mount['source']} -> {mount['mountpoint']}")
    # Compare against None so a measured latency of 0 is still printed
    if mount.get('latency_ms') is not None:
        print(f"  Latency: {mount['latency_ms']}ms")

# Run throughput test
result = nh.measure_throughput("/mnt/nfs_share")
print(f"Write: {result['write_mbps']} MB/s, Read: {result['read_mbps']} MB/s")

# Full audit
for issue in nh.audit():
    print(f"[{issue['severity']}] {issue['message']}")
Common NFS Issues and Solutions
- Stale file handle: Remount the share, or restart the NFS client services (for example, the nfs-common service)
- High latency: Check network connectivity and NFS server load, and consider using NFS v4.1 with session trunking
- Permission denied: Verify export options, check root_squash settings, and ensure UID/GID mapping is correct
- Timeouts: Increase the timeo mount option, check firewall rules, and verify that rpcbind is running
Automation
# Monitor NFS health every 5 minutes
*/5 * * * * dargslan-nfs issues >> /var/log/nfs-health.log 2>&1
Best Practices
- Use hard mounts for data integrity; the intr option is deprecated and ignored on kernels after 2.6.25, where only SIGKILL can interrupt a pending NFS operation
- Set appropriate timeout values based on your network reliability
- Monitor NFS latency trends to catch degradation early
- Keep NFS client and server versions matched for best compatibility
- Use NFSv4 when possible for better security and performance
Conclusion
NFS mount problems can cause cascading failures across your infrastructure. dargslan-nfs-health gives you proactive monitoring for stale mounts, latency issues, and misconfigured exports. Install it on every system that uses NFS storage.
For more infrastructure tools and Linux administration guides, visit dargslan.com.