A read-only filesystem, a stale NFS mount, or a missing fstab entry can take down a service in seconds, and the symptoms rarely point at the mount itself. Database writes silently fail, log files truncate, deploy scripts overwrite the wrong directory. This guide covers the mount-monitoring techniques that catch these failures before alerts come from end users.
The right way to inspect mounts
Forget plain mount; findmnt is the modern replacement and renders the actual mount tree:
findmnt # tree view of every mount
findmnt /var/log # exact backing device + options
findmnt -t nfs,nfs4 # only network filesystems
findmnt --verify --verbose # validate fstab against the live system
The last command is gold: it parses /etc/fstab, checks every UUID/label resolves, every mount point exists, and every option is valid. Run it after every fstab change and as part of CI for golden images.
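As a rough sketch of the CI use, the wrapper below fails a build step when verification reports problems. The function name verify_fstab and its optional file argument are illustrative. It assumes that findmnt --verify exits non-zero when it finds errors and that --tab-file can point it at an image's fstab; both are worth confirming against your util-linux version.

```shell
# Hypothetical CI gate: fail the step when findmnt reports fstab errors.
# Assumes `findmnt --verify` exits non-zero when it finds errors.
# Usage: verify_fstab [FSTAB_PATH]
verify_fstab() {
    tab=${1:-/etc/fstab}
    if out=$(findmnt --verify --tab-file "$tab" 2>&1); then
        echo "OK $tab"
    else
        # Show findmnt's own diagnostics, then a line that is easy to grep for.
        printf '%s\n' "$out"
        echo "FAIL fstab verification: $tab"
        return 1
    fi
}
```

In a pipeline this would be called as verify_fstab /path/to/image/etc/fstab, with the non-zero return failing the job.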
Detecting read-only remounts
A failing disk often causes the kernel to remount the filesystem read-only. Detect it immediately:
awk '$4 ~ /(^|,)ro(,|$)/ {print $2}' /proc/mounts
findmnt -no OPTIONS / | grep -qE '(^|,)ro(,|$)' && echo 'ROOT IS READ-ONLY'
journalctl -k -p err -b | grep -E 'remount-ro|EXT4-fs error|XFS .*shutdown'
For each filesystem you care about, alert when the options change. Prometheus node_exporter's node_filesystem_readonly metric is the simplest off-the-shelf signal.
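Where node_exporter is not available, a watch list checked against /proc/mounts gives a similar signal. The sketch below is one way to do that; the function name check_readonly and the mount points in the usage line are illustrative, not taken from any tool.

```shell
# Report watched mount points that are currently mounted read-only.
# Usage: check_readonly MOUNTS_FILE MOUNTPOINT...
check_readonly() {
    mounts=$1; shift
    for mp in "$@"; do
        # Field 2 is the mount point, field 4 the option list. The last
        # matching line wins because later mounts shadow earlier ones.
        awk -v mp="$mp" '$2 == mp { opts = $4 }
            END { if (opts ~ /(^|,)ro(,|$)/) print "READ-ONLY " mp }' "$mounts"
    done
}
```

A typical call would be check_readonly /proc/mounts / /var /srv/data. Restricting the check to a watch list avoids noise from filesystems that are read-only by design, such as squashfs images.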
Stale NFS detection
Stale NFS handles hang any process that touches the path, often forever. The only reliable detection is a watchdog with a timeout:
timeout 5 ls /mnt/nfs >/dev/null 2>&1
rc=$? # capture once; timeout exits 124 when the deadline expires
case $rc in
0) echo 'OK' ;;
124) echo 'STALE: ls timed out' ;;
*) echo "ERROR exit $rc" ;;
esac
For multiple mounts, loop over the output of findmnt -t nfs,nfs4 -no TARGET and run the watchdog per target. Mounting with soft,timeo=30,retrans=2 (timeo is in tenths of a second) lets syscalls fail rather than hang forever, but that is only safe for read-mostly data; for primary writes, hard mounts remain correct, so pair them with monitoring.
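The per-target loop could look something like the following sketch. The function name check_nfs_targets and the PROBE_TIMEOUT variable are illustrative; the probe itself is the same timeout-wrapped ls used above.

```shell
# Probe each mount point read from stdin (one per line) with a hard deadline.
# PROBE_TIMEOUT is an illustrative knob, in seconds.
check_nfs_targets() {
    rc_all=0
    while IFS= read -r target; do
        [ -n "$target" ] || continue
        rc=0
        # </dev/null keeps the probe from consuming the target list on stdin.
        timeout "${PROBE_TIMEOUT:-5}" ls "$target" >/dev/null 2>&1 </dev/null || rc=$?
        case $rc in
            0)   echo "OK $target" ;;
            124) echo "STALE $target: ls timed out"; rc_all=1 ;;
            *)   echo "ERROR $target: exit $rc"; rc_all=1 ;;
        esac
    done
    return "$rc_all"
}
```

It would be fed with findmnt -t nfs,nfs4 -no TARGET | check_nfs_targets. Probing targets one at a time keeps the worst-case runtime at the timeout multiplied by the number of mounts, which matters if the check runs from a timer.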
Filesystem-specific health checks
Each major filesystem ships its own health tooling:
# ext4
sudo dumpe2fs -h /dev/sda1 | grep -E 'state|Lifetime|Mount count'
sudo tune2fs -l /dev/sda1 | grep 'Last checked'
# xfs
sudo xfs_info /var
sudo xfs_db -c 'frag -f' -r /dev/sdb1
sudo xfs_repair -n /dev/sdb1 # dry-run; the filesystem must be unmounted
# btrfs
sudo btrfs scrub start -B /
sudo btrfs device stats /
sudo btrfs filesystem usage /
Schedule periodic checks: ext4 every six months (tune2fs -i 6m), btrfs scrub monthly, XFS only on suspicion.
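A scheduled scrub only helps if something reads the result. As one possibility, the helper below turns btrfs device stats output into alertable lines. The name btrfs_error_counters is illustrative, and the sketch assumes the usual two-column layout of that command (a counter name such as [/dev/sda2].corruption_errs followed by its value).

```shell
# Print every non-zero error counter from `btrfs device stats` output on stdin.
# Returns 1 if any counter is non-zero, 0 otherwise.
btrfs_error_counters() {
    awk '$2 + 0 > 0 { print "FAIL btrfs " $1 " = " $2; bad = 1 }
         END { exit bad + 0 }'
}
```

Usage would be sudo btrfs device stats / | btrfs_error_counters, run after each scrub completes.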
Insecure mount options
Audit mount options for security baselines:
findmnt -rno TARGET,OPTIONS | awk '
$1 == "/tmp" && $2 !~ /nodev/ { print "FAIL nodev " $0 }
$1 == "/tmp" && $2 !~ /nosuid/ { print "FAIL nosuid " $0 }
$1 == "/tmp" && $2 !~ /noexec/ { print "FAIL noexec " $0 }
$1 == "/var/tmp" && $2 !~ /nodev/ { print "FAIL nodev " $0 }
$1 == "/dev/shm" && $2 !~ /nodev/ { print "FAIL nodev " $0 }'
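CIS benchmarks call for nodev, nosuid, and noexec on /tmp, /var/tmp, and /dev/shm; the requirements for /home are narrower (typically nodev, with nosuid in newer versions), so check the benchmark for your distribution. Add the options to /etc/fstab, then run mount -o remount /tmp to apply them without a reboot.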
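The per-path rules generalize to a small helper that takes a mount point and the options it must carry. This is a sketch under assumptions: audit_mount_options is a made-up name, and it expects the raw list output of findmnt -rno TARGET,OPTIONS on stdin so that the first field is the bare mount point.

```shell
# Check that a mount point carries every required option.
# Reads `findmnt -rno TARGET,OPTIONS` output on stdin.
# Usage: audit_mount_options MOUNTPOINT OPTION...
audit_mount_options() {
    mp=$1; shift
    awk -v mp="$mp" -v want="$*" '
        $1 == mp { found = 1; opts = "," $2 "," }
        END {
            # A path that is not its own mount cannot carry its own options.
            if (!found) { print "FAIL not-a-mount " mp; exit 1 }
            n = split(want, req, " ")
            for (i = 1; i <= n; i++)
                if (index(opts, "," req[i] ",") == 0) {
                    print "FAIL " req[i] " " mp
                    bad = 1
                }
            exit bad + 0
        }'
}
```

For example, findmnt -rno TARGET,OPTIONS | audit_mount_options /tmp nodev nosuid noexec prints one FAIL line per missing option. Matching on comma-delimited options avoids false passes from substrings, which a bare regex match on the option string can produce.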
Detecting missing mounts
A common pattern: fstab declares a mount, but it failed at boot. The mount point exists as a regular directory, and writes succeed silently to the root filesystem until it fills up:
expected=$(awk '$1 !~ /^#/ && $2 ~ /^\// {print $2}' /etc/fstab)
for mp in $expected; do
findmnt --target "$mp" --noheadings --output TARGET --first-only \
| grep -qx "$mp" || echo "MISSING $mp"
done
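Run it at every boot from a systemd unit; if the script reports anything, page someone before the disk fills. Entries marked noauto in fstab will be reported too, so filter them out if you use that option.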
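A variant of the check that skips swap and noauto entries might look like the sketch below. The function names are illustrative; it uses mountpoint(1) from util-linux, which exits non-zero when the path is only a plain directory.

```shell
# Print the mount points fstab expects at boot, skipping comments, swap,
# and entries marked noauto. Usage: expected_mounts [FSTAB]
expected_mounts() {
    awk '$1 !~ /^#/ && $2 ~ /^\// && $3 != "swap" &&
         $4 !~ /(^|,)noauto(,|$)/ { print $2 }' "${1:-/etc/fstab}"
}

# Report expected mount points that are not mounted.
# mountpoint -q exits non-zero when the path is not a mount point.
report_missing() {
    expected_mounts "$@" | while IFS= read -r mp; do
        mountpoint -q "$mp" || echo "MISSING $mp"
    done
}
```

Calling report_missing with no argument checks /etc/fstab; an empty result means every expected filesystem is mounted.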
Free space and inode monitoring
Disk-full and inode-exhaustion are different failures. Watch both:
df -h | awk 'NR>1 && $5+0 > 85 {print "SPACE " $0}'
df -i | awk 'NR>1 && $5+0 > 85 {print "INODES " $0}'
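For scripted use, the same test can be wrapped so that the threshold and label are parameters. The sketch below assumes POSIX-format output (df -P, or df -Pi on GNU systems), which keeps each filesystem on one line; the name over_threshold is illustrative, and mount points containing spaces are not handled.

```shell
# Flag filesystems above a usage threshold.
# Reads POSIX-format df output (`df -P` or `df -Pi`) on stdin.
# Usage: over_threshold LABEL [LIMIT_PERCENT]
over_threshold() {
    awk -v label="$1" -v limit="${2:-85}" \
        'NR > 1 && $5 + 0 > limit + 0 { print label " " $6 " " $5 }'
}
```

Typical calls would be df -P | over_threshold SPACE 85 and df -Pi | over_threshold INODES 85.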
Putting it together: a 30-line monitor
A single shell script combining the above checks, run by a systemd timer every five minutes, covers the large majority of mount failures most teams ever hit. Pipe its output to your log shipper, and alert on anything that contains FAIL, STALE, MISSING, or READ-ONLY.
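One possible shape for that script is sketched below; it runs somewhat past 30 lines once comments are included. Everything configurable here (FSTAB, MOUNTS, LIMIT, PROBE_TIMEOUT, and the function names) is an assumption made for illustration. It flags any fstab-declared mount that is read-only, so intentionally read-only entries would need an exclusion.

```shell
#!/bin/sh
# Sketch of a combined mount monitor. FSTAB, MOUNTS, LIMIT, and PROBE_TIMEOUT
# are illustrative knobs, not settings from any existing tool.
FSTAB=${FSTAB:-/etc/fstab}
MOUNTS=${MOUNTS:-/proc/mounts}
LIMIT=${LIMIT:-85}
PROBE_TIMEOUT=${PROBE_TIMEOUT:-5}

run_checks() {
    # Missing or read-only: every fstab mount point except swap and noauto.
    awk '$1 !~ /^#/ && $2 ~ /^\// && $3 != "swap" &&
         $4 !~ /(^|,)noauto(,|$)/ { print $2 }' "$FSTAB" |
    while IFS= read -r mp; do
        awk -v mp="$mp" '$2 == mp { found = 1; opts = $4 }
            END { if (!found) print "MISSING " mp
                  else if (opts ~ /(^|,)ro(,|$)/) print "READ-ONLY " mp }' "$MOUNTS"
    done

    # Stale NFS: probe each mounted NFS filesystem with a deadline.
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "$MOUNTS" |
    while IFS= read -r mp; do
        rc=0
        timeout "$PROBE_TIMEOUT" ls "$mp" >/dev/null 2>&1 </dev/null || rc=$?
        case $rc in
            0)   ;;
            124) echo "STALE $mp" ;;
            *)   echo "FAIL probe $mp exit $rc" ;;
        esac
    done

    # Space and inodes, using POSIX-format df output.
    df -P 2>/dev/null |
        awk -v l="$LIMIT" 'NR > 1 && $5 + 0 > l + 0 { print "FAIL space " $6 " " $5 }'
    df -Pi 2>/dev/null |
        awk -v l="$LIMIT" 'NR > 1 && $5 + 0 > l + 0 { print "FAIL inodes " $6 " " $5 }'
}

# Print findings and return non-zero if there are any.
mount_monitor() {
    report=$(run_checks)
    if [ -n "$report" ]; then
        printf '%s\n' "$report"
        return 1
    fi
}
```

To use it as a standalone script, add a final line that calls mount_monitor. The non-zero return on any finding lets the systemd service show as failed, which gives a second alerting path in addition to the log lines.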
Common pitfalls
- Mounting NFS in fstab without the _netdev option: boot hangs forever waiting for network.
- Forgetting to update initramfs after changing root device UUID: drops to emergency shell.
- Setting noatime globally on a filesystem some application requires atime on (looking at you, mutt).
- Relying on directory existence as a "mount succeeded" signal.
Mount monitoring is the kind of work that prevents incidents instead of resolving them, and prevention almost never appears in your weekly metrics. Build the 30-line monitor anyway; the first time it catches a stale NFS mount before users notice, it pays for itself for the next decade.