XFS Filesystem Health Check: Complete Linux Guide

XFS is the default filesystem on RHEL, Rocky Linux, and AlmaLinux, a common choice for large-scale storage workloads, and one of the most reliable filesystems Linux ships. But "reliable" does not mean "self-monitoring": you still need to watch fragmentation, inode usage, and the kernel logs for early signs of trouble. This guide covers the XFS-specific health commands, the metrics worth alerting on, and the recovery workflow when something does go wrong.

Confirm what you have

df -hT                                  # filesystem types per mount
findmnt -t xfs                          # all XFS mounts
sudo xfs_info /var                       # block size, sectsz, inode size, log size
sudo xfs_db -r -c 'sb 0' -c 'p' /dev/sda1 | head      # superblock fields

Healthy xfs_info output shows the volume size in blocks, the inode size (commonly 512 bytes), the log location and size, and whether reflink and crc are enabled. crc=1 is the modern default: every metadata block is checksummed, so silent metadata corruption is detected. File data itself is not checksummed.
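
To check those flags from a script rather than by eye, the xfs_info output can be split into name=value tokens. The following is a minimal sketch: the helper name xfs_feature is hypothetical, and the sample text is an abbreviated illustration of the output layout, which varies between xfsprogs versions.

```shell
#!/bin/bash
# Print the value of one feature flag (e.g. crc, reflink) from xfs_info
# output supplied on stdin. Prints nothing if the flag is not present.
xfs_feature() {
  local name=$1
  # Flags appear as name=value tokens separated by spaces or commas.
  tr ' ,' '\n\n' | sed -n "s/^${name}=\([0-9][0-9]*\)$/\1/p" | head -n 1
}

# Illustrative sample only; real output has more fields.
sample='meta-data=/dev/sda1 isize=512 agcount=4, agsize=655360 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1'

echo "crc=$(printf '%s\n' "$sample" | xfs_feature crc)"
echo "reflink=$(printf '%s\n' "$sample" | xfs_feature reflink)"
```

Against a live mount the same helper would be fed directly: sudo xfs_info /var | xfs_feature crc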

Free space and inodes

XFS allocates inodes dynamically, so you cannot exhaust them as easily as on ext4, but it is still possible:

df -h /var
df -i /var
sudo xfs_quota -x -c 'free -h' /var
sudo xfs_quota -x -c 'report -h' /var          # if quotas enabled

For "no space left on device" with disk free, check inodes; for "filesystem is full" with inodes free, check space. Both can happen on XFS.
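
The space-versus-inode distinction can be turned into a single check. This sketch assumes POSIX-format df output (df -P and df -Pi), where the fifth column is the use percentage; the function names and the 90% default threshold are arbitrary illustrations, not recommendations from XFS itself.

```shell
#!/bin/bash
# Extract the use percentage (column 5) from `df -P` or `df -Pi` output on stdin.
# Assumes the device name contains no spaces and the column is numeric.
df_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report which resource is at or above the limit: space, inodes, both, or neither.
classify_usage() {
  local space=$1 inodes=$2 limit=${3:-90}
  if [ "$space" -ge "$limit" ] && [ "$inodes" -ge "$limit" ]; then
    echo "space and inodes"
  elif [ "$space" -ge "$limit" ]; then
    echo "space"
  elif [ "$inodes" -ge "$limit" ]; then
    echo "inodes"
  else
    echo "ok"
  fi
}
```

A typical call would be: classify_usage "$(df -P /var | df_pct)" "$(df -Pi /var | df_pct)"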

Fragmentation reporting

sudo xfs_db -c 'frag -f' -r /dev/sda1
sudo xfs_db -c 'frag' -r /dev/sda1
sudo xfs_bmap -v /var/lib/postgresql/main/base/16384/12345 | head

The frag -f command in xfs_db reports the fragmentation factor for regular files (the higher, the worse). Healthy: under 5%; concerning: above 30%; bad: above 50%. Note that XFS is generally less prone to fragmentation than ext4 on aged filesystems, so high values usually indicate either a write-heavy workload or aggressive snapshotting.
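
The thresholds above can be applied mechanically to the frag output. This is a sketch under two assumptions: that the output line has the form "actual N, ideal M, fragmentation factor P%", and that the factor is (actual - ideal) / actual. The helper name frag_level and the "elevated" label for the unnamed 5% to 30% band are illustrative additions.

```shell
#!/bin/bash
# Read xfs_db frag output on stdin and print "<percent> <level>".
frag_level() {
  awk '
    /fragmentation factor/ {
      gsub(/[,%]/, "")          # strip punctuation so the fields are plain numbers
      actual = $2; ideal = $4
      if (actual == 0) { print "0.00 healthy"; exit }
      pct = (actual - ideal) * 100 / actual
      level = "elevated"        # 5% to 30%: not named in the thresholds above
      if (pct < 5) level = "healthy"
      else if (pct > 50) level = "bad"
      else if (pct > 30) level = "concerning"
      printf "%.2f %s\n", pct, level
      exit
    }'
}

# Example with a made-up reading:
echo "actual 400, ideal 100, fragmentation factor 75.00%" | frag_level
```

On a real device the input would come from: sudo xfs_db -r -c 'frag -f' /dev/sda1 | frag_level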

Defragmentation

sudo xfs_fsr                                   # online filesystem reorganiser
sudo xfs_fsr -v -t 600 /var                    # 10-minute time-boxed run
sudo xfs_fsr -v /var/lib/postgresql/main/base/16384/12345    # one file

xfs_fsr runs while the filesystem is mounted and writeable. With no arguments it works through every mounted XFS filesystem listed in /etc/mtab and stops after two hours by default; -t caps the runtime so it can be scheduled in maintenance windows.

Detecting filesystem errors

sudo dmesg -T | grep -iE 'XFS .*error|XFS .*shutdown|XFS .*corrupt'
sudo journalctl -k -p err -b | grep -i xfs
findmnt /var -no OPTIONS                       # look for 'ro,' or 'shutdown'

XFS shuts down a filesystem on unrecoverable corruption, after which subsequent writes return I/O errors. The kernel log is the primary signal; alert on any line matching XFS .*shutdown.
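
For alerting, the match can be wrapped in a filter whose exit status drives the alert. This sketch widens the pattern to also catch "Shutting down", since the exact wording of kernel messages differs between kernel versions. The function name and the sample log lines are illustrative, not captured from a real system.

```shell
#!/bin/bash
# Print kernel log lines (from stdin) that suggest an XFS shutdown or corruption.
# Exits non-zero when nothing matches, so it can be used directly in an `if`.
xfs_alert_lines() {
  grep -iE 'XFS.*(shut(ting)? ?down|corrupt)'
}

# Illustrative input; only the first line should be reported.
sample='[Mon Jan  1 00:00:00 2024] XFS (sda1): Corruption of in-memory data detected.  Shutting down filesystem
[Mon Jan  1 00:00:01 2024] XFS (sda1): Mounting V5 Filesystem
[Mon Jan  1 00:00:02 2024] EXT4-fs (sdb1): mounted filesystem'

printf '%s\n' "$sample" | xfs_alert_lines
```

In a monitoring job the input would be the live log, for example: if sudo dmesg -T | xfs_alert_lines; then echo "XFS alert"; fi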

Online consistency check

XFS supports an online scrub (kernel 4.15+ with experimental support; production-stable on RHEL 8.4+):

sudo xfs_scrub /var                            # online check
sudo xfs_scrub_all                             # all mounted XFS filesystems
sudo systemctl enable --now xfs_scrub_all.timer  # weekly scheduled scrub

Read-only scrub does not interrupt operations and emits warnings for problems detected. For repair, the filesystem must be unmounted.
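
When scrub runs from a timer, its exit status is easier to act on than its text output. The sketch below decodes the status as a bitmask, following the values documented in xfs_scrub(8) (1, 2, 4, 8); confirm them against the man page for your installed version. The helper name scrub_status is hypothetical.

```shell
#!/bin/bash
# Decode an xfs_scrub exit status into readable conditions, joined by ';'.
scrub_status() {
  local rc=$1 out=()
  [ "$rc" -eq 0 ] && { echo "clean"; return; }
  (( rc & 1 )) && out+=("errors left uncorrected")
  (( rc & 2 )) && out+=("optimizations possible")
  (( rc & 4 )) && out+=("operational error")
  (( rc & 8 )) && out+=("usage error")
  local IFS=';'
  echo "${out[*]}"
}

# Example: a status of 5 combines two conditions.
scrub_status 5
```

After a check-only run the call would look like: sudo xfs_scrub -n /var; scrub_status $?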

Repair procedure

If xfs_scrub reports problems or the kernel shut the filesystem down:

sudo umount /var
sudo xfs_repair -n /dev/sda1                   # dry-run, reports what would be done
sudo xfs_repair /dev/sda1                      # actual repair
sudo mount /var
sudo dmesg -T | tail

Always run xfs_repair -n first. Never run xfs_repair on a mounted filesystem; the result is data loss. If the log is corrupted, xfs_repair -L zeroes the log and potentially loses recent writes, so use it only as a last resort.
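
Both rules can be enforced by a small guard that runs before any repair. This is a sketch, not a complete tool: is_mounted and repair_precheck are hypothetical names, device names are compared literally (so a /dev/mapper alias and its /dev/dm-* name count as different), and the guard deliberately stops after the dry run instead of repairing.

```shell
#!/bin/bash
# Succeed if device $1 appears as a mount source in the mounts table $2
# (defaults to /proc/mounts).
is_mounted() {
  local dev=$1 table=${2:-/proc/mounts}
  awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

# Refuse mounted devices, then run only the dry-run pass.
repair_precheck() {
  local dev=$1 table=${2:-/proc/mounts}
  if is_mounted "$dev" "$table"; then
    echo "refusing: $dev is mounted" >&2
    return 2
  fi
  # xfs_repair -n exits non-zero when it detects corruption.
  if xfs_repair -n "$dev"; then
    echo "no corruption reported on $dev"
  else
    echo "problems reported on $dev; back up, then run xfs_repair" >&2
    return 1
  fi
}
```

Run as root after unmounting: repair_precheck /dev/sda1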

Backup before repair

For irreplaceable data:

sudo xfs_metadump /dev/sda1 /backup/sda1.metadump   # metadata-only snapshot
sudo dd if=/dev/sda1 of=/backup/sda1.img bs=1M status=progress  # full image (large)

The metadump is small (megabytes rather than terabytes), captures the filesystem structure without file contents, and lets upstream XFS developers reproduce the corruption you report.

Performance tuning

# /etc/fstab options that often help
UUID=...  /var  xfs  defaults,noatime,nodiratime,inode64,allocsize=16m  0 2
sudo mount -o remount,noatime /var
sudo xfs_io -c 'extsize 16m' /var/lib/big-files

inode64 lets inodes be placed anywhere in the filesystem (the default on modern XFS); allocsize=16m reduces fragmentation by allocating in 16 MB chunks for streaming writes. noatime already implies nodiratime, so listing both is harmless but redundant.

Quotas

# XFS quota options take effect only at mount time, not on a remount
sudo umount /var && sudo mount -o uquota,gquota,pquota /var
sudo xfs_quota -x -c 'limit bsoft=4g bhard=5g user1' /var
sudo xfs_quota -x -c 'report -h' /var

XFS quotas are first-class: user, group, and project quotas are tracked separately, and project quotas can be applied to individual directory trees.
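
To find who has crossed a soft limit, the block report can be filtered by comparing the Used and Soft columns. This sketch assumes the report is produced without -h, so the values are plain numbers, and that the columns are ID, Used, Soft, Hard. The function name and the sample report are illustrative.

```shell
#!/bin/bash
# Print the IDs whose block usage exceeds a non-zero soft limit.
# Expects `xfs_quota -x -c 'report' <mount>` output (numeric, no -h) on stdin.
over_soft_limit() {
  awk '$2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $3 + 0 > 0 && $2 + 0 > $3 + 0 { print $1 }'
}

# Illustrative sample; header and separator lines are skipped automatically
# because their second column is not a number.
sample='User quota on /var (/dev/sda1)
                               Blocks
User ID          Used       Soft       Hard    Warn/Grace
---------- --------------------------------------------------
root                0          0          0     00 [--------]
user1         4500000    4194304    5242880     00 [6 days]
user2          100000    4194304    5242880     00 [--------]'

printf '%s\n' "$sample" | over_soft_limit
```

On a live system: sudo xfs_quota -x -c 'report' /var | over_soft_limit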

The audit script

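#!/bin/bash
# Summarise space, inode usage, read-only state, and kernel errors per XFS mount.
# Read SOURCE and TARGET as a pair; a plain for-loop would split them apart.
findmnt -rn -t xfs -o SOURCE,TARGET | while read -r src tgt; do
  echo "== $tgt ($src) =="
  df -h "$tgt" | tail -1
  df -i "$tgt" | tail -1
  # Match 'ro' only as a whole mount option, including when it is the only one.
  if findmnt -no OPTIONS "$tgt" | grep -qE '(^|,)ro(,|$)'; then
    echo "  WARN: read-only"
  fi
  # Kernel messages use the short device name, e.g. "XFS (sda1): ...".
  dev=$(basename "$(readlink -f "$src")")
  err=$(dmesg -T 2>/dev/null | grep -c "XFS ($dev).*error")
  [ "$err" -gt 0 ] && echo "  WARN: $err recent error lines"
done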

echo
echo "== Recent XFS shutdown events =="
sudo dmesg -T | grep -i 'XFS.*shutdown' | tail

Common pitfalls

  • Running xfs_repair on a mounted filesystem, which corrupts data. Always unmount first.
  • Using xfs_repair -L as a routine fix; it discards the log and can lose recent writes.
  • Forgetting that XFS cannot be shrunk; growing is supported via xfs_growfs, shrinking requires recreate-and-restore.
  • Trusting df -i on dynamic-allocation filesystems; XFS reports current inode count, not maximum.

XFS earns its reputation as quiet and dependable, but quiet does not mean unattended. Schedule a weekly xfs_scrub, alert on shutdown events in dmesg, and keep an xfs_metadump handy for the day you need to talk to the kernel mailing list. Five minutes per host per week prevents the rare-but-catastrophic XFS incident.

Dargslan Editorial Team (Dargslan)