XFS: High-Performance Filesystem


Explore XFS - the high-performance 64-bit journaling filesystem optimized for large files and parallel I/O. Learn why it excels at handling massive data workloads.


What is XFS?

XFS is a high-performance 64-bit journaling filesystem created by Silicon Graphics (SGI) in 1993 for their IRIX operating system. Ported to Linux in 2001, XFS excels at parallel I/O and large files, and has been the default filesystem for Red Hat Enterprise Linux since RHEL 7.

Think of XFS as the Formula 1 car of filesystems - built for speed, especially when dealing with large files and multiple concurrent operations.

Key Design Goals

XFS was designed with specific objectives:

  1. Extreme scalability - Support massive files and filesystems
  2. High performance - Maximize throughput for large operations
  3. Efficient space usage - Minimize fragmentation
  4. Parallel operations - Scale with multiple CPUs/cores

XFS Architecture

Allocation Groups (AGs)

XFS divides the filesystem into allocation groups - independent regions that can be managed in parallel:

# XFS Allocation Groups Structure:
# Filesystem (1TB)
# ├── AG 0 (256GB)
# │   ├── Superblock
# │   ├── Free space B+trees
# │   ├── Inode B+trees
# │   └── Data blocks
# ├── AG 1 (256GB)
# ├── AG 2 (256GB)
# └── AG 3 (256GB)
#
# Benefits:
# - Parallel operations across AGs
# - Reduced lock contention
# - Better CPU cache utilization
# - Scalable performance
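The split in the diagram can be reproduced with a little shell arithmetic. This is only a sketch: `mkfs.xfs` normally chooses `agcount` itself based on device size and geometry, and the function name here is made up for illustration.

```shell
# Size of each allocation group for a given filesystem size and AG count,
# mirroring the 1 TiB / 4 AG layout shown above.
ag_size_gib() {
    local fs_gib=$1 agcount=$2
    echo $(( fs_gib / agcount ))
}

ag_size_gib 1024 4    # 1 TiB across 4 AGs -> 256 GiB per AG
```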

B+tree Everything

XFS uses B+trees for all metadata:

  • Free space management - Two B+trees per AG
  • Inode allocation - B+tree indexes
  • Directory entries - B+tree for large directories
  • Extent maps - B+tree for file extents

Creating and Managing XFS

Creating XFS Filesystem

# Basic creation
sudo mkfs.xfs /dev/sdb1

# With options: label, 4K blocks, metadata checksums, free inode B+tree
# (crc=1 and finobt=1 are the defaults in modern xfsprogs)
sudo mkfs.xfs -L "DataDrive" -b size=4096 -m crc=1,finobt=1 /dev/sdb1

# For RAID arrays (stripe-aligned): 64KB stripe unit, 4 data disks
sudo mkfs.xfs -d su=64k,sw=4 /dev/md0

# For SSDs (mkfs.xfs TRIMs the device by default; use -K to skip the discard)
sudo mkfs.xfs -m crc=1,finobt=1 /dev/nvme0n1

# Large filesystem optimizations: more AGs for parallelism
sudo mkfs.xfs -d agcount=32 /dev/sdb
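For the RAID case, the stripe width XFS aligns allocations to is simply the stripe unit times the number of data disks. A quick sketch of that arithmetic, using the su=64k, sw=4 values from the example:

```shell
# Stripe alignment arithmetic for su=64k, sw=4.
# xfs_info reports these values as sunit/swidth in 512-byte sectors.
su_kib=64                                  # stripe unit (mkfs.xfs -d su=64k)
sw=4                                       # data disks  (mkfs.xfs -d sw=4)

stripe_width_kib=$(( su_kib * sw ))        # full stripe in KiB
sunit_sectors=$(( su_kib * 1024 / 512 ))   # stripe unit in sectors
swidth_sectors=$(( sunit_sectors * sw ))   # stripe width in sectors

echo "$stripe_width_kib $sunit_sectors $swidth_sectors"   # 256 128 512
```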

Mount Options

# Basic mount
sudo mount /dev/sdb1 /mnt/data

# Performance options (nobarrier was removed in kernel 4.19 - avoid it)
sudo mount -o noatime,nodiratime /dev/sdb1 /mnt/data

# For databases
sudo mount -o noatime,nodiratime,largeio,swalloc /dev/sdb1 /mnt/db

# SSD optimizations
sudo mount -o noatime,discard /dev/sdb1 /mnt/ssd

# Large-filesystem options
sudo mount -o noatime,inode64,allocsize=16m /dev/sdb1 /mnt/data

Mount Options Explained

noatime      # Don't update file access times
nodiratime   # Don't update directory access times
nobarrier    # Disable write barriers (removed in kernel 4.19; barriers are always on now)
largeio      # Report preferred I/O size based on stripe width
inode64      # Allow inodes anywhere on disk (default since kernel 3.7)
allocsize=   # Preallocation size for extending writes
swalloc      # Stripe-width aligned allocation
discard      # Enable online TRIM for SSDs
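To see which options are actually in effect (defaults such as inode64 never appear in fstab), the live mount table can be read directly. A small sketch - `mount_opts` is a hypothetical helper, and the output depends on your system:

```shell
# Print the active mount options for a given mount point
# by parsing /proc/mounts (field 2 = mount point, field 4 = options).
mount_opts() {
    awk -v mp="$1" '$2 == mp { print $4 }' /proc/mounts
}

mount_opts /    # e.g. rw,relatime,...
```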

Key Features

1. Extent-based Allocation

XFS uses extents (contiguous blocks) instead of individual blocks:

# Traditional block allocation (ext3):
# File: [block 1000][block 1001][block 1002]...[block 2000]
# Metadata: 1001 entries
#
# XFS extent allocation:
# File: [extent: start=1000, length=1001]
# Metadata: 1 entry
#
# Benefits:
# - Less metadata overhead
# - Better performance for large files
# - Reduced fragmentation
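The metadata savings scale dramatically with file size. A rough back-of-the-envelope sketch (the figures are illustrative, assuming a fully contiguous file):

```shell
# Mapping entries for a fully contiguous 4 GiB file with 4 KiB blocks:
# per-block mapping needs one entry per block, extent mapping needs one record.
file_mib=4096
block_kib=4

block_entries=$(( file_mib * 1024 / block_kib ))   # one entry per block
extent_entries=1                                   # a single extent record

echo "blocks: $block_entries  extents: $extent_entries"
```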

2. Delayed Allocation

XFS delays block allocation until data is flushed:

# Without delayed allocation:
# 1. Write() called → Allocate blocks immediately
# 2. Data in page cache
# 3. Flush to disk → May be fragmented
#
# With delayed allocation (XFS):
# 1. Write() called → Reserve space only
# 2. Data accumulated in page cache
# 3. Flush to disk → Allocate contiguous blocks

3. Direct I/O Support

Bypass page cache for database workloads:

// Open file with O_DIRECT (the flag requires _GNU_SOURCE on Linux)
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int fd = open("/mnt/xfs/database.db", O_RDWR | O_DIRECT);

// O_DIRECT requires the buffer, file offset, and length to be aligned
// to the logical block size (commonly 512 bytes or 4 KB)
void *buffer;
posix_memalign(&buffer, 4096, 4096);

// Direct read, bypassing the page cache
read(fd, buffer, 4096);

Performance Features

Parallel I/O

XFS excels at concurrent operations:

# Test parallel write performance
for i in {1..16}; do
    dd if=/dev/zero of=/mnt/xfs/file$i bs=1G count=10 &
done
wait

# XFS handles this efficiently due to:
# - Multiple allocation groups
# - Per-AG locks
# - Parallel metadata operations

Preallocation

Reserve space for files:

# Preallocate space
fallocate -l 10G /mnt/xfs/largefile

# Or using xfs_io
xfs_io -f -c "falloc 0 10g" /mnt/xfs/largefile

# Benefits:
# - Space reserved up front (typically as a few large extents)
# - Minimal fragmentation
# - Prevents ENOSPC during writes
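fallocate works on any filesystem that supports it, so the effect can be tried on a scratch file. The paths here are illustrative; on XFS the reserved space typically lands as one or a few contiguous extents:

```shell
# Preallocate 1 MiB and confirm the size: unlike a sparse truncate,
# fallocate actually allocates blocks (du reports real usage).
tmp=$(mktemp)
fallocate -l 1M "$tmp"
stat -c %s "$tmp"     # 1048576
du -k "$tmp"          # allocated kilobytes, not just apparent size
rm -f "$tmp"
```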

Real-time Subvolume

Separate device for guaranteed bandwidth:

# Create with real-time subvolume
sudo mkfs.xfs -r rtdev=/dev/sdc1 /dev/sdb1

# Mount with real-time device
sudo mount -o rtdev=/dev/sdc1 /dev/sdb1 /mnt

# Mark a file as real-time (the flag must be set before data is written)
xfs_io -f -c "chattr +r" /mnt/realtime-file

XFS Tools and Utilities

xfs_info - Filesystem Information

# Show filesystem information
sudo xfs_info /mnt

# Output includes:
# - Block size and count
# - AG count and size
# - Inode information
# - Log information
# - Realtime configuration

xfs_growfs - Expand Filesystem

# Grow filesystem to fill partition
sudo xfs_growfs /mnt

# Grow to specific size (in filesystem blocks)
sudo xfs_growfs -D 1000000 /mnt

# Note: XFS can only grow, not shrink!

xfs_repair - Filesystem Repair

# Check filesystem (read-only, no changes made)
sudo xfs_repair -n /dev/sdb1

# Repair filesystem (must be unmounted)
sudo umount /mnt
sudo xfs_repair /dev/sdb1

# Force repair by zeroing the log (dangerous - discards pending transactions!)
sudo xfs_repair -L /dev/sdb1

# Repair with memory limit (in MB)
sudo xfs_repair -m 2048 /dev/sdb1

xfs_db - Debug Filesystem

# Open filesystem debugger
sudo xfs_db /dev/sdb1

# Commands in xfs_db:
xfs_db> sb 0      # Show superblock
xfs_db> frag      # Show fragmentation
xfs_db> freesp    # Show free space
xfs_db> quit

xfs_fsr - Defragmentation

# Defragment filesystem
sudo xfs_fsr /mnt

# Defragment specific file
sudo xfs_fsr /mnt/fragmented-file

# Verbose output
sudo xfs_fsr -v /mnt

xfs_freeze - Freeze Filesystem

# Freeze filesystem (for snapshots)
sudo xfs_freeze -f /mnt

# Unfreeze
sudo xfs_freeze -u /mnt

# Use case: LVM snapshots
sudo xfs_freeze -f /mnt
sudo lvcreate -s -n snapshot -L 10G /dev/vg/lv
sudo xfs_freeze -u /mnt

Performance Optimization

For Large Files

# Mount options for media servers
mount -o allocsize=1g,largeio /dev/sdb1 /mnt/media

# Increase readahead (value is in 512-byte sectors)
blockdev --setra 4096 /dev/sdb1

# Use extent size hints (set before the file contains data)
xfs_io -c "extsize 1g" /mnt/media/video.mp4

For Databases

# Create optimized filesystem
mkfs.xfs -b size=4096 -d agcount=16 /dev/sdb1

# Mount with database-friendly options
# (nobarrier was removed in kernel 4.19 - do not use it on modern systems)
mount -o noatime,nodiratime,logbufs=8 /dev/sdb1 /mnt/db

# Set extent size hint for database files
xfs_io -c "extsize 64k" /mnt/db/table.ibd

For Many Small Files

# More allocation groups
mkfs.xfs -d agcount=64 /dev/sdb1

# Enable free inode B+tree
mkfs.xfs -m finobt=1 /dev/sdb1

# Mount options
mount -o noatime,inode64 /dev/sdb1 /mnt

XFS Quotas

Project Quotas

Unique to XFS - directory tree quotas:

# Enable project quotas at mount
mount -o prjquota /dev/sdb1 /mnt

# Define the project (ID 10 mapped to a directory tree)
echo "10:/mnt/project1" >> /etc/projects
echo "project1:10" >> /etc/projid

# Initialize project
xfs_quota -x -c "project -s project1" /mnt

# Set limits
xfs_quota -x -c "limit -p bsoft=5g bhard=10g project1" /mnt

# Check usage
xfs_quota -c "report -h" /mnt

User and Group Quotas

# Enable quotas
mount -o usrquota,grpquota /dev/sdb1 /mnt

# Set user quota
xfs_quota -x -c "limit -u bsoft=5g bhard=10g alice" /mnt

# Set group quota
xfs_quota -x -c "limit -g bsoft=50g bhard=100g developers" /mnt

# Report quotas
xfs_quota -c "report -h" /mnt

Backup and Recovery

xfsdump and xfsrestore

Native backup tools for XFS:

# Full backup
sudo xfsdump -f /backup/full.dump -L "Full Backup" -M "Tape1" /mnt

# Incremental backup (level 1)
sudo xfsdump -l 1 -f /backup/incr.dump -L "Incremental" /mnt

# Restore full backup
sudo xfsrestore -f /backup/full.dump /mnt/restore

# Interactive restore
sudo xfsrestore -i -f /backup/full.dump /mnt/restore

# List contents
sudo xfsrestore -t -f /backup/full.dump

Metadata Dumps

# Dump metadata for analysis (file data is not included)
xfs_metadump /dev/sdb1 metadata.dump

# Restore metadata (testing only!)
xfs_mdrestore metadata.dump /dev/sdc1

Troubleshooting XFS

Common Issues

1. "No space left" with free space

# Check inode usage
df -i /mnt

# Check for deleted but still-open files
lsof +L1 /mnt

# Clear reserved blocks
xfs_io -x -c "resblks 0" /mnt

2. Mount fails after crash

# Try mounting read-only without log recovery
mount -o ro,norecovery /dev/sdb1 /mnt

# Zero the log if it is corrupted (discards pending transactions)
xfs_repair -L /dev/sdb1

# Then mount normally
mount /dev/sdb1 /mnt

3. Poor performance

# Check fragmentation
xfs_db -c frag -r /dev/sdb1

# Defragment if needed
xfs_fsr -v /mnt

# Check for allocation group imbalance
xfs_info /mnt | grep agcount
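The factor that `xfs_db -c frag` prints is derived from the actual extent count versus the ideal (roughly one extent per file). A hedged sketch of that formula, with made-up extent counts:

```shell
# Fragmentation factor as reported by xfs_db's "frag" command:
#   factor = (actual extents - ideal extents) / actual extents * 100
frag_factor() {
    local actual=$1 ideal=$2
    echo $(( (actual - ideal) * 100 / actual ))   # integer %; xfs_db prints decimals
}

frag_factor 1250 1000    # 20 (%)
frag_factor 1000 1000    # 0 - fully defragmented
```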

XFS vs Other Filesystems

Performance Comparison

# Performance Comparison (illustrative):
#
# Large Sequential Writes:
# XFS   ████████████████████ 100%
# ext4  ███████████████████  95%
# Btrfs ████████████████     80%
# ZFS   ███████████████      75%
#
# Parallel I/O:
# XFS   ████████████████████ 100%
# ext4  ████████████████     80%
# Btrfs ███████████████      75%
# ZFS   ██████████████       70%
#
# Metadata Operations:
# ext4  ████████████████████ 100%
# XFS   ██████████████████   90%
# Btrfs ████████████████     80%
# ZFS   ███████████████      75%

Feature Comparison

Feature          XFS             ext4            Btrfs          ZFS
Max file size    8 EiB           16 TiB          16 EiB         16 EiB
Max volume       8 EiB           1 EiB           16 EiB         256 ZiB
Snapshots        No              No              Yes            Yes
Compression      No              No              Yes            Yes
Checksums        Metadata only   Metadata only   Yes            Yes
Shrinking        No              Yes (offline)   Yes (online)   No

Best Practices

1. Filesystem Creation

# For general use
mkfs.xfs -m crc=1,finobt=1 /dev/sdb1

# For large files
mkfs.xfs -d agcount=8 -l size=128m /dev/sdb1

# For many files
mkfs.xfs -d agcount=32 -m finobt=1 /dev/sdb1

2. Regular Maintenance

# Weekly fragmentation check (crontab entries)
0 2 * * 0 xfs_db -c frag -r /dev/sdb1

# Monthly defrag if needed
0 3 1 * * xfs_fsr /mnt

# Regular quota checks
0 0 * * * xfs_quota -c "report -h" /mnt

3. Monitoring Script

#!/bin/bash
# XFS health check
MOUNT="/mnt"

# Check fragmentation (strip the trailing % so bc can compare)
DEV=$(findmnt -n -o SOURCE "$MOUNT")
FRAG=$(xfs_db -c frag -r "$DEV" | grep factor | awk '{print $NF}' | tr -d '%')
if (( $(echo "$FRAG > 20" | bc -l) )); then
    echo "High fragmentation: $FRAG%"
fi

# Check space usage
USAGE=$(df -h "$MOUNT" | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$USAGE" -gt 90 ]; then
    echo "High disk usage: $USAGE%"
fi

# Check kernel log for XFS errors
dmesg | grep -i xfs | grep -i error

When to Use XFS

✅ Perfect for:

  • Media servers - Large file streaming
  • Scientific computing - Parallel I/O workloads
  • Databases - With proper tuning
  • Virtual machine storage - Good performance
  • RHEL/CentOS systems - Default and well-supported
  • High-performance computing - Scales with hardware

❌ Consider alternatives for:

  • Root filesystem on desktop - ext4 simpler
  • Need snapshots - Use Btrfs or ZFS
  • Need compression - Use Btrfs or ZFS
  • Small embedded systems - Too heavyweight
  • Need to shrink filesystem - Not supported

XFS Limitations

  1. Cannot shrink - Only grows, plan accordingly
  2. No snapshots - Use LVM or switch to Btrfs/ZFS
  3. No compression - Must handle at application level
  4. 32-bit limitations - Full capacity requires a 64-bit kernel
  5. Recovery limitations - Less robust than ext4's fsck

Future Development

Ongoing Work

  • Reflink support - Copy-on-write file copies (merged in kernel 4.9, enabled by default in recent xfsprogs)
  • Online repair - Fix the filesystem while mounted (experimental)
  • Reverse mapping - Better error reporting
  • Parent pointers - Improved directory operations
  • Y2038 fixes - Beyond 32-bit timestamps

Migration to XFS

From ext4

# No in-place conversion - data must be copied

# Create XFS filesystem
mkfs.xfs /dev/sdc1

# Mount both
mount /dev/sdb1 /mnt/ext4
mount /dev/sdc1 /mnt/xfs

# Copy with rsync (preserves hard links, ACLs, and xattrs)
rsync -avxHAX --progress /mnt/ext4/ /mnt/xfs/

# Or pipe through tar
tar -C /mnt/ext4 -cf - . | tar -C /mnt/xfs -xpf -

# Note: xfsdump only reads XFS filesystems, so it cannot dump the ext4 source

Conclusion

XFS represents the pinnacle of traditional filesystem performance. While it lacks modern features like snapshots and compression, it excels at what it does: handling large files and parallel I/O with exceptional speed and reliability.

For workloads involving large files, streaming media, or parallel operations, XFS is often unmatched. Its maturity, stability, and integration with enterprise Linux distributions make it a safe choice for production systems.

The trade-off is clear: maximum performance and scalability for traditional filesystem operations, but without the advanced features of newer CoW filesystems. For many workloads, especially in enterprise environments, this trade-off is exactly right.
