XFS: High-Performance Filesystem
Explore XFS - the high-performance 64-bit journaling filesystem optimized for large files and parallel I/O. Learn why it excels at handling massive data workloads.
What is XFS?
XFS is a high-performance 64-bit journaling filesystem created by Silicon Graphics (SGI) in 1993 for their IRIX operating system. Ported to Linux in 2001, XFS excels at parallel I/O operations and handling large files, making it the default filesystem for Red Hat Enterprise Linux since RHEL 7.
Think of XFS as the Formula 1 car of filesystems - built for speed, especially when dealing with large files and multiple concurrent operations.
Key Design Goals
XFS was designed with specific objectives:
- Extreme scalability - Support massive files and filesystems
- High performance - Maximize throughput for large operations
- Efficient space usage - Minimize fragmentation
- Parallel operations - Scale with multiple CPUs/cores
XFS Architecture
Allocation Groups (AGs)
XFS divides the filesystem into allocation groups - independent regions that can be managed in parallel:
# XFS Allocation Groups Structure:
# Filesystem (1TB)
# ├── AG 0 (256GB)
# │   ├── Superblock
# │   ├── Free space B+trees
# │   ├── Inode B+trees
# │   └── Data blocks
# ├── AG 1 (256GB)
# ├── AG 2 (256GB)
# └── AG 3 (256GB)
#
# Benefits:
# - Parallel operations across AGs
# - Reduced lock contention
# - Better CPU cache utilization
# - Scalable performance
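The AG layout in the diagram follows from how mkfs.xfs sizes allocation groups. A rough sketch of that heuristic (assumption: 4 AGs by default on a single disk, with a 1 TiB cap per AG, as in current mkfs.xfs):

```shell
# Approximate mkfs.xfs AG sizing: start from 4 AGs, add more if an AG
# would exceed the 1 TiB limit.
ag_count() {
    local fs_bytes=$1
    local tib=$((1024 * 1024 * 1024 * 1024))
    local count=4                                  # default AG count
    if [ $((fs_bytes / count)) -gt "$tib" ]; then
        # An AG may not exceed 1 TiB, so large filesystems get more AGs
        count=$(( (fs_bytes + tib - 1) / tib ))
    fi
    echo "$count"
}

ag_count $((1024 * 1024 * 1024 * 1024))        # 1 TiB -> 4 AGs of 256 GiB
ag_count $((16 * 1024 * 1024 * 1024 * 1024))   # 16 TiB -> 16 AGs of 1 TiB
```

The 1 TiB filesystem case reproduces the diagram above: four AGs of 256 GiB each.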
B+tree Everything
XFS uses B+trees for all metadata:
- Free space management - Two B+trees per AG (indexed by block number and by extent size)
- Inode allocation - B+tree indexes
- Directory entries - B+tree for large directories
- Extent maps - B+tree for file extents
Creating and Managing XFS
Creating XFS Filesystem
# Basic creation
sudo mkfs.xfs /dev/sdb1

# With options: 4 KiB blocks, metadata checksums, free inode btree
sudo mkfs.xfs -L "DataDrive" -b size=4096 -m crc=1,finobt=1 /dev/sdb1

# For RAID arrays (stripe-aligned)
sudo mkfs.xfs -d su=64k,sw=4 /dev/md0   # 64KB stripe unit, 4 data stripes

# For SSDs (mkfs.xfs discards the device by default; add -K to skip)
sudo mkfs.xfs -m crc=1,finobt=1 /dev/nvme0n1

# Large filesystem optimizations
sudo mkfs.xfs -d agcount=32 /dev/sdb    # More AGs for parallelism
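The su/sw values in the RAID example above follow a simple rule: su is the per-disk chunk size and sw is the number of *data* disks (total disks minus parity disks). A small sketch of the derivation:

```shell
# Derive mkfs.xfs stripe parameters from a RAID layout.
# chunk_kb: per-disk chunk size in KiB; parity_disks: 1 for RAID5, 2 for RAID6.
stripe_opts() {
    local chunk_kb=$1 total_disks=$2 parity_disks=$3
    echo "su=${chunk_kb}k,sw=$(( total_disks - parity_disks ))"
}

stripe_opts 64 5 1    # RAID5, 5 disks, 64 KiB chunk  -> su=64k,sw=4
stripe_opts 128 6 2   # RAID6, 6 disks, 128 KiB chunk -> su=128k,sw=4
```

(With mdraid devices, mkfs.xfs usually detects these values automatically; explicit su/sw is mainly needed for hardware RAID.)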
Mount Options
# Basic mount
sudo mount /dev/sdb1 /mnt/data

# Performance options (nobarrier was removed in Linux 4.19; omit it on modern kernels)
sudo mount -o noatime,nodiratime /dev/sdb1 /mnt/data

# For databases
sudo mount -o noatime,nodiratime,largeio,swalloc /dev/sdb1 /mnt/db

# SSD optimizations
sudo mount -o noatime,discard /dev/sdb1 /mnt/ssd

# Large-filesystem optimizations
sudo mount -o noatime,inode64,allocsize=16m /dev/sdb1 /mnt/data
Mount Options Explained
noatime      # Don't update file access times
nodiratime   # Don't update directory access times (implied by noatime)
nobarrier    # Disable write barriers (risky; removed in Linux 4.19)
largeio      # Report a larger preferred I/O size to applications
inode64      # Allow inodes anywhere on disk (default since Linux 3.7)
allocsize=   # Preallocation size for extending writes
swalloc      # Stripe-width aligned allocation
discard      # Enable online TRIM for SSDs
Key Features
1. Extent-based Allocation
XFS uses extents (contiguous blocks) instead of individual blocks:
# Traditional block allocation (ext3):
# File: [block 1000][block 1001][block 1002]...[block 2000]
# Metadata: 1001 entries
#
# XFS extent allocation:
# File: [extent: start=1000, length=1001]
# Metadata: 1 entry
#
# Benefits:
# - Less metadata overhead
# - Better performance for large files
# - Reduced fragmentation
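The savings scale dramatically with file size. Computing the comparison above for a 4 GiB file with 4 KiB blocks:

```shell
# Metadata entries needed to map a 4 GiB file:
# block-mapped: one entry per 4 KiB block; extent-based: one per contiguous run.
file_bytes=$((4 * 1024 * 1024 * 1024))
block_size=4096
block_entries=$(( file_bytes / block_size ))   # one entry per block
extent_entries=1                               # a single contiguous extent
echo "block-mapped: $block_entries entries, extent-based: $extent_entries entry"
```

Over a million mapping entries collapse into one, which is why extent maps fit in the inode itself for most files.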
2. Delayed Allocation
XFS delays block allocation until data is flushed:
# Without delayed allocation:
# 1. write() called → Allocate blocks immediately
# 2. Data in page cache
# 3. Flush to disk → May be fragmented
#
# With delayed allocation (XFS):
# 1. write() called → Reserve space only
# 2. Data accumulated in page cache
# 3. Flush to disk → Allocate contiguous blocks
3. Direct I/O Support
Bypass page cache for database workloads:
/* O_DIRECT bypasses the page cache; the buffer, transfer length and
 * file offset must be aligned to the device's logical sector size. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int fd = open("/mnt/xfs/database.db", O_RDWR | O_DIRECT);

/* Aligned buffer required */
void *buffer;
posix_memalign(&buffer, 512, 4096);

/* Direct read */
read(fd, buffer, 4096);
Performance Features
Parallel I/O
XFS excels at concurrent operations:
# Test parallel write performance
for i in {1..16}; do
    dd if=/dev/zero of=/mnt/xfs/file$i bs=1G count=10 &
done
wait

# XFS handles this efficiently due to:
# - Multiple allocation groups
# - Per-AG locks
# - Parallel metadata operations
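A scaled-down, safe-to-run version of that test (1 MiB per writer instead of 10 GiB, into a temporary directory) shows the same pattern of concurrent writers:

```shell
# Four concurrent writers into a temp directory; on XFS each writer can
# land in a different allocation group with no shared lock.
dir=$(mktemp -d)
for i in 1 2 3 4; do
    dd if=/dev/zero of="$dir/file$i" bs=1M count=1 status=none &
done
wait                      # block until all background writers finish
ls "$dir" | wc -l         # all four files written concurrently
rm -rf "$dir"
```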
Preallocation
Reserve space for files:
# Preallocate space
fallocate -l 10G /mnt/xfs/largefile

# Or using xfs_io
xfs_io -f -c "falloc 0 10g" /mnt/xfs/largefile

# Benefits:
# - Guaranteed contiguous space
# - No fragmentation
# - Prevents ENOSPC during writes
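fallocate itself is not XFS-specific; any filesystem with preallocation support (XFS, ext4, recent tmpfs) honors it, so the effect is easy to demo on a throwaway file:

```shell
# Preallocate 1 MiB on a temp file and confirm the size immediately;
# the blocks are reserved without writing any data.
f=$(mktemp)
fallocate -l 1M "$f"
stat -c %s "$f"       # file size is now exactly 1 MiB (1048576 bytes)
rm -f "$f"
```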
Real-time Subvolume
Separate device for guaranteed bandwidth:
# Create with real-time subvolume
sudo mkfs.xfs -r rtdev=/dev/sdc1 /dev/sdb1

# Mount with real-time device
sudo mount -o rtdev=/dev/sdc1 /dev/sdb1 /mnt

# Mark a new, still-empty file as real-time (must be set before data is written)
xfs_io -f -c "chattr +r" /mnt/realtime-file
XFS Tools and Utilities
xfs_info - Filesystem Information
# Show filesystem information
sudo xfs_info /mnt

# Output includes:
# - Block size and count
# - AG count and size
# - Inode information
# - Log information
# - Realtime configuration
xfs_growfs - Expand Filesystem
# Grow filesystem to fill partition
sudo xfs_growfs /mnt

# Grow to a specific size (in filesystem blocks)
sudo xfs_growfs -D 1000000 /mnt

# Note: XFS can only grow, not shrink!
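The -D argument is easy to get wrong because it is counted in filesystem blocks, not bytes. Converting a target size to blocks (assuming the common 4 KiB block size; check yours with xfs_info):

```shell
# Convert a target size of 500 GiB into filesystem blocks for xfs_growfs -D.
target_gib=500
block_size=4096                                 # from xfs_info "bsize"
blocks=$(( target_gib * 1024 * 1024 * 1024 / block_size ))
echo "xfs_growfs -D $blocks /mnt"
```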
xfs_repair - Filesystem Repair
# Check filesystem (read-only)
sudo xfs_repair -n /dev/sdb1

# Repair filesystem (must be unmounted)
sudo umount /mnt
sudo xfs_repair /dev/sdb1

# Force repair by zeroing the log (dangerous - may lose recent metadata!)
sudo xfs_repair -L /dev/sdb1

# Repair with memory limit (in MB)
sudo xfs_repair -m 2048 /dev/sdb1
xfs_db - Debug Filesystem
# Open filesystem debugger
sudo xfs_db /dev/sdb1

# Commands in xfs_db:
xfs_db> sb 0      # Show superblock
xfs_db> frag      # Show fragmentation
xfs_db> freesp    # Show free space
xfs_db> quit
xfs_fsr - Defragmentation
# Defragment filesystem
sudo xfs_fsr /mnt

# Defragment specific file
sudo xfs_fsr /mnt/fragmented-file

# Verbose output
sudo xfs_fsr -v /mnt
xfs_freeze - Freeze Filesystem
# Freeze filesystem (for snapshots)
sudo xfs_freeze -f /mnt

# Unfreeze
sudo xfs_freeze -u /mnt

# Use case: LVM snapshots
sudo xfs_freeze -f /mnt
sudo lvcreate -s -n snapshot -L 10G /dev/vg/lv
sudo xfs_freeze -u /mnt
Performance Optimization
For Large Files
# Mount options for media servers
mount -o allocsize=1g,largeio /dev/sdb1 /mnt/media

# Increase readahead (in 512-byte sectors)
blockdev --setra 4096 /dev/sdb1

# Use extent size hints (set on an empty file, before data is written)
xfs_io -c "extsize 1g" /mnt/media/video.mp4
For Databases
# Create optimized filesystem
mkfs.xfs -b size=4096 -d agcount=16 /dev/sdb1

# Mount with database-friendly options
mount -o noatime,nodiratime,logbufs=8 /dev/sdb1 /mnt/db

# Set extent size for database files
xfs_io -c "extsize 64k" /mnt/db/table.ibd
For Many Small Files
# More allocation groups
mkfs.xfs -d agcount=64 /dev/sdb1

# Enable free inode B+tree
mkfs.xfs -m finobt=1 /dev/sdb1

# Mount options
mount -o noatime,inode64 /dev/sdb1 /mnt
XFS Quotas
Project Quotas
Unique to XFS - directory tree quotas:
# Enable quotas at mount
mount -o prjquota /dev/sdb1 /mnt

# Create project
echo "10:/mnt/project1" >> /etc/projects
echo "project1:10" >> /etc/projid

# Initialize project
xfs_quota -x -c "project -s project1" /mnt

# Set limits
xfs_quota -x -c "limit -p bsoft=5g bhard=10g project1" /mnt

# Check usage
xfs_quota -c "report -h" /mnt
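The two config files involved are simple colon-delimited maps: /etc/projects maps a numeric project ID to a directory tree, and /etc/projid maps a human-readable name to that ID. A sketch of the format, written to a temp directory instead of /etc so it is safe to run anywhere:

```shell
# Build the project-quota config files in a scratch dir (stand-in for /etc).
etc=$(mktemp -d)
echo "10:/mnt/project1" >> "$etc/projects"   # project ID -> directory tree
echo "project1:10"      >> "$etc/projid"     # project name -> project ID
# xfs_quota resolves the name "project1" through both files:
cat "$etc/projects" "$etc/projid"
rm -rf "$etc"
```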
User and Group Quotas
# Enable quotas
mount -o usrquota,grpquota /dev/sdb1 /mnt

# Set user quota
xfs_quota -x -c "limit -u bsoft=5g bhard=10g alice" /mnt

# Set group quota
xfs_quota -x -c "limit -g bsoft=50g bhard=100g developers" /mnt

# Report quotas
xfs_quota -c "report -h" /mnt
Backup and Recovery
xfsdump and xfsrestore
Native backup tools for XFS:
# Full backup
sudo xfsdump -f /backup/full.dump -L "Full Backup" -M "Tape1" /mnt

# Incremental backup
sudo xfsdump -l 1 -f /backup/incr.dump -L "Incremental" /mnt

# Restore full backup
sudo xfsrestore -f /backup/full.dump /mnt/restore

# Interactive restore
sudo xfsrestore -i -f /backup/full.dump /mnt/restore

# List contents
sudo xfsrestore -t -f /backup/full.dump
Metadata Dumps
# Dump metadata for analysis
xfs_metadump /dev/sdb1 metadata.dump

# Restore metadata (testing only!)
xfs_mdrestore metadata.dump /dev/sdc1
Troubleshooting XFS
Common Issues
1. "No space left" with free space
# Check inode usage
df -i /mnt

# Check for deleted but open files
lsof +L1 /mnt

# Clear reserved blocks
xfs_io -x -c "resblks 0" /mnt
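For monitoring, the df -i check is easy to script: the fifth column of the POSIX output (df -Pi) is the inode-usage percentage. A sketch, with the mount point and 90% threshold as example values:

```shell
# Warn when inode usage on a mount point crosses a threshold.
# df -Pi columns: filesystem, inodes, used, free, use%, mount point.
mount_point=/
pct=$(df -Pi "$mount_point" | awk 'NR==2 { gsub(/%/, "", $5); print $5 }')
if [ "$pct" -gt 90 ]; then
    echo "inode usage critical on $mount_point: ${pct}%"
fi
```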
2. Mount fails after crash
# Try mounting without log recovery (read-only)
mount -o ro,norecovery /dev/sdb1 /mnt

# Zero the log if it is corrupted (last resort)
xfs_repair -L /dev/sdb1

# Then mount normally
mount /dev/sdb1 /mnt
3. Poor performance
# Check fragmentation
xfs_db -c frag -r /dev/sdb1

# Defragment if needed
xfs_fsr -v /mnt

# Check the allocation group count
xfs_info /mnt | grep agcount
XFS vs Other Filesystems
Performance Comparison
# Performance Comparison:
#
# Large Sequential Writes:
# XFS   ████████████████████ 100%
# ext4  ███████████████████   95%
# Btrfs ████████████████      80%
# ZFS   ███████████████       75%
#
# Parallel I/O:
# XFS   ████████████████████ 100%
# ext4  ████████████████      80%
# Btrfs ███████████████       75%
# ZFS   ██████████████        70%
#
# Metadata Operations:
# ext4  ████████████████████ 100%
# XFS   ██████████████████    90%
# Btrfs ████████████████      80%
# ZFS   ███████████████       75%
Feature Comparison
| Feature       | XFS           | ext4                | Btrfs  | ZFS     |
|---------------|---------------|---------------------|--------|---------|
| Max file size | 8 EiB         | 16 TiB              | 16 EiB | 16 EiB  |
| Max volume    | 8 EiB         | 1 EiB               | 16 EiB | 256 ZiB |
| Snapshots     | ✗             | ✗                   | ✓      | ✓       |
| Compression   | ✗             | ✗                   | ✓      | ✓       |
| Checksums     | Metadata only | Metadata (optional) | ✓      | ✓       |
| Shrinking     | ✗             | ✓                   | ✓      | ✗       |
| Performance   | ████          | ███                 | ███    | ███     |
Best Practices
1. Filesystem Creation
# For general use
mkfs.xfs -m crc=1,finobt=1 /dev/sdb1

# For large files
mkfs.xfs -d agcount=8 -l size=128m /dev/sdb1

# For many files
mkfs.xfs -d agcount=32 -m finobt=1 /dev/sdb1
2. Regular Maintenance
# Weekly fragmentation check (Sunday 02:00)
0 2 * * 0 xfs_db -c frag -r /dev/sdb1

# Monthly defrag if needed (1st of the month, 03:00)
0 3 1 * * xfs_fsr /mnt

# Daily quota report (midnight)
0 0 * * * xfs_quota -c "report -h" /mnt
3. Monitoring Script
#!/bin/bash
# XFS health check
MOUNT="/mnt"

# Check fragmentation (strip the trailing % before comparing)
DEV=$(findmnt -n -o SOURCE "$MOUNT")
FRAG=$(xfs_db -c frag -r "$DEV" | grep factor | awk '{gsub(/%/, "", $NF); print $NF}')
if (( $(echo "$FRAG > 20" | bc -l) )); then
    echo "High fragmentation: $FRAG%"
fi

# Check space usage
USAGE=$(df -h "$MOUNT" | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$USAGE" -gt 90 ]; then
    echo "High disk usage: $USAGE%"
fi

# Check the kernel log for XFS errors
dmesg | grep -i xfs | grep -i error
When to Use XFS
✅ Perfect for:
- Media servers - Large file streaming
- Scientific computing - Parallel I/O workloads
- Databases - With proper tuning
- Virtual machine storage - Good performance
- RHEL/CentOS systems - Default and well-supported
- High-performance computing - Scales with hardware
❌ Consider alternatives for:
- Root filesystem on desktop - ext4 simpler
- Need snapshots - Use Btrfs or ZFS
- Need compression - Use Btrfs or ZFS
- Small embedded systems - Too heavyweight
- Need to shrink filesystem - Not supported
XFS Limitations
- Cannot shrink - Only grows, plan accordingly
- No snapshots - Use LVM or switch to Btrfs/ZFS
- No compression - Must handle at application level
- 32-bit limitations - Large filesystems require a 64-bit kernel
- Recovery limitations - Less robust than ext4's fsck
Future Development
Ongoing Work
- Reflink support - Copy-on-write file copies (merged in Linux 4.9)
- Online repair - Fix filesystem while mounted
- Reverse mapping - Better error reporting
- Parent pointers - Improved directory operations
- Y2038 fixes - Beyond 32-bit timestamps
Migration to XFS
From ext4
# No in-place conversion; data must be copied

# Create XFS filesystem
mkfs.xfs /dev/sdc1

# Mount both
mount /dev/sdb1 /mnt/ext4
mount /dev/sdc1 /mnt/xfs

# Copy with rsync (preserves permissions, hard links, ACLs, xattrs)
rsync -avxHAX --progress /mnt/ext4/ /mnt/xfs/

# Note: xfsdump/xfsrestore cannot be used here - xfsdump
# only reads XFS filesystems, not the ext4 source
Conclusion
XFS represents the pinnacle of traditional filesystem performance. While it lacks modern features like snapshots and compression, it excels at what it does: handling large files and parallel I/O with exceptional speed and reliability.
For workloads involving large files, streaming media, or parallel operations, XFS is often unmatched. Its maturity, stability, and integration with enterprise Linux distributions make it a safe choice for production systems.
The trade-off is clear: maximum performance and scalability for traditional filesystem operations, but without the advanced features of newer CoW filesystems. For many workloads, especially in enterprise environments, this trade-off is exactly right.