ext4: The Linux Workhorse Filesystem

Deep dive into ext4 (fourth extended filesystem) - the default filesystem for most Linux distributions. Learn about journaling, extents, and why ext4 remains the reliable choice.

The ext4 Story: Why Boring is Beautiful

Picture this: It's 3 AM, your server just crashed, and you're frantically rebooting. Will your data be there? Will the filesystem be corrupted? With ext4, you can breathe easy. This is the filesystem that millions trust with their data every single day.

ext4 isn't trying to win any innovation awards. It's not the fastest (that's XFS), not the most feature-rich (hello ZFS), and definitely not the most modern (looking at you, Btrfs). But here's the thing—ext4 is the filesystem that just works. It's been battle-tested for over 15 years, handling everything from tiny Raspberry Pis to massive enterprise servers.

Think of ext4 as the Toyota Camry of filesystems. It won't turn heads at a car show, but it'll reliably get you to work every day for the next 200,000 miles without breaking a sweat.

Evolution from ext2 → ext3 → ext4

# Evolution of ext filesystems:
#
# ext2 (1993)
# ├─ Basic Unix filesystem
# ├─ No journaling
# └─ Fast but risky
#
# ext3 (2001)
# ├─ Added journaling
# ├─ Backward compatible with ext2
# └─ Safer but slower
#
# ext4 (2008)
# ├─ Extents instead of block maps
# ├─ Larger filesystem/file sizes
# ├─ Better performance
# └─ Delayed allocation

Key Features of ext4

1. Journaling: Your Data's Safety Net

The Problem: Imagine you're updating a file when the power suddenly goes out. Without journaling, the filesystem could be left in an inconsistent state, with half-written data and corrupted metadata, and recovery could mean hours of fsck scanning the whole disk.

The Solution: ext4's journal acts like a transaction log. Before modifying on-disk structures in place, ext4 first records the change in the journal (metadata only in the default mode, file data as well in data=journal mode). If the system crashes, ext4 replays the journal at the next mount, so recovery typically takes seconds, not hours.

[Interactive visualization: Journaling Process. Stages: Idle → Write to Journal → Write Data Blocks → Commit Transaction → Checkpoint → Complete. Panels: Journal Log, Page Cache (dirty pages), Storage Device (Journal Area, Data Blocks).]

Crash Recovery with Journaling

Without Journaling:

  • Full filesystem check (fsck) required
  • Can take hours on large filesystems
  • Possible data loss or corruption

With ext4 Journaling:

  • Replay journal on mount (~seconds)
  • Guaranteed filesystem consistency
  • Minimal to no data loss
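
The replay idea can be illustrated with a toy write-ahead journal built from ordinary files. This is only a sketch of the principle, not how ext4's journal (jbd2) is implemented; the directory layout and the function names journal_write, checkpoint, and recover are invented for the example.

```shell
# Toy write-ahead journal (illustration only; hypothetical helper names).
# Assumed layout: <dir>/journal holds transactions, <dir>/data holds final files.

# Record a change in the journal, then commit it with an atomic rename.
journal_write() {   # usage: journal_write <dir> <name> <content>
    printf '%s\n' "$3" > "$1/journal/$2.pending"
    mv "$1/journal/$2.pending" "$1/journal/$2.committed"
}

# Copy committed transactions to their final location, then drop them
# from the journal.
checkpoint() {      # usage: checkpoint <dir>
    for entry in "$1"/journal/*.committed; do
        [ -e "$entry" ] || continue          # the glob matched nothing
        name=$(basename "$entry" .committed)
        cp "$entry" "$1/data/$name"
        rm "$entry"
    done
}

# Run after a crash: discard incomplete transactions, replay committed ones.
recover() {         # usage: recover <dir>
    rm -f "$1"/journal/*.pending
    checkpoint "$1"
}
```

A transaction that reached the .committed state before the crash is replayed by recover; one that was still .pending is discarded, so the data directory never sees a half-written change.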

Understanding Journal Modes

ext4 offers three journaling modes, each with different trade-offs:

# Check current journal settings
sudo dumpe2fs /dev/sda1 | grep -i journal

# Journal modes explained:
# 1. journal (data=journal)
#    - Both data AND metadata journaled
#    - Safest but slowest (everything written twice)
#    - Use for: Critical databases, financial data
# 2. ordered (data=ordered) - DEFAULT
#    - Only metadata journaled, but data written first
#    - Good balance of safety and performance
#    - Use for: General purpose, most workloads
# 3. writeback (data=writeback)
#    - Only metadata journaled, data can be written anytime
#    - Fastest, but files may contain stale data after a crash
#    - Use for: Scratch disks, temporary data

# Set the default journal mode stored in the superblock (applies from the next mount)
sudo tune2fs -o journal_data /dev/sda1            # Full journaling
sudo tune2fs -o journal_data_ordered /dev/sda1    # Ordered (default)
sudo tune2fs -o journal_data_writeback /dev/sda1  # Writeback
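
To see which mode a mounted filesystem is actually using, the data= option can be read from its mount options. A minimal sketch, assuming the option string comes from /proc/mounts or /proc/fs/ext4/<device>/options, and assuming that a missing data= entry means the default ordered mode; journal_mode_from_opts is a hypothetical helper name.

```shell
# Print the data journaling mode found in a mount option string.
# Accepts comma-separated options (/proc/mounts style) or one option per line.
journal_mode_from_opts() {
    mode=$(printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^data=//p' | head -n 1)
    # Assumption: no explicit data= option means the default, ordered.
    echo "${mode:-ordered}"
}

# Example with a mounted filesystem (the mount point /mnt is a placeholder):
# journal_mode_from_opts "$(awk '$2 == "/mnt" {print $4}' /proc/mounts)"
```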

2. Extents

Instead of tracking individual blocks, ext4 uses extents: each extent describes a run of contiguous blocks with a start block and a length.

# Traditional (ext3): Block map for 100MB file
# Block 1000 → data
# Block 1001 → data
# Block 1002 → data
# ... (25,600 entries for 100MB with 4KB blocks)
#
# ext4: Extent for 100MB file
# Extent: Start block 1000, length 25,600 blocks
# (1 entry instead of 25,600!)
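
The arithmetic behind this comparison can be checked with a few lines of shell. This is a sketch: the helper names are made up, and the 32,768-block cap is the limit for a single initialized ext4 extent, so files larger than 128 MiB (with 4 KB blocks) need more than one extent even when fully contiguous.

```shell
# Number of blocks needed for a file, rounded up.
blocks_for_size() {    # usage: blocks_for_size <bytes> <block_size>
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Fewest extents that can map that many contiguous blocks.
min_extents() {        # usage: min_extents <blocks>
    max_len=32768      # blocks covered by one initialized extent
    echo $(( ($1 + max_len - 1) / max_len ))
}

blocks=$(blocks_for_size $((100 * 1024 * 1024)) 4096)
echo "block-map entries: $blocks"                     # one entry per block
echo "extents (best case): $(min_extents "$blocks")"
```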

Benefits:

  • Less metadata overhead
  • Improved large file performance
  • Reduced fragmentation
  • Faster file deletion

3. Delayed Allocation

ext4 delays block allocation until data is flushed to disk:

# Without delayed allocation:
# 1. Application writes data
# 2. Blocks allocated immediately
# 3. Data may be scattered if disk is fragmented
#
# With delayed allocation:
# 1. Application writes data
# 2. Data buffered in memory
# 3. Blocks allocated when flushing
# 4. Better chance of contiguous allocation

4. Large File Support

  • Maximum file size: 16 TiB (with 4KB blocks)
  • Maximum volume size: 1 EiB
  • Maximum filename length: 255 bytes
  • Maximum path length: 4096 bytes (Linux PATH_MAX, a kernel limit rather than an ext4 on-disk limit)
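
The size limits follow from field widths in the on-disk format. A rough sketch, assuming 32-bit logical block numbers within a file and 48-bit physical block numbers on a filesystem with the 64bit feature; the function names are illustrative and the shell is assumed to use 64-bit arithmetic.

```shell
# Results are in TiB (1 TiB = 2^40 bytes).
max_file_tib() {       # usage: max_file_tib <block_size>
    echo $(( ((1 << 32) * $1) >> 40 ))
}

max_volume_tib() {     # usage: max_volume_tib <block_size>
    echo $(( ((1 << 48) * $1) >> 40 ))
}

echo "max file:   $(max_file_tib 4096) TiB"
echo "max volume: $(max_volume_tib 4096) TiB"   # 2^20 TiB = 1 EiB
```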

ext4 Architecture

Block Groups: The Building Blocks of ext4

The Challenge: How do you organize billions of blocks efficiently? How do you enable parallel operations? How do you survive disk corruption?

The Solution: ext4 divides the entire filesystem into self-contained units called Block Groups. Each group is like a mini-filesystem with its own metadata and data blocks. This brilliant design enables parallelism, fault isolation, and efficient disk usage.

[Interactive diagram: ext4 Block Groups Structure. Eight sample block groups with their usage: BG 0 45%, BG 1 62%, BG 2 38%, BG 3 71%, BG 4 25%, BG 5 89%, BG 6 54%, BG 7 41%.]

Block Group 0 Structure (diagram proportions, not to scale)

  • Superblock: 1%
  • Group Descriptors: 1%
  • Block Bitmap: 1%
  • Inode Bitmap: 1%
  • Inode Table: 8%
  • Data Blocks: 88%

Key Features

  • Redundant Superblocks: The primary copy lives in group 0, with backup copies in group 1 and in groups that are powers of 3, 5, and 7
  • Flex Block Groups: Groups metadata together for better locality (typically 16 groups)
  • Locality Optimization: Keeps related files in the same block group for faster access
  • Inode Allocation: Each group has its own inode table for parallel operations

Typical values with default settings:

  • Block Group Size: 128 MB
  • Blocks per Group: 32,768
  • Inodes per Group: 8,192
  • Default Block Size: 4 KB

Why Block Groups Matter

  • Parallelism: Multiple processes can allocate from different groups simultaneously
  • Fault Isolation: Damage to one group's metadata is less likely to spread to other groups
  • Reduced Seeking: Related files stay physically close on disk
  • Efficient fsck: e2fsck can skip the uninitialized parts of a group (uninit_bg), which shortens checks

Understanding the Structure

Each block group contains:

  1. Superblock (Group 0, 1, and powers of 3, 5, 7): Critical filesystem metadata
  2. Group Descriptors: Information about all block groups
  3. Block Bitmap: Tracks free/used data blocks (1 bit per block)
  4. Inode Bitmap: Tracks free/used inodes (1 bit per inode)
  5. Inode Table: Actual inode structures
  6. Data Blocks: Where your actual file content lives

# View block group information
sudo dumpe2fs /dev/sda1 | head -50

# See block group layout
sudo debugfs /dev/sda1
debugfs: stats
# Shows groups, free blocks, free inodes, directories
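
The default group geometry follows from the bitmap size: a block bitmap occupies a single block, and each bit tracks one block. A sketch of that arithmetic, assuming 4 KB blocks and the usual mke2fs ratio of one inode per 16 KB of space (the inode ratio and the function names are assumptions, not values read from a real filesystem):

```shell
# One bitmap block with 8 bits per byte decides how many blocks a group holds.
blocks_per_group() {   # usage: blocks_per_group <block_size>
    echo $(( $1 * 8 ))
}

group_size_mib() {     # usage: group_size_mib <block_size>
    echo $(( $(blocks_per_group "$1") * $1 / 1024 / 1024 ))
}

inodes_per_group() {   # usage: inodes_per_group <block_size> <bytes_per_inode>
    echo $(( $(blocks_per_group "$1") * $1 / $2 ))
}

echo "blocks per group: $(blocks_per_group 4096)"
echo "group size:       $(group_size_mib 4096) MiB"
echo "inodes per group: $(inodes_per_group 4096 16384)"
```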

Flex Block Groups

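With the flex_bg feature, ext4 treats several adjacent block groups as one larger logical group and stores their bitmaps and inode tables together: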

# Check flex_bg setting
sudo dumpe2fs /dev/sda1 | grep -i flex
# Flex block group size: 16

# Benefits:
# - Larger contiguous free space
# - Better locality for metadata
# - Improved performance

Creating and Managing ext4

Creating ext4 Filesystem

# Basic creation
sudo mkfs.ext4 /dev/sdb1

# With options:
#   -L  label
#   -m  reserved blocks (1%)
#   -O  enable features
#   -E  RAID optimization
sudo mkfs.ext4 -L "MyData" \
    -m 1 \
    -O extent,flex_bg \
    -E stride=32,stripe_width=64 \
    /dev/sdb1

# For SSDs (-E discard issues TRIM for the whole device at creation time)
sudo mkfs.ext4 -O extent,flex_bg,uninit_bg \
    -E discard \
    /dev/sdb1
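
The stride and stripe width values are derived from the RAID layout rather than chosen freely. A small sketch of the calculation, assuming a 128 KB chunk size, 4 KB filesystem blocks, and two data-bearing disks (values chosen to reproduce the 32 and 64 used above; the function names are illustrative):

```shell
# stride: filesystem blocks per RAID chunk
raid_stride() {         # usage: raid_stride <chunk_kib> <block_kib>
    echo $(( $1 / $2 ))
}

# stripe width: filesystem blocks per full stripe across the data disks
raid_stripe_width() {   # usage: raid_stripe_width <stride> <data_disks>
    echo $(( $1 * $2 ))
}

stride=$(raid_stride 128 4)
width=$(raid_stripe_width "$stride" 2)
printf '%s\n' "-E stride=$stride,stripe_width=$width"
```

Parity disks are not counted: a three-disk RAID 5 array, for example, has two data-bearing disks per stripe.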

Tuning ext4

# View current settings
sudo tune2fs -l /dev/sdb1

# Change reserved block percentage (default 5%)
sudo tune2fs -m 1 /dev/sdb1

# Set maximum mount count before fsck
sudo tune2fs -c 50 /dev/sdb1

# Set check interval (days)
sudo tune2fs -i 180 /dev/sdb1

# Enable/disable features
sudo tune2fs -O has_journal /dev/sdb1    # Enable journaling
sudo tune2fs -O ^has_journal /dev/sdb1   # Disable journaling (filesystem must be unmounted)
sudo tune2fs -O extent /dev/sdb1         # Enable extents

# Set label
sudo tune2fs -L "BackupDrive" /dev/sdb1
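
The effect of the -m setting is easy to estimate. A sketch, assuming a 1 TiB filesystem with 4 KB blocks (268,435,456 blocks); reserved_blocks is a hypothetical helper, and tune2fs may round slightly differently.

```shell
reserved_blocks() {   # usage: reserved_blocks <total_blocks> <percent>
    echo $(( $1 * $2 / 100 ))
}

total=268435456       # 1 TiB / 4 KiB
for pct in 5 1; do
    blocks=$(reserved_blocks "$total" "$pct")
    # 4 KiB per block, converted to whole GiB
    echo "$pct% reserved: $blocks blocks (about $(( blocks * 4 / 1024 / 1024 )) GiB)"
done
```

On a volume this size, dropping from 5% to 1% frees roughly 40 GiB for ordinary users.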

Mount Options

# Performance options
mount -o noatime,nodiratime /dev/sdb1 /mnt   # Don't update access times
mount -o data=writeback /dev/sdb1 /mnt       # Fastest journaling mode
mount -o nobarrier /dev/sdb1 /mnt            # Disable write barriers (risky)
mount -o commit=60 /dev/sdb1 /mnt            # Sync every 60 seconds

# Security options
mount -o noexec,nosuid /dev/sdb1 /mnt        # Prevent execution, setuid

# For SSDs
mount -o discard /dev/sdb1 /mnt              # Enable TRIM

# Error handling
mount -o errors=remount-ro /dev/sdb1 /mnt    # Remount read-only on errors
mount -o errors=panic /dev/sdb1 /mnt         # Kernel panic on errors

ext4 Performance Optimization

For HDDs

# Enable readahead (value is in 512-byte sectors)
blockdev --setra 4096 /dev/sdb1

# Optimize for large files (delalloc is already the default)
mount -o delalloc /dev/sdb1 /mnt

# Directory indexing for many files
tune2fs -O dir_index /dev/sdb1

For SSDs

# Disable journal (risky but faster; filesystem must be unmounted)
tune2fs -O ^has_journal /dev/sdb1

# Enable continuous TRIM
mount -o discard /dev/sdb1 /mnt

# Or periodic TRIM
sudo fstrim -v /

# Align to erase blocks
mkfs.ext4 -E stride=128,stripe_width=128 /dev/sdb1

For Databases

# Disable access time updates
mount -o noatime,nodiratime /dev/sdb1 /mnt

# Increase journal size to 400 MB
# (-J applies when a journal is created; remove an existing journal first)
tune2fs -J size=400 /dev/sdb1

# Use data=journal for consistency
mount -o data=journal /dev/sdb1 /mnt

Maintenance and Troubleshooting

Filesystem Check

# Check filesystem (must be unmounted)
sudo umount /dev/sdb1
sudo e2fsck /dev/sdb1

# Force check even if clean
sudo e2fsck -f /dev/sdb1

# Automatic repair
sudo e2fsck -p /dev/sdb1

# Verbose check
sudo e2fsck -v /dev/sdb1

# Check for bad blocks
sudo e2fsck -c /dev/sdb1

Resize Operations

# Shrink filesystem (must be unmounted)
sudo umount /dev/sdb1
sudo e2fsck -f /dev/sdb1
sudo resize2fs /dev/sdb1 50G

# Grow filesystem (can be online)
sudo resize2fs /dev/sdb1        # Grow to partition size
sudo resize2fs /dev/sdb1 100G   # Grow to specific size

Defragmentation

# Check fragmentation
sudo e4defrag -c /dev/sdb1

# Defragment filesystem
sudo e4defrag /dev/sdb1

# Defragment specific file
sudo e4defrag /path/to/file

# Defragment the files under a directory (no extra flag is needed)
sudo e4defrag /path/to/directory

Backup Superblock

# Find backup superblock locations
sudo dumpe2fs /dev/sdb1 | grep -i superblock

# Restore from backup superblock
sudo e2fsck -b 32768 /dev/sdb1
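
The backup locations follow a fixed pattern, so they can be predicted even when dumpe2fs cannot read the filesystem. A sketch, assuming the sparse_super layout (backups in group 1 and in groups that are powers of 3, 5, or 7) and 4 KB blocks with 32,768 blocks per group; the function names are illustrative.

```shell
# Succeeds if n is base^1, base^2, base^3, ...
is_power_of() {             # usage: is_power_of <n> <base>
    n=$1
    base=$2
    [ "$n" -ge "$base" ] || return 1
    while [ $(( n % base )) -eq 0 ]; do
        n=$(( n / base ))
    done
    [ "$n" -eq 1 ]
}

# Succeeds if block group g carries a superblock copy.
has_superblock() {          # usage: has_superblock <group>
    g=$1
    if [ "$g" -le 1 ]; then
        return 0
    fi
    is_power_of "$g" 3 || is_power_of "$g" 5 || is_power_of "$g" 7
}

# List the groups below <total_groups> that carry a superblock copy.
superblock_groups() {       # usage: superblock_groups <total_groups>
    out=""
    i=0
    while [ "$i" -lt "$1" ]; do
        if has_superblock "$i"; then
            out="$out $i"
        fi
        i=$(( i + 1 ))
    done
    echo "${out# }"
}

# Block number to pass to e2fsck -b (assumes 4 KB blocks).
superblock_block() {        # usage: superblock_block <group>
    echo $(( $1 * 32768 ))
}

for grp in $(superblock_groups 30); do
    echo "group $grp -> block $(superblock_block "$grp")"
done
```

Group 0 holds the primary superblock; group 1 maps to block 32768, the value used with e2fsck -b above.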

ext4 Limitations and Considerations

Limitations

  • No built-in snapshots (unlike Btrfs/ZFS)
  • No built-in compression (unlike Btrfs/ZFS)
  • No built-in RAID (use mdadm/LVM instead)
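  • Limited to about 4 billion files per filesystem (32-bit inode numbers), with the inode count fixed when the filesystem is created
  • No checksums for file data (metadata checksums exist, but silent corruption of file contents goes undetected)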

When to Use ext4

Perfect for:

  • Root filesystem
  • General purpose storage
  • Production servers (proven reliability)
  • When you need stability over features

Consider alternatives for:

  • Large storage pools (→ ZFS)
  • Need snapshots (→ Btrfs/ZFS)
  • Need compression (→ Btrfs/ZFS)
  • Cross-platform (→ exFAT/NTFS)

ext4 vs Other Filesystems

Quick Comparison

Feature          ext4    XFS     Btrfs   ZFS
────────────────────────────────────────────
Maturity         ████    ████    ███     ████
Performance      ████    █████   ███     ███
Features         ███     ███     █████   █████
Complexity       ██      ██      ████    █████
Resource Usage   ██      ██      ███     ████

Performance Benchmarks

# Sequential Write (1GB file)
ext4:  520 MB/s ████████████████████
XFS:   540 MB/s █████████████████████
Btrfs: 480 MB/s ██████████████████
ZFS:   460 MB/s █████████████████

# Random 4K Writes
ext4:  180 MB/s ████████████████████
XFS:   165 MB/s ██████████████████
Btrfs: 140 MB/s ███████████████
ZFS:   120 MB/s █████████████

Advanced ext4 Features

Inline Data

Small files stored directly in inode:

# Enable inline data (set at creation time; tune2fs cannot turn it on later)
mkfs.ext4 -O inline_data /dev/sdb1

# Benefits:
# - Saves space for tiny files
# - Faster access (no extra block read)
# - Less fragmentation

Bigalloc

Cluster multiple blocks together:

# Create with bigalloc (cluster size = 64KB)
mkfs.ext4 -O bigalloc -C 65536 /dev/sdb1

# Benefits:
# - Better for large files
# - Reduced metadata overhead
# Warning: Wastes space with small files
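
The space cost for small files comes from rounding every allocation up to a whole cluster. A minimal sketch of that rounding, assuming the 64 KB cluster size used above; allocated_bytes is a hypothetical helper that ignores inline data and metadata.

```shell
# Space consumed by a file when allocation happens in whole clusters.
allocated_bytes() {   # usage: allocated_bytes <file_size> <cluster_size>
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

for size in 1024 70000 1048576; do
    echo "$size-byte file occupies $(allocated_bytes "$size" 65536) bytes"
done
```

A 1 KB file still occupies a full 64 KB cluster, while a 1 MB file fits exactly into 16 clusters with no waste.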

Metadata Checksums

# Enable metadata checksums
tune2fs -O metadata_csum /dev/sdb1

# Protects against:
# - Corruption in metadata
# - Bit flips in storage

Lazy Initialization

# Create filesystem with lazy init
mkfs.ext4 -E lazy_itable_init=1,lazy_journal_init=1 /dev/sdb1

# Benefits:
# - Faster filesystem creation
# - Background initialization

Monitoring ext4

Real-time Statistics

# View mount options and statistics
cat /proc/fs/ext4/sdb1/options
cat /proc/fs/ext4/sdb1/mb_groups

# Monitor journal activity
watch -n 1 'cat /proc/fs/jbd2/sdb1-8/info'

# Check lifetime writes (useful for SSDs)
cat /sys/fs/ext4/sdb1/lifetime_write_kbytes

Performance Monitoring

# I/O statistics
iostat -x 1 /dev/sdb1

# Per-process I/O activity (only processes doing I/O)
iotop -o

# Cache statistics
grep -E "Cached|Buffers" /proc/meminfo

Recovery Techniques

Recover Deleted Files

# Use extundelete
sudo extundelete /dev/sdb1 --restore-file path/to/file

# Recover all deleted files
sudo extundelete /dev/sdb1 --restore-all

# Use debugfs (-w opens the filesystem read-write, which undel requires)
sudo debugfs -w /dev/sdb1
debugfs: lsdel
debugfs: undel <inode> path/to/restore

Fix Corrupted Filesystem

# Try automatic repair first
sudo e2fsck -p /dev/sdb1

# If that fails, answer "yes" to every repair prompt
sudo e2fsck -y /dev/sdb1

# Last resort - rebuild journal
sudo tune2fs -O ^has_journal /dev/sdb1
sudo tune2fs -O has_journal /dev/sdb1

Best Practices

  1. Regular Backups: ext4 is reliable but not indestructible
  2. Use Appropriate Journal Mode: Balance safety vs performance
  3. Monitor Free Space: Keep 10-20% free for performance
  4. Regular Maintenance: Run e2fsck periodically
  5. Align Partitions: Especially important for SSDs
  6. Use Labels/UUIDs: More reliable than device names
  7. Test Recovery: Practice recovery procedures before you need them
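
Free-space monitoring (item 3) is simple to script. A minimal sketch, assuming POSIX df -P output (1024-byte blocks, with total and available space in columns 2 and 4) and a 10% threshold taken from the guideline above; free_percent and check_free are hypothetical names.

```shell
# Percentage of space still available.
free_percent() {   # usage: free_percent <total_kib> <available_kib>
    echo $(( $2 * 100 / $1 ))
}

# Warn when a mounted filesystem drops below the threshold.
check_free() {     # usage: check_free <mount_point> [min_percent]
    min=${2:-10}
    stats=$(df -Pk "$1" | awk 'NR == 2 {print $2, $4}')
    total=${stats% *}
    avail=${stats#* }
    pct=$(free_percent "$total" "$avail")
    if [ "$pct" -lt "$min" ]; then
        echo "WARNING: $1 has only $pct% free"
        return 1
    fi
    echo "OK: $1 has $pct% free"
}

# Example: check_free /mnt 15
```

The available column already excludes the blocks reserved for root, so the result reflects what ordinary users can still write.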

Conclusion

ext4 remains the go-to filesystem for Linux because it strikes a practical balance between features, performance, and reliability. While it lacks the advanced features of Btrfs or ZFS, its simplicity is often its strength. For most use cases, from personal computers to production servers, ext4 provides everything you need with minimal complexity.

Its mature codebase, excellent tool support, and proven track record make ext4 the safe choice when you need a filesystem that "just works." As the saying goes: "Nobody ever got fired for choosing ext4."
