Filesystem Data Integrity: Checksums, Scrubbing, and Silent Corruption Detection


Understand how modern filesystems protect against data corruption with checksums, scrubbing, and error correction. Explore integrity mechanisms through interactive visualizations.


The Silent Data Corruption Problem

Traditional filesystems (ext4, XFS, FAT) have a critical flaw: they trust the storage layer. If a disk returns corrupted data, the filesystem serves it—no questions asked.

Sources of corruption:

  • Bit rot: Cosmic rays, magnetic decay, aging
  • Buggy firmware: RAID controller errors, SSD bugs
  • Silent failures: Disk returns wrong data (no error reported)
  • Memory errors: Corrupted during transfer (no ECC RAM)
  • Misdirected writes: Wrong block written (cache/firmware bugs)

The problem: Traditional filesystems detect corruption only during reads—and often not even then.

Modern Integrity Solutions

Checksum-based filesystems (ZFS, Btrfs, ReFS) solve this with:

  1. End-to-End Checksums: Verify data from disk to application
  2. Self-Healing: Automatic corruption repair (with redundancy)
  3. Scrubbing: Proactive corruption detection
  4. Metadata Protection: Checksums for all metadata too

How Data Integrity Works: Interactive Exploration

See checksum verification, corruption detection, and self-healing in action:

Interactive Data Integrity Demo

Checksum Detection: Finding Silent Corruption

Step 1: Initial Write (Checksum Computed)

Storage (Disk 1):
  Block: 5280
  Data: Original PDF data

Metadata (parent, stored separately from the data):
  Pointer: Block 5280
  Checksum: abc123def456

What's happening:

  • Application writes PDF data (128KB)
  • Filesystem computes the checksum: sha256(data) = abc123def456
  • Data is written to Block 5280
  • The checksum is stored in the PARENT metadata (not with the data)
  • This separation ensures corruption can't hide
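
The step above can be sketched in a few lines of Python. This is a minimal illustration of the idea only, not real ZFS or Btrfs code; the disk dictionary, block number, and write_block helper are invented for the example:

import hashlib

def write_block(disk: dict, block_no: int, data: bytes) -> dict:
    """Write data to a block and return a parent record holding its checksum."""
    disk[block_no] = data                           # the data goes to the block itself
    checksum = hashlib.sha256(data).hexdigest()
    # The checksum is kept in the parent metadata, not next to the data,
    # so a corrupted block cannot vouch for itself.
    return {"pointer": block_no, "checksum": checksum}

disk = {}
parent = write_block(disk, 5280, b"Original PDF data")
print(parent)   # {'pointer': 5280, 'checksum': '...'}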

Checksum Mechanisms

ZFS Checksums

Every block checksummed:

Data block: [file data, 128KB]
Checksum:   sha256(data)
Location:   stored in parent metadata (NOT with the data)

Why parent storage?

  • Corruption can't affect its own checksum
  • Read path: Fetch metadata (checksum) → Fetch data → Verify
  • Mismatch = Corruption detected

Checksum algorithms:

  • fletcher2: Fast, weak (legacy)
  • fletcher4: Fast, good (default)
  • sha256: Strong, slower (critical data)
  • sha512: Strongest, slowest

Configure per dataset:

zfs set checksum=sha256 tank/important
zfs set checksum=fletcher4 tank/bulk   # Default

Btrfs Checksums

Data and metadata checksummed:

Checksum:     crc32c (default)
Alternatives: xxhash, sha256, blake2
Location:     stored in the parent tree node

Configurable:

# Set the checksum algorithm at mkfs time
mkfs.btrfs --csum xxhash /dev/sda1
# The algorithm is chosen at creation and applies to the whole filesystem;
# it cannot be changed per file or switched on an existing filesystem.

Nodatasum option:

# Disable checksums for specific files (faster, but no protection).
# Setting NOCOW also disables checksumming; it only takes effect for
# files created after the attribute is set on the directory.
chattr +C /var/lib/mysql/data   # Also disables CoW

ext4 Metadata Checksums

ext4 has limited checksums (metadata only):

# Enable metadata checksums at format time
mkfs.ext4 -O metadata_csum /dev/sda1

# Or convert an existing filesystem
tune2fs -O metadata_csum /dev/sda1   # Requires e2fsck afterwards

What's protected:

  • Superblock
  • Group descriptors
  • Inode tables
  • Directory entries
  • Journal

What's NOT protected:

  • File data (no data checksums!)
  • Can't detect silent data corruption

Corruption Detection Flow

Read Path with Checksums

Traditional filesystem (ext4):

1. Application: read(file, offset)
2. Filesystem: look up the block number
3. Disk: return the block data
4. Filesystem: return the data to the application

❌ No verification - corrupt data is silently served

Checksum filesystem (ZFS/Btrfs):

1. Application: read(file, offset)
2. Filesystem: look up the block pointer + checksum
3. Disk: return the block data
4. Filesystem: compute the checksum of the data
5. Compare computed vs stored checksum
   ✅ Match → return the data
   ❌ Mismatch → corruption detected!
6. If a redundant copy exists:
   - Try the mirror/parity copy
   - Verify its checksum
   - Return the good copy
   - Repair the corrupted copy
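
The verified read path can be sketched the same way. Again this is only an illustration: verified_read and ChecksumError are invented names, and the parent record is the one produced by write_block above, not a real filesystem structure.

import hashlib

class ChecksumError(IOError):
    pass

def verified_read(disk: dict, parent: dict) -> bytes:
    """Fetch a block and verify it against the checksum stored in its parent."""
    data = disk[parent["pointer"]]                  # steps 1-3: fetch the block
    actual = hashlib.sha256(data).hexdigest()       # step 4: recompute the checksum
    if actual != parent["checksum"]:                # step 5: compare
        raise ChecksumError(f"corruption detected in block {parent['pointer']}")
    return data                                     # match: the data is safe to return

On a mismatch, a real filesystem would fall through to step 6 and try a redundant copy; that branch is sketched under Self-Healing below.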

Write Path with Checksums

ZFS/Btrfs write:

1. Application: write(data)
2. Filesystem: compute checksum(data)
3. Write the data to a new location (CoW)
4. Update the parent metadata with:
   - Pointer to the new data block
   - Checksum value
5. Commit the transaction (atomic)

Integrity guarantee:

  • Checksum stored BEFORE data is referenced
  • Corruption during write detected on next read
  • Old data preserved (CoW) until commit
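
A hedged sketch of the copy-on-write update, continuing the invented helpers above (not an actual transaction engine): the new block and its checksum are in place before the parent pointer is swapped, so readers never see a half-written state.

import hashlib

def cow_write(disk: dict, new_block_no: int, data: bytes) -> dict:
    """Copy-on-write update: old data stays referenced until the new pointer commits."""
    disk[new_block_no] = data                            # step 3: write to a NEW location
    new_parent = {                                       # step 4: pointer + checksum
        "pointer": new_block_no,
        "checksum": hashlib.sha256(data).hexdigest(),
    }
    # Step 5: the caller atomically replaces the old parent record with new_parent.
    # Until that swap, the old block and its checksum remain valid, so a crash
    # or torn write leaves the previous version intact.
    return new_parent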

Self-Healing

Requirements for Self-Healing

Need redundancy:

  • RAID-1/10: Mirror copies
  • RAID-5/6: Parity reconstruction
  • ZFS RAID-Z: Parity with checksums
  • Btrfs RAID: Mirror or RAID-5/6

Self-healing flow:

1. Read the block from disk 1
2. Checksum mismatch → corruption!
3. Try the mirror copy (disk 2)
4. Checksum matches → good copy found
5. Repair the corrupted block:
   - Write the good data back to disk 1
   - Verify the checksum
6. Log the repair: "Corrected 1 block"
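
In Python terms, the flow looks roughly like this. It is a sketch with invented names, building on the earlier examples; mirrors stands for a list of per-disk block maps:

import hashlib

def self_healing_read(mirrors: list, block_no: int, expected: str) -> bytes:
    """Return a verified copy from any mirror and rewrite copies that fail verification."""
    good, bad = None, []
    for disk in mirrors:                                  # try each redundant copy
        data = disk[block_no]
        if hashlib.sha256(data).hexdigest() == expected:
            good = data                                   # a copy that matches the checksum
        else:
            bad.append(disk)                              # remember corrupted copies
    if good is None:
        raise IOError("unrecoverable: no copy matches the stored checksum")
    for disk in bad:                                      # self-heal: overwrite bad copies
        disk[block_no] = good
    return good

The final raise is the unrecoverable case described later: the corruption is detected, but there is no good copy left to repair from.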

ZFS Self-Healing

Automatic on every read:

# Read a file - automatic healing if a copy is corrupted
cat /tank/data/file.txt
# ZFS detects the corruption and repairs it from the mirror/parity

# Check healing stats
zpool status -v tank
# Shows per-device READ/WRITE/CKSUM error counters

Scrub for proactive healing:

# Read and verify EVERYTHING
zpool scrub tank

# Monitor progress
zpool status tank
# Shows: scan: scrub in progress, 45% done

Btrfs Self-Healing

Automatic on read (with RAID):

# A file read hits corruption
cat /mnt/btrfs/file
# Btrfs detects the corruption and repairs it from the mirror

# Check errors
btrfs device stats /mnt
# Shows corruption/repair counts per device

Scrub for proactive healing:

# Scrub all data and metadata
btrfs scrub start /mnt

# Monitor progress
btrfs scrub status /mnt
# Shows errors found and corrected

Scrubbing: Proactive Verification

What Is Scrubbing?

Scrub = Read every block, verify checksums, repair corruption

Purpose:

  • Find corruption before you need the data
  • Detect bit rot early (before spreading)
  • Verify RAID parity consistency
  • Background integrity maintenance
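
Conceptually, a scrub is just the self-healing read above applied to every block the filesystem knows about, whether or not anything is reading it. A minimal sketch, reusing the invented self_healing_read helper and parent records from the earlier examples:

import hashlib

def scrub(mirrors: list, parents: dict) -> dict:
    """Verify every block on every copy; repair what can be repaired."""
    stats = {"checked": 0, "repaired": 0, "unrecoverable": 0}
    for block_no, parent in parents.items():
        stats["checked"] += 1
        had_bad_copy = any(
            hashlib.sha256(disk[block_no]).hexdigest() != parent["checksum"]
            for disk in mirrors
        )
        try:
            self_healing_read(mirrors, block_no, parent["checksum"])
            if had_bad_copy:
                stats["repaired"] += 1                    # detected and fixed
        except IOError:
            stats["unrecoverable"] += 1                   # detected, no good copy left
    return stats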

ZFS Scrubbing

Manual scrub:

# Start a scrub
zpool scrub tank

# Check status
zpool status tank
# Output:
#   scan: scrub in progress since Sun Jan 12 12:00:00 2025
#         45.2G scanned at 1.5G/s, 12.1G to go
#         0 repaired, 78.9% done

# Stop the scrub (if needed)
zpool scrub -s tank

Automatic scrubbing (recommended):

# Weekly scrub via systemd timer (recent OpenZFS releases ship zfs-scrub-weekly@.timer)
systemctl enable zfs-scrub-weekly@tank.timer
systemctl start zfs-scrub-weekly@tank.timer

# Or via cron
0 2 * * 0 zpool scrub tank   # Every Sunday at 2 AM

Scrub results:

zpool status -v tank
# Shows either:
#   errors: No known data errors                      ✅
# or:
#   errors: Permanent errors have been detected in the following files:
#           /tank/important/file.txt                  ❌
#   (a corrupted block with no redundancy to repair it from)

Btrfs Scrubbing

Manual scrub:

# Start a scrub
btrfs scrub start /mnt

# Monitor
btrfs scrub status /mnt
# Output (abridged):
#   Scrub started:  Sun Jan 12 12:00:00 2025
#   Status:         running
#   Total to scrub: 100GiB
#   ...

# Detailed per-device stats
btrfs scrub status -d /mnt

Automatic scrubbing:

# Monthly scrub via systemd timer (where your distro packages btrfs-scrub@.timer)
systemctl enable btrfs-scrub@mnt.timer

# Or via cron
0 3 1 * * btrfs scrub start /mnt   # 3 AM on the 1st of each month

Scrub results:

btrfs scrub status -d /mnt
# Shows per-device:
#   Data extents scrubbed: 12345
#   Checksum errors:       10
#   Corrected errors:      10   ✅
#   Uncorrectable errors:  0

Corruption Types and Detection

Detectable Corruption

With checksums (ZFS/Btrfs):

  • ✅ Bit flips in data
  • ✅ Bit flips in metadata
  • ✅ Misdirected writes (block written to wrong location)
  • ✅ Torn writes (partial block write)
  • ✅ Firmware bugs returning wrong data
  • ✅ Memory corruption during transfer

Undetectable Corruption

Even with checksums:

  • ❌ Corruption during write (before checksum computed)
    • Mitigation: ECC RAM
  • ❌ Application writes wrong data
    • Mitigation: Application-level checksums
  • ❌ Encryption key corruption
    • Mitigation: Key backup and verification

Unrecoverable Corruption

Corruption detected but can't repair:

1. Read the block: checksum mismatch
2. Try the redundant copy: also corrupted (or doesn't exist)
3. Try parity reconstruction: parity also corrupted
4. Result: permanent data loss

ZFS response:

zpool status -v tank
# errors: Permanent errors have been detected in the following files:
#         /tank/data/important.txt

Btrfs response:

# The read returns an I/O error
# dmesg shows csum failed / unable to fixup messages for the affected blocks

Checksum Overhead

Performance Impact

Read path:

  • Checksum verification: ~1-5% CPU overhead
  • Modern CPUs (hardware CRC32C via SSE4.2, SHA extensions): Negligible
  • Usually bottlenecked by disk, not checksum

Write path:

  • Checksum computation: ~2-10% CPU overhead
  • Depends on algorithm (fletcher4 < sha256)
  • Often hidden by disk latency

Benchmarks:

No checksum (ext4):  1000 MB/s read
ZFS (fletcher4):      980 MB/s read  (-2%)
ZFS (sha256):         920 MB/s read  (-8%)
Btrfs (crc32c):       990 MB/s read  (-1%)

Bottleneck: usually disk speed, not the checksum
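
The relative CPU cost is easy to get a feel for with a rough micro-benchmark. The sketch below uses Python's zlib.crc32 and hashlib.sha256 as stand-ins for a weak and a strong checksum; it says nothing about a real filesystem's I/O path, and the absolute numbers vary widely by CPU:

import hashlib
import time
import zlib

block = bytes(128 * 1024)     # one 128 KiB block (all zeros)
N = 10_000                    # ~1.2 GiB hashed per algorithm

def bench(name, fn):
    start = time.perf_counter()
    for _ in range(N):
        fn(block)
    secs = time.perf_counter() - start
    print(f"{name:8s} {N * len(block) / secs / 1e6:8.0f} MB/s")

bench("crc32",  zlib.crc32)                             # weak, fast (like crc32c/fletcher)
bench("sha256", lambda b: hashlib.sha256(b).digest())   # strong, slower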

Space Overhead

Checksum storage:

  • ZFS: checksum lives in each 128-byte block pointer (~0.1% overhead for 128 KiB blocks)
  • Btrfs: Stored in metadata (~0.5% overhead)
  • ext4 metadata_csum: less than 1% for metadata only

Negligible space cost for significant protection

Comparison: Integrity Features

Filesystem   Data Checksums   Metadata Checksums   Self-Healing             Scrubbing
ext4         ❌ No            ✅ Optional           ❌ No                    ❌ No
XFS          ❌ No            ✅ Yes                ❌ No                    ❌ No
Btrfs        ✅ Yes           ✅ Yes                ✅ With RAID             ✅ Yes
ZFS          ✅ Yes           ✅ Yes                ✅ With RAID             ✅ Yes
NTFS         ❌ No            ❌ No                 ❌ No                    ❌ No
ReFS         ✅ Optional      ✅ Yes                ✅ With Storage Spaces   ✅ Yes
APFS         ❌ No            ✅ Yes                ❌ No                    ❌ No

Integrity leaders: ZFS and Btrfs (APFS checksums metadata only, not file data)

Best Practices

1. Use Checksums

For critical data:

# ZFS: use strong checksums
zfs set checksum=sha256 tank/important

# Btrfs: checksums are enabled by default
mkfs.btrfs /dev/sda1   # crc32c enabled

2. Enable Redundancy

Checksums detect corruption, redundancy repairs it:

# ZFS: RAID-Z or mirror
zpool create tank raidz /dev/sda /dev/sdb /dev/sdc

# Btrfs: RAID1 or RAID10
mkfs.btrfs -d raid1 -m raid1 /dev/sda /dev/sdb

3. Regular Scrubbing

Monthly for normal use, weekly for critical:

# ZFS monthly scrub
systemctl enable zfs-scrub-monthly@tank.timer

# Btrfs weekly scrub (cron)
0 3 * * 0 btrfs scrub start /mnt

4. Monitor Errors

Check for corruption regularly:

# ZFS
zpool status -v tank | grep -i error

# Btrfs
btrfs device stats /mnt

5. Use ECC RAM

Protect in-memory data:

  • Checksums protect on-disk data
  • ECC RAM protects in-memory data
  • Recommended for ZFS/Btrfs servers

6. Test Restores

Verify backups can detect corruption:

# Scrub before backing up
zpool scrub tank
# Wait for the scrub to complete,
# then back up (ensures no silent corruption is copied into the backup)

Limitations

What Checksums Don't Protect

  1. Application-level corruption: the application writes wrong data
     • Solution: application checksums (e.g., database page checksums)
  2. Corruption during write: data corrupted before the checksum is computed
     • Solution: ECC RAM
  3. No redundancy: corruption can be detected but not repaired
     • Solution: RAID or replication
  4. Complete disk failure: all copies lost
     • Solution: offsite backups

Performance Considerations

When to disable checksums:

  • Never for metadata (always checksum metadata)
  • Rarely for data (only if proven bottleneck)
  • Databases: may already have their own checksums (database-level page checksums, or DM-Integrity at the block layer)

Disable data checksums (Btrfs):

# Per-file / per-directory (also disables CoW)
chattr +C /var/lib/mysql/data

# Or as a mount option (entire filesystem)
mount -o nodatasum /dev/sda1 /mnt

Advanced: DM-Integrity (ext4/XFS)

Device-mapper integrity for non-checksum filesystems:

# Create the integrity device
integritysetup format /dev/sda1
integritysetup open /dev/sda1 integrity-dev

# Format with ext4
mkfs.ext4 /dev/mapper/integrity-dev

# Mount
mount /dev/mapper/integrity-dev /mnt

Provides:

  • Block-level checksums (below filesystem)
  • Works with any filesystem
  • Performance: 10-30% overhead
  • See: man integritysetup

Related concepts:

  • Copy-on-Write: Enables atomic checksum updates
  • Snapshots: Immutable copies for data protection
  • ZFS: End-to-end checksums and self-healing
  • Btrfs: Checksums and scrubbing
  • RAID: Redundancy for self-healing

Key Takeaways

  • Silent Corruption: Traditional filesystems serve corrupted data unknowingly
  • Checksums: Detect corruption at read time (ZFS, Btrfs)
  • Self-Healing: Automatic repair with redundancy (RAID)
  • Scrubbing: Proactive verification (find corruption early)
  • Overhead: Minimal (~1-5% CPU, less than 1% space)
  • Best Practice: Checksums + Redundancy + Scrubbing + ECC RAM
  • Limitations: Can't fix corruption without redundancy
  • ext4/XFS: Use DM-Integrity for block-level checksums

If you found this explanation helpful, consider sharing it with others.
