The RAID Revolution: When One Disk Isn't Enough
Picture this: It's 1987, and you're running a critical database on a single hard drive. Suddenly, you hear the dreaded click of death. Your drive fails, taking all your data with it. Or maybe your drive works fine, but it's just too slow for your growing needs. Enter RAID - a technology that would revolutionize data storage forever.
RAID (Redundant Array of Independent Disks) combines multiple physical disks into a single logical unit. But here's the magic: depending on how you configure it, RAID can give you blazing speed, bulletproof reliability, or both. It's like having multiple backup singers - they can harmonize for more power, or take over if the lead singer loses their voice.
Today, RAID powers everything from your NAS at home to massive data centers. Let's explore how these disk orchestras work and which arrangement plays the right tune for your needs.
Interactive RAID Visualization
Before diving into the details, explore different RAID levels interactively. Click on disks to simulate failures and watch how each RAID level handles disasters differently:
RAID Level Explorer
RAID 0: Striping
Data striped across disks for maximum speed
Advantages
- 2x speed
- Full capacity
- Simple setup
Disadvantages
- No redundancy
- Total data loss if any disk fails
- Not for critical data
RAID Levels Comparison
Level | Min Disks | Capacity | Redundancy | Read Speed | Write Speed | Use Case |
---|---|---|---|---|---|---|
RAID 0 | 2 | 100% | None | Excellent | Excellent | Video editing, gaming |
RAID 1 | 2 | 50% | 1 disk | Good | Normal | OS drives, critical data |
RAID 5 | 3 | 66-94% | 1 disk | Good | Slow | File servers, NAS |
RAID 6 | 4 | 50-88% | 2 disks | Good | Slower | Large arrays, archives |
RAID 10 | 4 | 50% | 1-2 disks | Excellent | Good | Databases, VMs |
Understanding RAID Operations
Write Operations
RAID 0: Data split and written simultaneously to all disks
RAID 1: Same data written to all disks (mirrors)
RAID 5/6: Data + calculated parity written across disks
RAID 10: Data striped across mirror pairs
Failure Recovery
RAID 0: No recovery - total data loss
RAID 1: Read from surviving mirror
RAID 5: Reconstruct from data + parity
RAID 6: Can survive 2 disk failures
Understanding RAID Levels
RAID 0: Living Dangerously for Speed
RAID 0 is the speed demon of the RAID world. It's like having multiple checkout lanes at a grocery store - customers (data) get processed much faster by splitting across all available lanes.
# How RAID 0 distributes data (character-by-character for illustration):
# File: "HELLO WORLD"
#
# Disk 1: H L O W R D
# Disk 2: E L _ O L      (the "_" is the space)
#
# Read speed:  ~2x single disk
# Write speed: ~2x single disk
# Capacity:    100% (2 x 1TB = 2TB usable)
# Fault tolerance: NONE - any disk failure = total loss
Creating RAID 0
# Create RAID 0 array with mdadm
sudo mdadm --create /dev/md0 \
    --level=0 \
    --raid-devices=2 \
    /dev/sdb /dev/sdc

# Verify creation
cat /proc/mdstat

# Format and mount
sudo mkfs.ext4 /dev/md0
sudo mount /dev/md0 /mnt/raid0

# Check performance
sudo hdparm -t /dev/md0
# Expect ~2x single disk speed
Perfect for:
- Video editing scratch disks
- Gaming systems (faster loading)
- Temporary processing space
- Any non-critical data where speed > safety
Never use for:
- Important data without backups
- System drives
- Any data you can't afford to lose
RAID 1: The Mirror Shield
RAID 1 is your data's bodyguard. Every bit of data is written to all disks simultaneously, creating perfect copies. It's like having a stunt double - if the star gets hurt, the show goes on.
# How RAID 1 mirrors data:
# File: "IMPORTANT"
#
# Disk 1: IMPORTANT
# Disk 2: IMPORTANT (exact copy)
#
# Read speed:  Up to 2x (can read from both)
# Write speed: 1x (must write to both)
# Capacity:    50% (2 x 1TB = 1TB usable)
# Fault tolerance: Can lose 1 disk
Creating RAID 1
# Create RAID 1 mirror
sudo mdadm --create /dev/md1 \
    --level=1 \
    --raid-devices=2 \
    /dev/sdb /dev/sdc

# Monitor array status
watch cat /proc/mdstat

# Test redundancy (carefully!)
# Mark disk as failed
sudo mdadm /dev/md1 --fail /dev/sdc

# Remove failed disk
sudo mdadm /dev/md1 --remove /dev/sdc

# Add replacement
sudo mdadm /dev/md1 --add /dev/sdd
Perfect for:
- Boot drives
- Critical databases
- Important documents
- Any data that must survive disk failure
RAID 5: The Balanced Approach
RAID 5 uses a clever trick called parity. It's like having a math equation where if you know any N-1 values, you can calculate the missing one. This gives you both speed and redundancy without mirroring's 50% capacity loss.
# How RAID 5 distributes data and parity:
# Data blocks: A, B, C, D
# Parity per stripe: P = XOR of that stripe's data blocks
#
# Stripe 1:
#   Disk 1: A
#   Disk 2: B
#   Disk 3: P(A,B) = A ⊕ B
#
# Stripe 2:
#   Disk 1: C
#   Disk 2: P(C,D) = C ⊕ D
#   Disk 3: D
#
# If Disk 2 fails:
#   B = A ⊕ P(A,B)   (recoverable!)
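To see the parity trick work, here is a minimal shell sketch. The hex values are arbitrary stand-ins for disk blocks: XOR two "data blocks" into a parity value, throw one block away, and rebuild it from the survivor plus the parity.

# Minimal XOR-parity demo (illustrative values, not real disk blocks)
A=$(( 0xA5 ))          # data block on disk 1
B=$(( 0x3C ))          # data block on disk 2
P=$(( A ^ B ))         # parity stored on disk 3
printf 'Parity P   = 0x%02X\n' "$P"

# Pretend disk 2 died and B is lost - rebuild it from A and P
B_rebuilt=$(( A ^ P ))
printf 'Rebuilt B  = 0x%02X (original was 0x3C)\n' "$B_rebuilt"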
Creating RAID 5
# Create RAID 5 array (minimum 3 disks)
sudo mdadm --create /dev/md5 \
    --level=5 \
    --raid-devices=3 \
    /dev/sdb /dev/sdc /dev/sdd

# Check rebuild status
cat /proc/mdstat
# md5 : active raid5 sdd[2] sdc[1] sdb[0]
#       2097152 blocks super 1.2 level 5, 512k chunk

# Calculate usable space
# Usable = (N-1) × disk_size
# 3 × 1TB drives = 2TB usable
The RAID 5 Write Penalty
# RAID 5 write process:
# 1. Read old data
# 2. Read old parity
# 3. Calculate new parity
# 4. Write new data
# 5. Write new parity
#
# Result: 4 I/O operations for 1 write!
# This is why RAID 5 writes are slow
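To measure the penalty on your own hardware, a sketch with fio like the one below works. The mount point and file name are assumptions; point it at a scratch file on the mounted array, never at the raw device.

# Assumes the array is mounted at /mnt/raid5 with ~1GB free;
# the test file path is a placeholder - adjust to your setup.
sudo fio --name=raid5-randwrite \
    --filename=/mnt/raid5/fio-test \
    --size=1G --rw=randwrite --bs=4k \
    --direct=1 --ioengine=libaio --iodepth=16 \
    --runtime=30 --time_based --group_reporting

# Compare with random reads (typically far higher IOPS on RAID 5)
sudo fio --name=raid5-randread \
    --filename=/mnt/raid5/fio-test \
    --size=1G --rw=randread --bs=4k \
    --direct=1 --ioengine=libaio --iodepth=16 \
    --runtime=30 --time_based --group_reporting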
Perfect for:
- File servers
- NAS devices
- Read-heavy workloads
- Cost-effective redundancy
Avoid for:
- Write-intensive databases
- Large disks (>2TB) - rebuild risk
- Virtual machine storage
RAID 6: Double Protection
RAID 6 extends RAID 5 with double parity. It's like having two different backup calculations - even if two disks fail, your data survives.
# RAID 6 with dual parity:
# P = Standard parity (XOR)
# Q = Reed-Solomon parity
#
# Disk 1: Data A
# Disk 2: Data B
# Disk 3: Parity P
# Disk 4: Parity Q
#
# Can lose ANY 2 disks and still recover!
Creating RAID 6
# Create RAID 6 (minimum 4 disks)
sudo mdadm --create /dev/md6 \
    --level=6 \
    --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde

# More complex parity = slower writes
# But survives 2 disk failures!

# Capacity calculation
# Usable = (N-2) × disk_size
# 4 × 1TB drives = 2TB usable
Perfect for:
- Large arrays (8+ disks)
- Archive storage
- Critical data with slow writes
- When rebuild times are long
RAID 10: The Best of Both Worlds
RAID 10 (or RAID 1+0) combines mirroring and striping. It's like having pairs of bodyguards working in parallel - fast and safe.
# RAID 10 structure (a stripe of mirrors):
#
# Mirror Pair 1:          Mirror Pair 2:
# Disk 1: A, C            Disk 3: B, D
# Disk 2: A, C (copy)     Disk 4: B, D (copy)
#
# Data is striped across the mirror pairs and mirrored within each pair
#
# Can lose 1 disk from each pair!
Creating RAID 10
# Create RAID 10 (minimum 4 disks)
sudo mdadm --create /dev/md10 \
    --level=10 \
    --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Excellent performance + redundancy
# But expensive (50% capacity loss)
Perfect for:
- Database servers
- Virtual machine hosts
- High-performance + high-availability
- Mission-critical applications
Software RAID with mdadm
Installation and Setup
# Install mdadm
sudo apt install mdadm       # Debian/Ubuntu
sudo dnf install mdadm       # Fedora
sudo pacman -S mdadm         # Arch

# List available disks
lsblk
sudo fdisk -l

# Prepare disks (optional but recommended)
sudo gdisk /dev/sdb
# Create a Linux RAID partition (type FD00)
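If the disks were used before, leftover filesystem or RAID signatures can confuse assembly later. A short, destructive cleanup sketch (device names are examples - verify with lsblk first):

# DESTRUCTIVE: wipes old RAID/filesystem signatures from the listed disks.
for disk in /dev/sdb /dev/sdc; do
    sudo mdadm --zero-superblock "$disk" 2>/dev/null || true
    sudo wipefs -a "$disk"
done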
Creating Arrays
# Generic syntax
sudo mdadm --create /dev/mdX \
    --level=RAID_LEVEL \
    --raid-devices=NUM_DEVICES \
    device1 device2 ... deviceN

# With metadata version
sudo mdadm --create /dev/md0 \
    --level=1 \
    --metadata=1.2 \
    --raid-devices=2 \
    /dev/sdb1 /dev/sdc1

# With spare disk
sudo mdadm --create /dev/md0 \
    --level=5 \
    --raid-devices=3 \
    --spare-devices=1 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde
Monitoring and Management
# Check array status
cat /proc/mdstat
sudo mdadm --detail /dev/md0

# Monitor arrays
sudo mdadm --monitor --daemonize --mail=admin@example.com /dev/md0

# Array health check
sudo mdadm --detail --scan

# Save configuration
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf

# Assemble existing array
sudo mdadm --assemble /dev/md0 /dev/sdb /dev/sdc
Handling Failures
# Simulate failure (testing)
sudo mdadm /dev/md0 --fail /dev/sdc

# Check degraded array
cat /proc/mdstat
# md0 : active raid1 sdb[0] sdc[1](F)
#                            ^^^ (F) = Failed

# Remove failed disk
sudo mdadm /dev/md0 --remove /dev/sdc

# Add replacement disk
sudo mdadm /dev/md0 --add /dev/sdd

# Monitor rebuild
watch cat /proc/mdstat
Growing and Reshaping Arrays
# Add disk to existing array
sudo mdadm /dev/md0 --add /dev/sdd

# Grow RAID 5 from 3 to 4 disks
sudo mdadm --grow /dev/md0 --raid-devices=4

# Convert RAID 1 to RAID 5
sudo mdadm --grow /dev/md0 --level=5 --raid-devices=3

# Expand filesystem after growing
sudo resize2fs /dev/md0      # ext4
sudo xfs_growfs /mnt/raid    # XFS
Hardware vs Software RAID
Software RAID (mdadm)
Pros:
- Free and flexible
- No special controller hardware required
- Easy migration between systems
- Works with any disks
Cons:
- Uses CPU resources
- No battery-backed write cache
- OS must boot first
Hardware RAID
Pros:
- Dedicated processor
- Battery-backed cache
- Better write performance
- OS-independent
Cons:
- Expensive controllers
- Vendor lock-in
- Controller failure risk
- Complex recovery
Comparison
# Performance comparison (typical):
#
#                Software    Hardware
# RAID 0 Read:      95%        100%
# RAID 0 Write:     95%        100%
# RAID 5 Read:      90%        100%
# RAID 5 Write:     60%         95% (with cache)
# CPU Usage:       5-15%         0%
RAID Performance Tuning
Chunk Size Optimization
# Chunk size affects performance
# Large chunks: Better for sequential I/O
# Small chunks: Better for random I/O

# Create with a specific chunk size (256KB chunks)
sudo mdadm --create /dev/md0 \
    --level=0 \
    --chunk=256 \
    --raid-devices=2 \
    /dev/sdb /dev/sdc

# Benchmark different chunk sizes
for chunk in 64 128 256 512 1024; do
    echo "Testing ${chunk}KB chunks"
    # Create, test, destroy (see the fuller sketch below)
done
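One way to flesh out that loop is the destructive sketch below. It assumes /dev/sdb and /dev/sdc are dedicated scratch disks and that /dev/md0 is not otherwise in use.

# DESTRUCTIVE sketch: rebuilds /dev/md0 at each chunk size and runs a
# quick sequential read test. /dev/sdb and /dev/sdc are scratch disks.
for chunk in 64 128 256 512 1024; do
    echo "=== ${chunk}KB chunks ==="
    sudo mdadm --create /dev/md0 --level=0 --chunk=$chunk \
        --raid-devices=2 --run /dev/sdb /dev/sdc
    sudo hdparm -t /dev/md0
    sudo mdadm --stop /dev/md0
    sudo mdadm --zero-superblock /dev/sdb /dev/sdc
done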
Filesystem Alignment
# Align the filesystem to the RAID stripe
# stride       = chunk size / filesystem block size
# stripe-width = stride × number of data disks

# For RAID 5 with 3 disks, 256KB chunk, 4KB blocks:
# stride       = 256KB / 4KB = 64
# stripe-width = 64 × 2 = 128 (2 data disks in a 3-disk RAID 5)
sudo mkfs.ext4 -E stride=64,stripe-width=128 /dev/md0

# For XFS
sudo mkfs.xfs -d su=256k,sw=2 /dev/md0
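If you change chunk sizes or disk counts often, a tiny helper function (hypothetical, not part of mdadm or mkfs) keeps the arithmetic honest:

# Hypothetical helper: prints ext4 stride/stripe-width for a given layout.
# Usage: raid_stride <chunk_kb> <fs_block_kb> <data_disks>
raid_stride() {
    local chunk_kb=$1 block_kb=$2 data_disks=$3
    local stride=$(( chunk_kb / block_kb ))
    local width=$(( stride * data_disks ))
    echo "mkfs.ext4 -E stride=${stride},stripe-width=${width}"
}

raid_stride 256 4 2   # 3-disk RAID 5 -> stride=64, stripe-width=128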
Read-Ahead Tuning
# Increase read-ahead for sequential workloads
echo 8192 | sudo tee /sys/block/md0/queue/read_ahead_kb

# Or use blockdev
sudo blockdev --setra 16384 /dev/md0   # 16384 sectors = 8MB

# Make permanent via udev
echo 'SUBSYSTEM=="block", KERNEL=="md0", ACTION=="add|change", ATTR{queue/read_ahead_kb}="8192"' | \
    sudo tee /etc/udev/rules.d/60-md-readahead.rules
Write Cache Settings
# Enable write-behind on RAID 1 (requires a write-intent bitmap
# and write-mostly member devices)
sudo mdadm --grow --write-behind=256 /dev/md1

# Check stripe cache size (RAID 5/6)
cat /sys/block/md0/md/stripe_cache_size

# Increase stripe cache (RAID 5/6)
echo 8192 | sudo tee /sys/block/md0/md/stripe_cache_size
RAID Maintenance
Scheduled Checks
# Trigger an array check manually (finds bad blocks)
sudo sh -c "echo check > /sys/block/md0/md/sync_action"

# Automate with cron (weekly: Sunday at 01:00)
echo "0 1 * * 0 root echo check > /sys/block/md0/md/sync_action" | \
    sudo tee -a /etc/crontab

# Check progress
cat /proc/mdstat
cat /sys/block/md0/md/sync_completed
Monitoring Scripts
#!/bin/bash
# raid-monitor.sh - warn if any md array is degraded or in an unexpected state

for md in /sys/block/md*/md; do
    ARRAY=$(basename "$(dirname "$md")")
    STATE=$(cat "$md/array_state")
    DEGRADED=$(cat "$md/degraded")

    # "clean", "active" and "active-idle" are all healthy states
    case "$STATE" in
        clean|active|active-idle) HEALTHY=1 ;;
        *)                        HEALTHY=0 ;;
    esac

    if [ "$HEALTHY" -eq 0 ] || [ "$DEGRADED" != "0" ]; then
        echo "WARNING: $ARRAY is $STATE, degraded: $DEGRADED"
        # Send alert
        echo "$ARRAY state=$STATE degraded=$DEGRADED" | \
            mail -s "RAID Alert: $ARRAY degraded" admin@example.com
    fi
done
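To run it automatically, something along these lines works; the install path and schedule are just examples:

# Install the script and run it every 15 minutes (example schedule)
sudo install -m 755 raid-monitor.sh /usr/local/sbin/raid-monitor.sh
echo "*/15 * * * * root /usr/local/sbin/raid-monitor.sh" | \
    sudo tee -a /etc/crontab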
Backup Superblocks
# Backup RAID metadata
sudo mdadm --examine --scan > raid-backup.conf

# Backup individual superblocks
for disk in /dev/sd{b,c,d,e}; do
    sudo mdadm --examine $disk > ${disk##*/}-superblock.txt
done

# Restore from backup
sudo mdadm --assemble --scan --config=raid-backup.conf
Advanced RAID Concepts
RAID 50 and 60
# RAID 50 = RAID 5 arrays striped together (RAID 0)
# RAID 60 = RAID 6 arrays striped together

# Create two RAID 5 arrays
sudo mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sd{b,c,d}
sudo mdadm --create /dev/md2 --level=5 --raid-devices=3 /dev/sd{e,f,g}

# Stripe them with RAID 0
sudo mdadm --create /dev/md50 --level=0 --raid-devices=2 /dev/md1 /dev/md2
Write Intent Bitmaps
# Add bitmap for faster rebuilds
sudo mdadm --grow /dev/md0 --bitmap=internal

# External bitmap (on SSD for performance)
sudo mdadm --grow /dev/md0 --bitmap=/mnt/ssd/md0-bitmap

# Check bitmap status
sudo mdadm --detail /dev/md0 | grep -i bitmap
Hot Spares
# Add hot spare to existing array
sudo mdadm /dev/md0 --add-spare /dev/sde

# Create with hot spare
sudo mdadm --create /dev/md0 \
    --level=5 \
    --raid-devices=3 \
    --spare-devices=1 \
    /dev/sd{b,c,d,e}

# Spare automatically replaces failed disk
RAID Decision Matrix
Quick Selection Guide
Scenario | Best RAID | Why |
---|---|---|
Gaming PC | RAID 0 | Maximum speed, games can be reinstalled |
Home NAS | RAID 5 or RAID 6 | Good capacity vs protection balance |
Web Server | RAID 10 | Fast reads, good redundancy |
Database Server | RAID 10 | Fast random I/O, quick rebuilds |
Backup Storage | RAID 6 | Maximum protection, write speed not critical |
Video Editing | RAID 0 + Backup | Speed for active projects, separate backup |
Boot Drive | RAID 1 | Simple redundancy, fast recovery |
Archive Storage | RAID 6 | Long-term reliability, handles 2 failures |
Capacity Calculator
# RAID 0:  N × size
# RAID 1:  size (regardless of N)
# RAID 5:  (N-1) × size
# RAID 6:  (N-2) × size
# RAID 10: (N/2) × size

# Example: 4 × 2TB drives
# RAID 0:  8TB usable (100%)
# RAID 1:  2TB usable (25%)
# RAID 5:  6TB usable (75%)
# RAID 6:  4TB usable (50%)
# RAID 10: 4TB usable (50%)
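The same formulas as a small shell helper (hypothetical, whole-TB disks only):

# Hypothetical helper: usable capacity per RAID level.
# Usage: raid_capacity <level> <num_disks> <disk_size_tb>
raid_capacity() {
    local level=$1 n=$2 size=$3
    case "$level" in
        0)  echo "$(( n * size )) TB usable" ;;
        1)  echo "${size} TB usable" ;;
        5)  echo "$(( (n - 1) * size )) TB usable" ;;
        6)  echo "$(( (n - 2) * size )) TB usable" ;;
        10) echo "$(( n / 2 * size )) TB usable" ;;
        *)  echo "unknown RAID level" ;;
    esac
}

raid_capacity 5 4 2    # prints "6 TB usable"
raid_capacity 10 4 2   # prints "4 TB usable"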
Common RAID Myths Debunked
"RAID is a backup"
FALSE! RAID protects against hardware failure, not:
- Accidental deletion
- Ransomware
- Corruption
- Theft
- Natural disasters
Always maintain separate backups!
"RAID 5 is dead for large disks"
PARTIALLY TRUE: Large disks have long rebuild times, increasing risk of second failure. But with proper monitoring and hot spares, RAID 5 can still work. Consider RAID 6 for disks >2TB.
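To put a rough number on that rebuild risk, here is a back-of-envelope sketch. The 10^-14 URE rate is a typical consumer-drive spec-sheet figure and the 8TB of surviving data is an example; both are assumptions.

# Rough rebuild-risk estimate: probability of hitting at least one
# unrecoverable read error (URE) while reading 8TB of surviving data,
# assuming a 1-per-1e14-bits URE rate (consumer-drive assumption).
awk 'BEGIN {
    bits = 8 * 1e12 * 8        # 8TB to read during rebuild, in bits
    rate = 1e-14               # UREs per bit read
    p    = 1 - exp(-bits * rate)
    printf "P(at least one URE during rebuild) ~ %.0f%%\n", p * 100
}'
# Prints roughly 47% - which is why RAID 6 or better disks are advised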
"Hardware RAID is always better"
FALSE! Modern CPUs handle software RAID excellently. Hardware RAID only wins with battery-backed cache for write-intensive workloads.
"RAID 0 doubles failure risk"
TRUE (approximately)! With two disks, the array is about twice as likely to fail as a single disk; with N disks it is roughly N times more likely, because any single disk failure destroys the whole array.
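A quick sanity check of that approximation, using an illustrative 3% annual per-disk failure probability (an assumption, not a measured figure):

# Illustrative: annual probability that a RAID 0 array loses data,
# assuming each disk independently fails with probability p = 3%/year.
awk 'BEGIN {
    p = 0.03
    for (n = 1; n <= 6; n++)
        printf "N=%d  exact=%.1f%%  approx(N*p)=%.1f%%\n",
               n, (1 - (1 - p)^n) * 100, n * p * 100
}'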
Troubleshooting Common Issues
Array Won't Assemble
# Force assembly
sudo mdadm --assemble --force /dev/md0 /dev/sd{b,c,d}

# Recreate with --assume-clean (last resort!)
sudo mdadm --create /dev/md0 \
    --assume-clean \
    --level=5 \
    --raid-devices=3 \
    /dev/sd{b,c,d}
Slow Rebuild
# Check rebuild speed limits (KB/s per device)
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

# Increase the minimum speed
echo 50000 | sudo tee /proc/sys/dev/raid/speed_limit_min
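To make the higher floor survive a reboot, the same knob is exposed through sysctl (the config file name below is just an example):

# Persist the rebuild speed floor across reboots
echo "dev.raid.speed_limit_min = 50000" | \
    sudo tee /etc/sysctl.d/90-raid-rebuild.conf
sudo sysctl --system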
Degraded Performance
# Check for misaligned or overloaded I/O
iostat -x 1

# Verify read-ahead matches the workload
sudo blockdev --getra /dev/md0

# Check member disks for pending sectors
sudo smartctl -a /dev/sdb | grep -i pending
RAID Best Practices
- Test your recovery procedure before you need it
- Monitor arrays continuously with mdadm --monitor
- Keep spare disks ready for replacement
- Use enterprise disks for RAID 5/6 (URE rates matter)
- Document your setup including disk serial numbers
- Schedule regular scrubs to detect bit rot
- Match disk specs (RPM, cache, size)
- Consider SSDs for cache or separate arrays
- Plan for growth - leave room to expand
- RAID ≠ Backup - always have separate backups
Conclusion
RAID transforms multiple disks into powerful, resilient storage systems. Whether you need the raw speed of RAID 0, the reliability of RAID 1, or the balanced approach of RAID 5/6/10, understanding these configurations helps you build storage that matches your needs perfectly.
Remember: RAID is about availability and performance, not backup. It keeps your data accessible when disks fail, but it won't save you from accidental deletion or ransomware. Choose your RAID level based on your specific needs, always maintain backups, and monitor your arrays religiously.
The interactive visualizations showed how each RAID level handles data distribution and failures differently. Now you can confidently design storage systems that provide the perfect balance of speed, capacity, and reliability for your specific use case.