Linux Kernel Architecture: Core Subsystems Deep Dive

An overview of the Linux kernel architecture: kernel layers, memory management, process scheduling, the VFS, and the complete boot process.

The Linux Kernel: Heart of the Operating System

The Linux kernel is a monolithic kernel that serves as the core of all Linux distributions. It manages hardware resources, provides essential services to applications, and ensures secure, efficient operation of the entire system.

Protection Rings

  • Ring 0: Kernel mode - full hardware access
  • Ring 3: User mode - restricted access
  • System calls transition from Ring 3 to Ring 0

Key Subsystems

  • Process & thread management
  • Memory allocation & paging
  • Device drivers & I/O
  • Network protocol stack

Linux Kernel Key Concepts

Monolithic Kernel

All kernel services run in kernel space with direct hardware access

Loadable Modules

Extend kernel functionality without recompilation using kernel modules

Everything is a File

Unified interface for devices, processes, and system information

Kernel Architecture Overview

Monolithic Design

Linux uses a monolithic kernel architecture: all core kernel services run in kernel space, in a single address space, with direct hardware access. The design has these characteristics:

  • Performance: Subsystems call each other with direct function calls instead of message passing
  • Efficiency: No context switches or IPC between kernel services
  • Complexity: Components are tightly coupled, so a bug in one driver can crash the whole kernel
  • Flexibility: Loadable kernel modules add drivers and features at runtime

vs Microkernel: Microkernels (such as MINIX 3 and QNX) run most services in user space and connect them with message passing. That makes them more modular and better isolated, but typically slower because of IPC overhead.

Protection Rings

The x86 architecture provides four privilege levels (rings), but Linux only uses two:

Ring 0 (Kernel Mode):

  • Full hardware access
  • All CPU instructions available
  • Direct memory manipulation
  • Device driver execution

Ring 3 (User Mode):

  • Restricted instruction set
  • Virtual memory access only
  • System calls required for kernel services
  • Process isolation enforced

Rings 1 & 2: Unused by Linux (originally intended for device drivers and other OS services)

Core Kernel Subsystems

1. Process Management

The kernel tracks every process and thread with a task_struct, a descriptor containing:

  • Identity: PID, PPID, UID/GID
  • State: TASK_RUNNING (running or runnable), TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE (sleeping), stopped, traced, and zombie
  • Memory: Pointer to memory descriptor (mm_struct)
  • Scheduling: Priority, nice value, CPU time
  • Resources: Open files, signals, limits
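Several of these fields are visible from user space through /proc/[PID]/stat. The following is a minimal sketch, assuming the field order documented in proc(5); the names parse_stat, read_task, TaskInfo, and STATE_NAMES are illustrative and not part of any kernel or library API.

```python
from typing import NamedTuple

# One-letter state codes as reported by /proc/<pid>/stat and /proc/<pid>/status.
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "T": "stopped",
    "t": "tracing stop",
    "Z": "zombie",
    "X": "dead",
    "I": "idle",
}


class TaskInfo(NamedTuple):
    pid: int
    comm: str
    state: str
    ppid: int


def parse_stat(text: str) -> TaskInfo:
    """Extract pid, command name, state, and parent pid from a stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    rest = text[close_paren + 1:].split()
    return TaskInfo(pid, comm, rest[0], int(rest[1]))


def read_task(pid="self") -> TaskInfo:
    """Read the stat file of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_stat(f.read())
```

On a Linux system, read_task() describes the calling Python process, and STATE_NAMES[read_task().state] gives a readable name for its state.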

Key Components:

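  • Scheduler: CFS (Completely Fair Scheduler) for normal processes, replaced by EEVDF in Linux 6.6; real-time and deadline tasks use separate scheduling classes
  • Fork/Exec: Process creation and program execution
  • Signals: Asynchronous notifications delivered to processes (a limited form of inter-process communication)
  • Threads: Lightweight processes sharing memory (created with clone())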
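The fork/exec pattern can be sketched from user space with Python's os module, which wraps the underlying system calls. This is a minimal illustration for POSIX systems; run_child is a hypothetical helper name.

```python
import os
import sys


def run_child(argv):
    """Fork, replace the child with argv, and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: exec replaces this process image and does not return
        # on success. If it fails, exit without running parent cleanup.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)
    # Parent: block until the child terminates, then decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # negative value: killed by a signal


if __name__ == "__main__":
    code = run_child([sys.executable, "-c", "raise SystemExit(3)"])
    print("child exited with", code)
```

The parent and child share nothing after the fork except what was copied, which is why the exit status has to come back through waitpid().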

See Process Management for detailed coverage.

2. Memory Management

Linux implements sophisticated virtual memory:

Virtual Address Space Layout (x86-64, 48-bit addresses with 4-level paging):

  • 0x0000000000000000 - 0x00007FFFFFFFFFFF: User space (128 TB); non-PIE executables are conventionally loaded at 0x400000
  • 0xFFFF800000000000 - 0xFFFFFFFFFFFFFFFF: Kernel space (128 TB)
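The two ranges are the lower and upper "canonical" halves of the 64-bit address space; addresses between them are invalid with 48-bit virtual addressing. A small sketch of that split, assuming 4-level paging (the function and constant names are illustrative):

```python
USER_END = 0x0000_7FFF_FFFF_FFFF      # last address of the lower half
KERNEL_START = 0xFFFF_8000_0000_0000  # first address of the upper half
ADDR_MAX = 0xFFFF_FFFF_FFFF_FFFF
TIB = 2 ** 40


def classify(addr: int) -> str:
    """Return which region a 64-bit virtual address falls into."""
    if not 0 <= addr <= ADDR_MAX:
        raise ValueError("not a 64-bit address")
    if addr <= USER_END:
        return "user"
    if addr >= KERNEL_START:
        return "kernel"
    return "non-canonical"


def is_canonical(addr: int) -> bool:
    """With 48-bit addressing, bits 63..47 must all be equal."""
    top = addr >> 47  # the upper 17 bits
    return top == 0 or top == 0x1FFFF


# Each half spans 2**47 bytes, which is 128 TiB.
USER_SIZE_TIB = (USER_END + 1) // TIB
KERNEL_SIZE_TIB = (ADDR_MAX - KERNEL_START + 1) // TIB
```

Systems with 5-level paging use 57-bit addresses, so the boundaries above would differ there.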

Each process gets:

  • Separate virtual address space
  • Text (code), Data, BSS, Heap, Stack regions
  • Memory-mapped files
  • Shared libraries

Memory Zones:

  • ZONE_DMA: First 16MB (legacy ISA devices)
  • ZONE_DMA32: First 4GB (32-bit DMA)
  • ZONE_NORMAL: Normal addressable memory
  • ZONE_HIGHMEM: Above 896MB (32-bit only)

See Memory Management for virtual memory, paging, TLB.

3. Virtual File System (VFS)

The VFS provides a unified interface for all filesystems. It abstracts each filesystem behind a common set of objects.

Key Structures:

  • Superblock: Filesystem metadata (total blocks, free blocks, magic number)
  • Inode: File metadata (permissions, owner, size, block pointers)
  • Dentry: Directory entry cache (name → inode mapping)
  • File: Open file description (current position, flags), referenced by file descriptors

Why VFS? Applications call open(), read(), and write(); the VFS dispatches each call to the filesystem-specific implementation. The same calls work with ext4, NTFS, NFS, tmpfs, /proc, and others.
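The dispatch idea can be modeled in a few lines. This is a toy user-space model, not the kernel's implementation: the classes Vfs, RamFs, and InfoFs are hypothetical, and the real VFS dispatches through C function-pointer tables such as struct file_operations.

```python
from typing import Protocol


class FileSystem(Protocol):
    """The common interface every toy filesystem must implement."""
    def read(self, relpath: str) -> bytes: ...


class RamFs:
    """Stores file contents in a dictionary, loosely like tmpfs."""
    def __init__(self):
        self.files = {}

    def write(self, relpath: str, data: bytes) -> None:
        self.files[relpath] = data

    def read(self, relpath: str) -> bytes:
        return self.files[relpath]


class InfoFs:
    """Generates content on demand, loosely like /proc."""
    def read(self, relpath: str) -> bytes:
        if relpath == "version":
            return b"toy-kernel 0.1\n"
        raise FileNotFoundError(relpath)


class Vfs:
    def __init__(self):
        self.mounts = {}

    def mount(self, mountpoint: str, fs: FileSystem) -> None:
        self.mounts[mountpoint.rstrip("/") or "/"] = fs

    def resolve(self, path: str):
        """Pick the filesystem whose mount point is the longest prefix."""
        best = None
        for mp in self.mounts:
            prefix = mp if mp.endswith("/") else mp + "/"
            if path == mp or path.startswith(prefix):
                if best is None or len(mp) > len(best):
                    best = mp
        if best is None:
            raise FileNotFoundError(path)
        return self.mounts[best], path[len(best):].lstrip("/")

    def read(self, path: str) -> bytes:
        # Callers use one read(); the backend is chosen per path.
        fs, relpath = self.resolve(path)
        return fs.read(relpath)
```

The caller of Vfs.read() never needs to know which backend holds the file, which is the property the VFS gives applications.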

See Filesystems Overview for details.

4. Network Stack

Complete TCP/IP implementation in kernel space:

Layers:

  1. Application: Sockets API
  2. Transport: TCP, UDP, SCTP
  3. Network: IPv4, IPv6, ICMP, routing
  4. Link: Ethernet, WiFi drivers
  5. Physical: Hardware NICs
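Each layer wraps the data from the layer above in its own header. The sketch below builds a UDP datagram, an IPv4 packet, and an Ethernet frame by hand to show that nesting. It is an illustration only: the helper names are hypothetical, the UDP checksum is left at 0 (allowed over IPv4), and the frame is not padded to Ethernet's minimum size.

```python
import struct

IPV4_FORMAT = "!BBHHHBBH4s4s"  # 20-byte header without options


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_udp(sport: int, dport: int, payload: bytes) -> bytes:
    """Transport layer: 8-byte UDP header plus payload."""
    return struct.pack("!HHHH", sport, dport, 8 + len(payload), 0) + payload


def build_ipv4(src: bytes, dst: bytes, proto: int, segment: bytes) -> bytes:
    """Network layer: IPv4 header (version 4, IHL 5, TTL 64) plus segment."""
    fields = [0x45, 0, 20 + len(segment), 0, 0, 64, proto, 0, src, dst]
    # Compute the checksum over the header with the checksum field zeroed.
    fields[7] = ipv4_checksum(struct.pack(IPV4_FORMAT, *fields))
    return struct.pack(IPV4_FORMAT, *fields) + segment


def build_ethernet(dst_mac: bytes, src_mac: bytes, packet: bytes) -> bytes:
    """Link layer: 14-byte header with EtherType 0x0800 (IPv4)."""
    return dst_mac + src_mac + struct.pack("!H", 0x0800) + packet


if __name__ == "__main__":
    segment = build_udp(12345, 53, b"hello")
    packet = build_ipv4(bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]), 17, segment)
    frame = build_ethernet(b"\xff" * 6, b"\x02\x00\x00\x00\x00\x01", packet)
    print(len(segment), len(packet), len(frame))
```

A receiver validates the IPv4 header by running the same checksum over the full header, which yields 0 when the header is intact.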

Netfilter: Packet filtering hooks at fixed points in the stack (PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING), configured with iptables or nftables.

See Networking Stack for comprehensive coverage.

5. Device Drivers

Interface between kernel and hardware:

Driver Types:

  • Character devices: Stream of bytes (serial ports, terminals, /dev/random)
  • Block devices: Random access (hard drives, SSDs, /dev/sda)
  • Network devices: Network interfaces (eth0, wlan0)

Drivers can be:

  • Built-in: Compiled into kernel
  • Modules: Loaded dynamically (.ko files)
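Character and block devices appear as special files under /dev, and the file mode records which kind each one is. A minimal sketch using Python's os and stat modules (kind_from_mode and describe_device are illustrative names):

```python
import os
import stat


def kind_from_mode(mode: int) -> str:
    """Classify a file mode as a character device, block device, or neither."""
    if stat.S_ISCHR(mode):
        return "character"
    if stat.S_ISBLK(mode):
        return "block"
    return "not a device"


def describe_device(path: str):
    """Return (kind, major, minor) for a device node."""
    st = os.stat(path)
    # The major number selects the driver; the minor number selects
    # the specific device that driver manages.
    return kind_from_mode(st.st_mode), os.major(st.st_rdev), os.minor(st.st_rdev)
```

On Linux, describe_device("/dev/null") reports a character device with major 1 and minor 3. Network devices have no node under /dev, so they cannot be inspected this way.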

System Call Interface

System calls are the main interface through which user programs request kernel services.

Common syscalls (shown by their C library wrapper names):

  • File I/O: open(), read(), write(), close()
  • Process: fork(), execve(), exit(), wait()
  • Memory: mmap(), brk(), munmap()
  • Network: socket(), connect(), send(), recv()

Mechanism: On x86-64, the syscall instruction switches the CPU from Ring 3 to Ring 0 and jumps to the kernel's entry point; the syscall number is passed in the rax register.
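To see that a library wrapper is only a thin layer over a numbered system call, the call can be issued by number through the C library's generic syscall() function. This sketch assumes Linux with a C library that exports syscall(); the table covers only two architectures, and raw_getpid is a hypothetical helper.

```python
import ctypes
import os
import platform

# Syscall numbers are architecture-specific; these are the getpid entries.
SYS_GETPID = {"x86_64": 39, "aarch64": 172}


def raw_getpid() -> int:
    """Invoke getpid by syscall number instead of through os.getpid()."""
    number = SYS_GETPID[platform.machine()]  # KeyError on other architectures
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    return libc.syscall(ctypes.c_long(number))


if __name__ == "__main__":
    print(raw_getpid(), os.getpid())
```

Both values printed by the example should match, because os.getpid() ends up making the same system call.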

See System Calls for complete user→kernel journey.

Kernel Boot Process

Quick overview (see Boot Process for comprehensive visualization):

  1. BIOS/UEFI: POST, hardware init, load bootloader
  2. Bootloader (GRUB): Load kernel + initramfs into memory
  3. Kernel Init: Decompress, initialize subsystems (mm, scheduler, VFS)
  4. initramfs: Temporary root, load drivers, mount real root
  5. Init (systemd): Start services, reach boot target
  6. User Space: Login prompt

Kernel Configuration & Building

    # Download kernel source
    wget https://kernel.org/pub/linux/kernel/v6.x/linux-6.6.tar.xz
    tar -xf linux-6.6.tar.xz && cd linux-6.6

    # Configure (choose one)
    make menuconfig    # Interactive menu
    make oldconfig     # Reuse an existing .config
    make defconfig     # Default config for the architecture

    # Build
    make -j$(nproc)            # Compile kernel and modules
    sudo make modules_install  # Install modules to /lib/modules/
    sudo make install          # Install kernel to /boot/

    # Update bootloader
    sudo grub-mkconfig -o /boot/grub/grub.cfg

Common Kernel Parameters

    # Boot parameters (add to the kernel command line in GRUB)
    quiet            # Suppress verbose messages
    splash           # Show splash screen
    init=/bin/bash   # Emergency shell
    single           # Single user mode
    nomodeset        # Disable kernel mode setting (GPU issues)
    rootdelay=10     # Wait 10s for root device
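The parameters a running kernel was booted with are exposed in /proc/cmdline as a single space-separated line. A rough parser, assuming shell-like quoting is close enough to the kernel's own rules (parse_cmdline is an illustrative name):

```python
import shlex


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value} pairs.

    Flags without '=' (such as 'quiet') map to True.
    """
    params = {}
    for token in shlex.split(cmdline):
        # Split on the first '=' only, so root=UUID=... keeps its value.
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def read_cmdline(path: str = "/proc/cmdline") -> dict:
    """Parse the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return parse_cmdline(f.read())
```

For example, parse_cmdline("root=/dev/sda1 ro quiet rootdelay=10") maps "quiet" to True and "rootdelay" to the string "10".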

Kernel Modules

Extend kernel without recompiling:

    # List loaded modules
    lsmod | less

    # Load module
    insmod /path/to/module.ko   # Low-level
    modprobe module_name        # Handles dependencies

    # Remove module
    rmmod module_name
    modprobe -r module_name

    # Module information
    modinfo ext4          # Show module details
    lsmod | grep nvidia   # Check if loaded
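lsmod gets its data from /proc/modules, where each line holds the module name, size, reference count, the modules that use it, and its load state. A minimal parser for that text, assuming this field order (parse_modules is an illustrative name, and the sample module names in the usage are made up):

```python
def parse_modules(text: str) -> dict:
    """Parse /proc/modules content into a dictionary keyed by module name."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        name, size, refcount, users, state = fields[:5]
        # The users field is "-" when empty, otherwise a comma-terminated list.
        used_by = [] if users == "-" else [u for u in users.split(",") if u]
        modules[name] = {
            "size": int(size),
            "refcount": int(refcount),
            "used_by": used_by,
            "state": state,
        }
    return modules


def loaded_modules(path: str = "/proc/modules") -> dict:
    """Read the modules loaded in the running kernel (Linux only)."""
    with open(path) as f:
        return parse_modules(f.read())
```

With this, "ext4" in loaded_modules() is a programmatic version of checking lsmod output for a module name.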

See Kernel Modules for development details.

Performance & Debugging

Kernel Tracing

    # Function tracing (as root, with debugfs mounted)
    echo function > /sys/kernel/debug/tracing/current_tracer
    cat /sys/kernel/debug/tracing/trace

    # System call tracing
    strace ls                        # Trace syscalls
    strace -c ls                     # Count syscalls
    strace -e trace=openat,read ls   # Specific syscalls (ls opens files with openat)

    # Performance events (quote the glob so the shell does not expand it)
    perf record -e 'syscalls:*' ./app
    perf report

/proc Filesystem

Virtual filesystem exposing kernel data:

    /proc/cpuinfo    # CPU information
    /proc/meminfo    # Memory statistics
    /proc/modules    # Loaded modules (same as lsmod)
    /proc/kallsyms   # Kernel symbol table
    /proc/sys/       # Kernel tunables (sysctl)
    /proc/[PID]/     # Per-process info
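Because these files are plain text, they can be read with ordinary file I/O. A small sketch that parses /proc/meminfo into byte counts, assuming its usual "Key: value kB" layout (parse_meminfo is an illustrative name):

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo content; sizes given in kB are converted to bytes."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip lines without a "key: value" shape
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        info[key] = value  # entries without a unit are plain counts
    return info


def read_meminfo(path: str = "/proc/meminfo") -> dict:
    """Read memory statistics from the running kernel (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

On Linux, read_meminfo()["MemAvailable"] gives the kernel's estimate of memory available to new workloads, in bytes.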

sysfs Interface

Device and driver information:

    /sys/class/     # Device classes (block, net, etc.)
    /sys/devices/   # Device hierarchy
    /sys/module/    # Module parameters
    /sys/kernel/    # Kernel subsystems

Security Features

Linux Security Modules (LSM)

Framework for implementing MAC (Mandatory Access Control):

  • SELinux: Complex, fine-grained control (Red Hat, Fedora)
  • AppArmor: Path-based, easier to configure (Ubuntu, SUSE)
  • SMACK: Simplified MAC (embedded systems)

Kernel Hardening

    # Security tunables (via /etc/sysctl.conf)
    # Hide kernel pointers
    kernel.kptr_restrict=2
    # Restrict dmesg
    kernel.dmesg_restrict=1
    # Limit ptrace
    kernel.yama.ptrace_scope=1
    # Disable unprivileged BPF
    kernel.unprivileged_bpf_disabled=1
    # Prevent kexec
    kernel.kexec_load_disabled=1
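Each sysctl name corresponds to a file under /proc/sys/, with the dots in the name replaced by slashes. The sketch below shows that mapping and a simple reader for sysctl.conf-style text; it assumes comments occupy whole lines starting with # or ; and the function names are illustrative.

```python
def sysctl_path(name: str) -> str:
    """Map a dotted sysctl name to its file under /proc/sys/."""
    return "/proc/sys/" + name.replace(".", "/")


def parse_sysctl_conf(text: str) -> dict:
    """Parse 'key = value' lines, skipping blanks and whole-line comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def read_sysctl(name: str) -> str:
    """Read the current value of one tunable (Linux only)."""
    with open(sysctl_path(name)) as f:
        return f.read().strip()
```

Comparing parse_sysctl_conf() output with read_sysctl() for each key is one way to check whether the configured values are actually in effect.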

Modern Kernel Features

cgroups (Control Groups)

Resource limiting and accounting:

    # These are cgroup v1 paths; cgroup v2 uses cpu.max and memory.max instead

    # CPU limitation (50% of one core with the default 100000 us period)
    echo 50000 > /sys/fs/cgroup/cpu/mygroup/cpu.cfs_quota_us

    # Memory limitation (1GB)
    echo 1G > /sys/fs/cgroup/memory/mygroup/memory.limit_in_bytes

    # Used by Docker, systemd for resource control
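The CPU quota is the amount of CPU time, in microseconds, that the group may use per scheduling period, so a fraction of a core translates into quota = fraction × period. A small sketch of that arithmetic, assuming the default 100000 µs period (all function names are illustrative):

```python
DEFAULT_PERIOD_US = 100_000

_SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def cfs_quota_us(cpus: float, period_us: int = DEFAULT_PERIOD_US) -> int:
    """Quota in microseconds per period for the given number of CPUs."""
    return round(cpus * period_us)


def cpu_max_v2(cpus: float, period_us: int = DEFAULT_PERIOD_US) -> str:
    """The 'quota period' string written to cpu.max under cgroup v2."""
    return f"{cfs_quota_us(cpus, period_us)} {period_us}"


def parse_size(text: str) -> int:
    """Convert a size such as '1G' or '512M' to bytes (binary multiples)."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in _SUFFIXES:
        return int(text[:-1]) * _SUFFIXES[suffix]
    return int(text)
```

Half a core gives cfs_quota_us(0.5) == 50000, and two full cores give a quota larger than the period, which is how multi-core limits are expressed.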

Namespaces

Process isolation (containers use these):

  • PID: Separate process ID space
  • Network: Isolated network stack
  • Mount: Separate filesystem mounts
  • UTS: Hostname and domain name
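  • IPC: System V IPC objects and POSIX message queues
  • User: User and group ID mapping
  • Cgroup: View of the cgroup hierarchy
  • Time: Boot-time and monotonic clock offsets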
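A process's namespace membership is visible as symbolic links under /proc/[PID]/ns/, whose targets look like pid:[4026531836]. Two processes are in the same namespace when the numbers match. A minimal sketch (the function names are illustrative; reading another process's links may require privileges):

```python
import os
import re

_NS_LINK = re.compile(r"^([a-z_]+):\[(\d+)\]$")


def parse_ns_link(target: str):
    """Split a link target such as 'pid:[4026531836]' into (type, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def namespaces_of(pid="self") -> dict:
    """Map each entry in /proc/<pid>/ns to its namespace inode number."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        result[entry] = inode
    return result


def same_namespace(pid_a, pid_b, kind: str) -> bool:
    """True when both processes share the namespace of the given kind."""
    return namespaces_of(pid_a)[kind] == namespaces_of(pid_b)[kind]
```

A process inside a container typically shows different numbers for pid, net, and mnt than a process on the host.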

eBPF (Extended Berkeley Packet Filter)

Safe in-kernel programs for tracing, networking, security:

  • No kernel recompilation needed
  • Verified safe by eBPF verifier
  • High performance (runs in kernel)
  • Used by: bpftrace, Cilium, Falco

Best Practices

Kernel Development

  1. Follow Linux coding style (scripts/checkpatch.pl)
  2. Use appropriate locking (spinlocks, mutexes, RCU)
  3. Minimize time in interrupt context
  4. Never sleep in atomic context
  5. Test with different configs (debug, lockdep)

System Administration

  1. Security: Keep kernel updated (CVEs)
  2. Stability: Use LTS kernels for production
  3. Monitoring: dmesg, journalctl -k for kernel logs
  4. Tuning: Adjust /proc/sys/ parameters carefully
  5. Documentation: Document custom configurations
