Sitemap

A visual representation of the site structure to help you navigate through the content.

Site Structure

Home/

Main landing page with introduction and recent articles

About/about

Learn more about me, my background, and expertise

Speaking/speaking

My talks, presentations, and speaking engagements

Articles/articles

Collection of articles I've written on various topics

H264 implementation applications/articles/h264-implementation-applications

Article content

H264 transform quantization/articles/h264-transform-quantization

Article content

H264 fundamentals/articles/h264-fundamentals

Article content

Zettel/articles/zettel

Article content

Compiling pytorch kernel/articles/compiling-pytorch-kernel

Article content

View size not compatible/articles/view-size-not-compatible

Article content

Gpu boot errors/articles/gpu-boot-errors

Article content

H264 interactive guide/articles/h264-interactive-guide

Article content

Ggml structure/articles/ggml-structure

Article content

Quantization deep dive/articles/quantization-deep-dive

Article content

How tensorrt works/articles/how-tensorrt-works

Article content

Kernel fusion/articles/kernel-fusion

Article content

Visualizing yolov5/articles/visualizing-yolov5

Article content

Cpython internals/articles/cpython-internals

Article content

Cpp compilation process/articles/cpp-compilation-process

Article content

Cpp loading runtime/articles/cpp-loading-runtime

Article content

Cpp linking in depth/articles/cpp-linking-in-depth

Article content

Registry pattern/articles/registry-pattern

Article content

Magic numbers/articles/magic-numbers

Article content

Image encoding/articles/image-encoding

Article content

Text encoding/articles/text-encoding

Article content

Papers/papers

Research papers and publications

Visual instruction tuning/papers/visual-instruction-tuning

Paper content

Vit object detection/papers/vit-object-detection

Paper content

Yolo/papers/yolo

Paper content

Efficientnet/papers/efficientnet

Paper content

Faster rcnn/papers/faster-rcnn

Paper content

Sam/papers/sam

Paper content

DETR/papers/DETR

Paper content

Blip2/papers/blip2

Paper content

Image worth 16x16/papers/image-worth-16x16

Paper content

Optimizing transformer inference/papers/optimizing-transformer-inference

Paper content

Surf/papers/surf

Paper content

Swin transformer/papers/swin-transformer

Paper content

Clip/papers/clip

Paper content

Deeplearning go brr/papers/deeplearning-go-brr

Paper content

Attention is all you need/papers/attention-is-all-you-need

Paper content

Data movement transformer/papers/data-movement-transformer

Paper content

Deep residual learning/papers/deep-residual-learning

Paper content

Concepts/concepts

Interactive explanations of machine learning concepts

initramfs: The Initial RAM Filesystem Explained/concepts/linux/initramfs-boot-process

Deep dive into initramfs (initial RAM filesystem), understanding the Linux boot process, early userspace, and how the kernel transitions from boot to the real root filesystem.

Linux Kernel Architecture: Core Subsystems Deep Dive/concepts/linux/kernel-architecture

Master the Linux kernel architecture through interactive visualizations. Explore kernel layers, memory management, process scheduling, VFS, and the complete boot process.

How RAM Works: Interactive Deep Dive into Computer Memory/concepts/memory/how-ram-works

Explore the inner workings of RAM through beautiful animations and interactive visualizations. Understand memory cells, addressing, and the memory hierarchy.

Python Bytecode Compilation/concepts/python/bytecode-compilation

How Python compiles source code to bytecode and executes it

High Bandwidth Memory (HBM)/concepts/gpu/hbm-memory

3D-stacked DRAM architecture providing massive bandwidth for GPUs and AI accelerators

GPU Memory Hierarchy & Optimization/concepts/gpu/memory-hierarchy

Master GPU memory hierarchy from registers to global memory, understand coalescing patterns, bank conflicts, and optimization strategies for maximum performance

Filesystems: The Digital DNA of Data Storage/concepts/linux/filesystems-overview

Explore how filesystems work through beautiful interactive visualizations. From the VFS abstraction layer to modern CoW filesystems like Btrfs and ZFS. Understand the magic behind storing and retrieving your data.

Python Memory Management/concepts/python/memory-management

How CPython manages memory with PyMalloc, object pools, and reference counting

NVIDIA Unified Virtual Memory/concepts/gpu/unified-memory

Automatic memory management between CPU and GPU through page faulting and on-demand migration

Filesystem Journaling: How Write-Ahead Logging Prevents Data Loss/concepts/linux/filesystem-journaling

Understand the journaling mechanism that protects filesystems from crashes. Explore write-ahead logging, transaction states, and crash recovery through interactive visualizations.

Inodes: The Hidden Metadata That Powers Every File/concepts/linux/inodes

Discover inodes through interactive visualizations—the invisible data structures that track everything about your files except their names. Learn why running out of inodes means no new files, even with free space!

Global Interpreter Lock (GIL)/concepts/python/global-interpreter-lock

Understanding Python's GIL, its impact on multithreading, and workarounds

Page Migration & Fault Handling/concepts/gpu/page-migration

Understanding virtual memory page migration, fault handling, and TLB management in CPU-GPU systems

Copy-on-Write (CoW): Never Overwrite, Always Preserve/concepts/linux/copy-on-write

Discover how Copy-on-Write filesystems enable instant snapshots, atomic operations, and data integrity by never overwriting existing data. Explore CoW mechanics through interactive visualizations.

FUSE: Filesystem in Userspace Explained/concepts/linux/fuse-filesystem

Understand FUSE (Filesystem in Userspace) - the framework that lets you implement filesystems without writing kernel code. Learn how NTFS, SSHFS, and cloud storage work on Linux.

Python Object Model/concepts/python/object-model

Understanding PyObject, type system, and how Python objects work internally

ext4: The Linux Workhorse Filesystem/concepts/linux/ext4-filesystem

Deep dive into ext4 (fourth extended filesystem) - the default filesystem for most Linux distributions. Learn about journaling, extents, and why ext4 remains the reliable choice.

Filesystem Snapshots: Time Travel for Your Data/concepts/linux/filesystem-snapshots

Understand how modern filesystems create instant, space-efficient snapshots. Explore snapshot mechanics, rollback operations, and backup strategies through interactive visualizations.

Python Garbage Collection/concepts/python/garbage-collection

How Python handles memory cleanup with reference counting and cyclic GC

Mount Options: Fine-Tuning Filesystem Behavior and Performance/concepts/linux/mount-options

Master filesystem mount options to control performance, security, and behavior. Explore common options, performance implications, and security hardening through interactive visualizations.

NTFS on Linux: Windows Filesystem Support/concepts/linux/ntfs-filesystem

Understand NTFS (New Technology File System) and how Linux supports it through the NTFS-3G FUSE driver. Learn about the MFT, alternate data streams, and cross-platform challenges.

CPU Pipeline Architecture/concepts/performance/cpu-pipeline-detailed

Comprehensive exploration of CPU pipeline stages, hazards, and superscalar and out-of-order execution

Python Optimization Techniques/concepts/python/python-optimization

Performance optimization strategies and CPython optimizations

Contrastive Learning/concepts/embeddings/contrastive-learning

Learn representations by pulling similar samples together and pushing dissimilar ones apart

Btrfs: Modern Copy-on-Write Filesystem/concepts/linux/btrfs-filesystem

Explore Btrfs (B-tree filesystem) - the modern Linux filesystem with built-in snapshots, RAID, compression, and advanced features for data integrity and flexibility.

Filesystem Data Integrity: Checksums, Scrubbing, and Silent Corruption Detection/concepts/linux/filesystem-integrity

Understand how modern filesystems protect against data corruption with checksums, scrubbing, and error correction. Explore integrity mechanisms through interactive visualizations.

Cross-Lingual Alignment/concepts/embeddings/cross-lingual-alignment

Align embeddings across languages for multilingual understanding

NVIDIA Device Files in /dev//concepts/gpu/nvidia-device-files

Understanding character devices, major/minor numbers, and the device file hierarchy created by NVIDIA drivers for GPU access in Linux.

Filesystem Compression: Transparent Space Savings and Performance Trade-offs/concepts/linux/filesystem-compression

Master transparent filesystem compression with algorithms, compression ratios, and performance implications. Explore compression mechanisms through interactive visualizations.

ZFS: The Ultimate Filesystem/concepts/linux/zfs-filesystem

Deep dive into ZFS (Zettabyte File System) - the most advanced filesystem with unmatched data integrity, pooled storage, snapshots, and enterprise features.

Green Threads vs OS Threads: Understanding Concurrency Models/concepts/python/green-threads-vs-os-threads

Deep dive into the differences between green threads (user-space threads) and OS threads (kernel threads), with interactive visualizations showing scheduling, context switching, and performance implications.

Domain Adaptation/concepts/embeddings/domain-adaptation

Adapt embeddings from source to target domains while preserving knowledge

XFS: High-Performance Filesystem/concepts/linux/xfs-filesystem

Explore XFS - the high-performance 64-bit journaling filesystem optimized for large files and parallel I/O. Learn why it excels at handling massive data workloads.

Python asyncio: Mastering Asynchronous Programming/concepts/python/asyncio-event-loop

Deep dive into Python's asyncio library, understanding event loops, coroutines, tasks, and async/await patterns with interactive visualizations.

Binary Embeddings/concepts/embeddings/binary-embeddings

Ultra-compact 1-bit representations for massive-scale retrieval

FAT32 & exFAT: Universal Filesystems/concepts/linux/fat-filesystems

Understand FAT32 and exFAT - the universal filesystems for cross-platform compatibility. Learn their limitations, use cases, and why they remain essential for removable media.

Hybrid Retrieval Systems/concepts/embeddings/hybrid-retrieval-systems

Combining sparse and dense retrieval for optimal search performance

RAID: Redundant Arrays for Speed and Safety/concepts/linux/raid-storage

Master RAID storage through interactive visualizations. Understand RAID 0, 1, 5, 6, and 10 - how they work, when to use them, and what happens during disk failures.

Memory Controllers: The Brain Behind RAM Management/concepts/memory/memory-controllers

Explore how memory controllers orchestrate data flow between CPU and RAM. Interactive visualizations of channels, ranks, banks, and the complex scheduling that maximizes memory bandwidth.

BM25 Algorithm/concepts/embeddings/bm25-algorithm

Probabilistic ranking function for information retrieval with term frequency saturation

Linux Process Management: Fork, Exec, and Beyond/concepts/linux/process-management

Master Linux process management through interactive visualizations. Understand process lifecycle, fork/exec operations, zombies, orphans, and CPU scheduling.

Linux Memory Management: Virtual Memory, Paging, and Beyond/concepts/linux/memory-management

Explore Linux memory management through interactive visualizations. Understand virtual memory, page tables, TLB, swapping, and memory allocation.

Linux System Calls: The User-Kernel Interface/concepts/linux/system-calls

Understand Linux system calls through interactive visualizations. Learn how user programs communicate with the kernel, protection rings, and syscall performance.

Linux Networking Stack: From Packets to Applications/concepts/linux/networking-stack

Master the Linux networking stack through interactive visualizations. Understand TCP/IP layers, sockets, iptables, routing, and network namespaces.

Linux Boot Process: From Power-On to Login/concepts/linux/boot-process

Understand the Linux boot process through interactive visualizations. Learn about BIOS/UEFI, bootloaders, kernel initialization, and the journey to userspace.

Linux Init Systems: From SysV to systemd/concepts/linux/init-systems

Compare Linux init systems through interactive visualizations. Understand the evolution from SysV Init to systemd, service management, and boot orchestration.

Linux Kernel Modules: Extending the Kernel at Runtime/concepts/linux/kernel-modules

Master Linux kernel modules through interactive visualizations. Learn how to load, unload, develop, and debug kernel modules that extend Linux functionality.

Wayland vs X11: Modern Display Server Architecture/concepts/linux/wayland-x11

Understand the fundamental differences between X11 and Wayland display servers. Learn about architecture, performance, security, and why Wayland represents the future of Linux graphics.

Understanding nvidia-modeset: Kernel Mode-Setting for NVIDIA GPUs/concepts/linux/nvidia-modeset

Deep dive into nvidia-modeset, the NVIDIA kernel module that handles display mode-setting, monitor configuration, and DRM integration in Linux systems.

__slots__ Optimization/concepts/python/slots-optimization

Understanding Python's __slots__ for memory optimization and faster attribute access

Calculus for Machine Learning/concepts/math-for-ml/calculus-basics

Essential calculus concepts for understanding gradients, optimization, and backpropagation

CUDA Multi-Process Service (MPS)/concepts/gpu/cuda-mps

Understand NVIDIA CUDA Multi-Process Service (MPS), a client-server architecture that enables multiple CUDA processes to share a single GPU context for concurrent kernel execution and better utilization.

Understanding TCP/IP Protocol Stack/concepts/networking/tcp-ip

Explore the TCP/IP protocol stack, packet encapsulation, and how data travels through network layers from application to physical transmission.

Flynn's Classification: Taxonomy of Computer Architectures/concepts/performance/flynns-classification

Explore Flynn's Classification of computer architectures through interactive visualizations of SISD, SIMD, MISD, and MIMD systems.

Thread Safety: Concurrent Programming Fundamentals/concepts/cpp/thread-safety

Master thread safety concepts through interactive visualizations of race conditions, mutexes, atomic operations, and deadlock scenarios.

Binary Search Trees: Self-Balancing Data Structures/concepts/data-structures/binary-search-trees

Master binary search trees through interactive visualizations of insertions, deletions, rotations, and self-balancing algorithms like AVL and Red-Black trees.

Hash Tables: Fast Lookups with Collision Resolution/concepts/data-structures/hash-tables

Master hash tables through interactive visualizations of hash functions, collision resolution strategies, load factors, and performance characteristics.

Convolution Operation: The Foundation of CNNs/concepts/deep-learning/convolution-operation

Master the convolution operation through interactive visualizations of sliding windows, feature detection, and the mathematical mechanics behind convolutional neural networks.

Cross-Entropy Loss: The Foundation of Classification/concepts/deep-learning/cross-entropy-loss

Understand cross-entropy loss through interactive visualizations of probability distributions, gradient flow, and its connection to maximum likelihood estimation.

Dilated Convolutions: Expanding Receptive Fields Efficiently/concepts/deep-learning/dilated-convolutions

Master dilated (atrous) convolutions through interactive visualizations of dilation rates, receptive field expansion, gridding artifacts, and applications in segmentation.

Feature Pyramid Networks: Multi-Scale Feature Fusion/concepts/deep-learning/feature-pyramid-networks

Understand Feature Pyramid Networks (FPN) through interactive visualizations of top-down pathways, lateral connections, and multi-scale object detection.

Receptive Field: Understanding CNN Vision/concepts/deep-learning/receptive-field

Explore how receptive fields grow through CNN layers with interactive visualizations of effective vs theoretical fields, architecture comparisons, and pixel contributions.

VAE Latent Space: Understanding Variational Autoencoders/concepts/deep-learning/vae-latent-space

Explore the latent space of Variational Autoencoders through interactive visualizations of encoding, decoding, interpolation, and the reparameterization trick.

Virtual Memory & TLB: Complete Guide to Address Translation/concepts/memory/virtual-memory

Comprehensive guide to virtual memory and TLB with interactive visualizations. Explore page tables, address translation, TLB mechanics, page faults, and performance optimization.

CPU Pipelines & Branch Prediction: Modern Processor Architecture/concepts/performance/cpu-pipelines

Explore CPU pipeline stages, instruction-level parallelism, pipeline hazards, and branch prediction through interactive visualizations.

Hazard Detection: Pipeline Dependencies and Solutions/concepts/performance/hazard-detection

Master pipeline hazards through interactive visualizations of data dependencies, control hazards, structural conflicts, and advanced detection mechanisms.

CPU Cache Lines: The Unit of Memory Transfer/concepts/memory/cpu-cache-lines

Explore how CPU cache lines work, understand spatial locality, and see why memory access patterns dramatically impact performance through interactive visualizations.

Contrastive Loss Functions/concepts/losses/contrastive-loss

Master contrastive loss functions including InfoNCE, NT-Xent, and Triplet Loss for representation learning and self-supervised training.

Focal Loss for Imbalanced Classification/concepts/losses/focal-loss

Master focal loss, the loss function that addresses extreme class imbalance by down-weighting easy examples and focusing training on hard ones.

KL Divergence/concepts/losses/kl-divergence

Understand Kullback-Leibler divergence, the fundamental measure of difference between probability distributions used in VAEs, information theory, and model compression.

Dropout Regularization/concepts/regularization/dropout

Master dropout, the powerful regularization technique that prevents overfitting by randomly deactivating neurons during training, creating an ensemble of sub-networks.

Context Windows: The Memory Limits of LLMs/concepts/llms/context-windows

Interactive visualization of context window mechanisms in LLMs - sliding windows, expanding contexts, and attention patterns that define what models can "remember".

Flash Attention: IO-Aware Exact Attention/concepts/llms/flash-attention

Interactive visualization of Flash Attention - the breakthrough algorithm that makes attention memory-efficient through tiling, recomputation, and kernel fusion.

KV Cache: The Secret to Fast LLM Inference/concepts/llms/kv-cache

Interactive visualization of key-value caching in LLMs - how caching transformer attention states enables efficient text generation without quadratic recomputation.

Linear Algebra Fundamentals/concepts/math-for-ml/linear-algebra

Essential linear algebra concepts for machine learning with interactive visualizations

Memory Access Patterns: Sequential vs Strided/concepts/memory/memory-access-patterns

Understand how different memory access patterns impact cache performance, prefetcher efficiency, and overall application speed through interactive visualizations.

Memory Interleaving: Parallel Memory Access/concepts/memory/memory-interleaving

Understand how memory interleaving distributes addresses across multiple banks to enable parallel access, dramatically improving memory bandwidth in modern systems from DDR5 to GPU memory.

NUMA Architecture: Non-Uniform Memory Access/concepts/memory/numa-architecture

Explore NUMA (Non-Uniform Memory Access) architecture, understanding how modern multi-socket systems manage memory locality and the performance implications of local vs remote memory access.

Understanding NVIDIA Kubernetes GPU Operator/concepts/gpu/kubernetes-operator

Explore how the NVIDIA GPU Operator automates GPU infrastructure management in Kubernetes, transforming manual GPU setup into a declarative, cloud-native system.

Understanding CUDA Contexts/concepts/gpu/cuda-context

Explore the concept of CUDA contexts, their role in managing GPU resources, and how they enable parallel execution across multiple CPU threads.

CLS Token in Vision Transformers/concepts/attention/cls-token

Learn how the CLS token acts as a global information aggregator in Vision Transformers, enabling whole-image classification through attention mechanisms.

Hierarchical Attention in Vision Transformers/concepts/attention/hierarchical-attention

Explore how hierarchical attention lets Vision Transformers (ViT) build multi-scale image representations, attending locally at fine resolution and more globally at coarser stages.

Multi-Head Attention in Vision Transformers/concepts/attention/multihead-attention

Explore how multi-head attention lets Vision Transformers (ViT) attend to several kinds of relationships between image patches in parallel.

Positional Embeddings in Vision Transformers/concepts/attention/positional-embeddings-vit

Explore how positional embeddings let Vision Transformers (ViT) keep track of where each image patch sits, since attention on its own ignores patch order.

Interactive Look: Self-Attention in Vision Transformers/concepts/attention/self-attention-vit

Interactively explore how self-attention allows Vision Transformers (ViT) to understand images by capturing global context. Click, explore, and see how it differs from CNNs.

Transparent Huge Pages (THP): Reducing TLB Pressure/concepts/memory/transparent-huge-pages

Deep dive into Transparent Huge Pages (THP), a Linux kernel feature that automatically promotes 4KB pages to 2MB huge pages. Learn how THP reduces TLB misses and page table overhead to improve performance, and what it costs in memory bloat and latency spikes.

ALiBi: Attention with Linear Biases/concepts/attention/alibi

Understand ALiBi, the position encoding method that adds linear biases to attention scores, enabling exceptional length extrapolation without position embeddings.

MHA vs GQA vs MQA: Choosing the Right Attention/concepts/attention/attention-comparison

Compare Multi-Head, Grouped-Query, and Multi-Query Attention mechanisms to understand their trade-offs and choose the optimal approach for your use case.

Attention Sinks: Stable Streaming LLMs/concepts/attention/attention-sinks

Understand attention sinks, the phenomenon where LLMs concentrate attention on initial tokens, and how preserving them enables infinite-length streaming inference.

Cross-Attention: Bridging Different Modalities/concepts/attention/cross-attention

Understand cross-attention, the mechanism that enables transformers to align and fuse information from different sources, sequences, or modalities.

Grouped-Query Attention (GQA)/concepts/attention/grouped-query-attention

Learn how Grouped-Query Attention balances the quality of Multi-Head Attention with the efficiency of Multi-Query Attention, enabling faster inference in large language models.

Linear Attention Approximations/concepts/attention/linear-attention-approximations

Explore linear complexity attention mechanisms including Performer, Linformer, and other efficient transformers that scale to very long sequences.

Masked and Causal Attention/concepts/attention/masked-attention

Learn how masked attention enables autoregressive generation and prevents information leakage in transformers, essential for language models and sequential generation.

Multi-Query Attention (MQA)/concepts/attention/multi-query-attention

Understand Multi-Query Attention, the radical efficiency optimization that shares keys and values across all attention heads, enabling massive memory savings for inference.

Rotary Position Embeddings (RoPE)/concepts/attention/rotary-position-embeddings

Understand Rotary Position Embeddings, the elegant position encoding method that encodes relative positions through rotation matrices, used in LLaMA, GPT-NeoX, and most modern LLMs.

Scaled Dot-Product Attention/concepts/attention/scaled-dot-product

Master the fundamental building block of transformers - scaled dot-product attention. Learn why scaling is crucial and how the mechanism enables parallel computation.

Sliding Window Attention/concepts/attention/sliding-window-attention

Learn how Sliding Window Attention enables efficient processing of long sequences by limiting attention to local context windows, used in Mistral and Longformer.

Sparse Attention Patterns/concepts/attention/sparse-attention-patterns

Explore sparse attention mechanisms that reduce quadratic complexity to linear or sub-quadratic, enabling efficient processing of long sequences.

He/Kaiming Initialization/concepts/deep-learning/he-initialization

Master He (Kaiming) initialization, the weight initialization technique designed for ReLU networks that helps prevent vanishing gradients in deep neural architectures.

Xavier/Glorot Initialization/concepts/deep-learning/xavier-initialization

Understand Xavier (Glorot) initialization, the weight initialization technique that maintains signal variance across layers for stable deep network training.

MSE and MAE Loss Functions/concepts/losses/mse-mae

Understand Mean Squared Error (MSE) and Mean Absolute Error (MAE), the fundamental loss functions for regression tasks with different sensitivity to outliers.

SoA vs AoS: Data Layout Optimization/concepts/performance/soa-vs-aos

Master Structure of Arrays (SoA) vs Array of Structures (AoS) data layouts for optimal cache efficiency, SIMD vectorization, and GPU memory coalescing with interactive visualizations.

Understanding NVIDIA Persistence Daemon/concepts/gpu/nvidia-persistence-daemon

Eliminating GPU initialization latency through nvidia-persistenced - a userspace daemon that maintains GPU driver state for optimal startup performance.

ANN Algorithms Comparison/concepts/embeddings/ann-comparison

Compare all approximate nearest neighbor algorithms side-by-side: HNSW, IVF-PQ, LSH, Annoy, and ScaNN. Find the best approach for your use case.

HNSW: Hierarchical Navigable Small World/concepts/embeddings/hnsw-search

Interactive visualization of HNSW - the graph-based algorithm that powers modern vector search with logarithmic complexity.

Vector Index Structures/concepts/embeddings/index-structures

Explore the fundamental data structures powering vector databases: trees, graphs, hash tables, and hybrid approaches for efficient similarity search.

IVF-PQ: Inverted File with Product Quantization/concepts/embeddings/ivf-pq

Learn how IVF-PQ combines clustering and compression to enable billion-scale vector search with minimal memory footprint.

LSH: Locality Sensitive Hashing/concepts/embeddings/lsh-search

Explore how LSH uses probabilistic hash functions to find similar vectors in sub-linear time, perfect for streaming and high-dimensional data.

Vector Quantization Techniques/concepts/embeddings/vector-quantization

Master vector compression techniques from scalar to product quantization. Learn how to reduce memory usage by 10-100× while preserving search quality.

Adaptive Tiling: Efficient Visual Token Generation/concepts/deep-learning/adaptive-tiling

Understanding adaptive tiling in vision transformers - a technique that dynamically adjusts image partitioning based on complexity to optimize token usage while preserving detail.

Emergent Abilities: When AI Suddenly "Gets It"/concepts/deep-learning/emergent-abilities

Understanding emergent abilities in large language models - sudden capabilities that appear at scale thresholds, from arithmetic to reasoning and self-reflection.

Prompt Engineering: Guiding AI Through Language/concepts/deep-learning/prompt-engineering

Master the art of prompt engineering - from basic composition to advanced techniques like Chain-of-Thought and Tree-of-Thoughts.

Prompt Influence Flow: How Instructions Propagate Through Model Layers/concepts/deep-learning/prompt-influence-flow

Deep dive into how different prompt components influence model behavior across transformer layers, from surface patterns to abstract reasoning.

Neural Scaling Laws: The Mathematics of Model Performance/concepts/deep-learning/scaling-laws

Understanding neural scaling laws - the power law relationships between model size, data, compute, and performance that govern AI capabilities and guide development decisions.

Visual Complexity Analysis: Smart Image Processing/concepts/deep-learning/visual-complexity-analysis

Understanding how AI models analyze visual complexity to optimize processing - measuring entropy, edge density, saliency, and texture for intelligent resource allocation.

Cross-Encoder vs Bi-Encoder/concepts/embeddings/cross-encoder-vs-bi-encoder

Understand the fundamental differences between independent and joint encoding architectures for neural retrieval systems.

Dense Embeddings Space Explorer/concepts/embeddings/dense-embeddings

Interactive visualization of high-dimensional vector spaces, word relationships, and semantic arithmetic operations.

Matryoshka Embeddings/concepts/embeddings/matryoshka-embeddings

Learn about nested representations that enable flexible dimension reduction without retraining models.

Multi-Vector Late Interaction/concepts/embeddings/multi-vector-late-interaction

Explore ColBERT and other multi-vector retrieval models that use fine-grained token-level matching for superior search quality.

Quantization Effects Simulator/concepts/embeddings/quantization-effects

Explore memory-accuracy trade-offs in embedding quantization from float32 to binary representations.

Sparse vs Dense Embeddings/concepts/embeddings/sparse-vs-dense

Compare lexical (BM25/TF-IDF) and semantic (BERT) retrieval approaches, understanding their trade-offs and hybrid strategies.

Tokenization: Converting Text to Numbers/concepts/llms/tokenization

Interactive exploration of tokenization methods in LLMs - BPE, SentencePiece, and WordPiece. Understand how text becomes tokens that models can process.

The Vision-Language Alignment Problem/concepts/multimodal/alignment-problem

Exploring the challenge of aligning visual and textual representations in multimodal AI systems.

The Modality Gap/concepts/multimodal/modality-gap

Understanding the fundamental separation between visual and textual representations in multimodal models.

Multimodal Scaling Laws/concepts/multimodal/scaling-laws

Understanding how vision-language models scale with data, parameters, and compute following empirical power laws.

Vision-Language Adapters: Parameter-Efficient Multimodal Fine-tuning/concepts/multimodal/vision-language-adapters

Exploring LoRA, adapters, and other parameter-efficient methods for fine-tuning large vision-language models.

Client-Server Communication: Polling vs WebSockets/concepts/networking/client-server-communication

Understanding different client-server communication patterns - from simple polling to real-time WebSocket connections.

Long Polling: The Patient Connection/concepts/networking/long-polling

Understanding long polling - an efficient approach where the server holds requests open until data is available.

Protocol Comparison: Choosing the Right Communication Pattern/concepts/networking/protocol-comparison

Side-by-side comparison of Short Polling, Long Polling, and WebSockets to help you choose the right protocol for your application.

Short Polling: The Impatient Client Pattern/concepts/networking/short-polling

Understanding short polling - a simple but inefficient approach to fetching data at regular intervals.

WebSockets: Real-Time Bidirectional Communication/concepts/networking/websocket

Understanding WebSockets - the protocol that enables full-duplex communication channels over a single TCP connection.

C++ AST & Parsing/concepts/cpp/ast-parsing

Explore how C++ code is parsed into an Abstract Syntax Tree with interactive visualizations.

C++ Compilation Overview/concepts/cpp/compilation

Understand the complete C++ compilation pipeline from source code to object files.

Design Patterns in C++/concepts/cpp/design-patterns

Learn classic design patterns implemented in modern C++. Explore Singleton, Observer, Factory, and Strategy patterns with interactive examples.

C++ Dynamic Linking/concepts/cpp/dynamic-linking

Master dynamic linking and runtime library loading with interactive visualizations.

C++ Linking Overview/concepts/cpp/linking

Understand how object files are linked together to create executables.

C++ Program Loading/concepts/cpp/loading

Understand how C++ programs are loaded and executed by the operating system.

Memory Management & RAII in C++/concepts/cpp/memory-raii

Learn Resource Acquisition Is Initialization (RAII) - the cornerstone of C++ memory management. Understand automatic resource cleanup and exception safety.

Modern C++ Features (C++11 and Beyond)/concepts/cpp/modern-cpp-features

Explore modern C++ features including auto, lambdas, ranges, and coroutines. Learn how C++11/14/17/20 transformed the language.

Object-Oriented Programming in C++/concepts/cpp/oop-inheritance

Master C++ OOP concepts including inheritance, polymorphism, virtual functions, and modern object-oriented design principles with interactive examples.

C++ Compiler Optimization/concepts/cpp/optimization

Discover how compilers optimize your C++ code through various transformation techniques with interactive demos.

Pointers & References in C++/concepts/cpp/pointers-references

Master C++ pointers and references through interactive visualizations. Learn memory addressing, dereferencing, and smart pointers, and avoid common pitfalls.

C++ Preprocessor/concepts/cpp/preprocessor

Master the C++ preprocessor with interactive visualizations of macros, includes, and conditional compilation.

Smart Pointers in Modern C++/concepts/cpp/smart-pointers

Master C++11 smart pointers through interactive examples. Learn unique_ptr, shared_ptr, and weak_ptr with reference counting visualizations.

C++ Stack vs Heap/concepts/cpp/stack-heap

Understand stack and heap memory allocation with interactive visualizations.

C++ Symbol Resolution/concepts/cpp/symbol-resolution

Learn how the linker resolves symbols and how to fix undefined-reference errors, with interactive visualizations.

Templates & STL in C++/concepts/cpp/templates-stl

Master C++ templates and the Standard Template Library. Learn generic programming, template metaprogramming, and STL containers and algorithms.

Gradient Flow in Deep Networks/concepts/deep-learning/gradient-flow

Understanding how gradients propagate through deep neural networks and the vanishing/exploding gradient problems.

NCCL: High-Performance Multi-GPU Communication/concepts/gpu/nccl-communication

Understanding NVIDIA's Collective Communications Library for distributed deep learning and multi-GPU training

Eigenvalues & Eigenvectors/concepts/math-for-ml/eigenvalues-eigenvectors

Visualize eigenvalues and eigenvectors - key concepts for PCA, spectral methods, and matrix analysis.

Gradient Descent/concepts/math-for-ml/gradient-descent

Visualize gradient descent optimization - how neural networks learn by following gradients.

Vectors & Matrices/concepts/math-for-ml/vectors-matrices

Understand vectors and matrices - the fundamental data structures in machine learning.

Visual Complexity Analysis for Token Allocation/concepts/computer-vision/visual-complexity-analysis

Advanced framework for intelligent token allocation in vision transformers based on visual complexity metrics

Tensor Cores: Accelerating Deep Learning/concepts/gpu/tensor-cores

Understanding NVIDIA's specialized matrix multiplication hardware for AI workloads

Layer Normalization/concepts/deep-learning/layer-normalization

Understanding layer normalization, a technique that normalizes inputs across features, making it ideal for sequence models and transformers.

Internal Covariate Shift/concepts/deep-learning/internal-covariate-shift

Understanding the distribution shift problem in deep neural networks that batch normalization was designed to address.

Batch Normalization/concepts/deep-learning/batch-normalization

Understanding batch normalization, a technique that normalizes layer inputs to accelerate training and improve neural network performance.

Skip Connections/concepts/deep-learning/skip-connections

Understanding skip connections, residual blocks, and their crucial role in training deep neural networks.

C++ Virtual Tables & Inheritance/concepts/cpp/virtual-tables-inheritance

Deep dive into C++ virtual tables (vtables), virtual dispatch mechanism, inheritance types, and object memory layout

CPU Performance & Optimization/concepts/performance/cpu-optimization

Understanding CPU cycles, memory hierarchy, cache optimization, and performance analysis techniques

Graph Attention Networks (GAT)/concepts/graph/graph-attention-networks

Adaptive attention-based aggregation for graph neural networks - multi-head attention, learned weights, and interpretable graph learning

Graph Centrality & Metrics/concepts/graph/graph-centrality

Understanding node importance through centrality measures, shortest paths, hop distances, clustering coefficients, and fundamental graph metrics

Graph Convolutional Networks (GCN)/concepts/graph/graph-convolutional-networks

Deep dive into Graph Convolutional Networks - spectral graph theory, message passing, aggregation mechanisms, and applications in node classification and graph learning

Graph Embeddings/concepts/graph/graph-embeddings

Learning low-dimensional vector representations of graphs through random walks, DeepWalk, Node2Vec, and skip-gram models

Graph Pooling Methods/concepts/graph/graph-pooling

Hierarchical graph coarsening techniques - TopK, SAGPool, DiffPool, and readout operations for graph-level representations

Mixture of Experts (MoE)/concepts/llm/mixture-of-experts

Understanding sparse mixture of experts models - architecture, routing mechanisms, load balancing, and efficient scaling strategies for large language models

GPU Streaming Multiprocessor (SM)/concepts/gpu/shared-multiprocessor

Deep dive into the fundamental processing unit of modern GPUs - the Streaming Multiprocessor architecture, execution model, and memory hierarchy

Uses/uses

Tools, software, and hardware I use

Resume/resume

My professional experience and qualifications

Bookmarks/bookmarks

A curated collection of articles and resources I find valuable

Consulting/consulting

Services and consulting offerings

Thank You/thank-you

Confirmation page after form submissions

Sitemap/sitemap

Visual representation of the site structure