Sitemap

A visual representation of the site structure to help you navigate through the content.

Site Structure

Main landing page with introduction and recent articles

About/about

Learn more about me, my background, and expertise

Speaking/speaking

My talks, presentations, and speaking engagements

Articles/articles

Collection of articles I've written on various topics

H264 implementation applications/articles/h264-implementation-applications

Article content

H264 transform quantization/articles/h264-transform-quantization

Article content

H264 fundamentals/articles/h264-fundamentals

Article content

Zettel/articles/zettel

Article content

Compiling pytorch kernel/articles/compiling-pytorch-kernel

Article content

View size not compatible/articles/view-size-not-compatible

Article content

Gpu boot errors/articles/gpu-boot-errors

Article content

H264 interactive guide/articles/h264-interactive-guide

Article content

Ggml structure/articles/ggml-structure

Article content

Quantization deep dive/articles/quantization-deep-dive

Article content

How tensorrt works/articles/how-tensorrt-works

Article content

Kernel fusion/articles/kernel-fusion

Article content

Visualizing yolov5/articles/visualizing-yolov5

Article content

Cpython internals/articles/cpython-internals

Article content

Cpp compilation process/articles/cpp-compilation-process

Article content

Cpp loading runtime/articles/cpp-loading-runtime

Article content

Cpp linking in depth/articles/cpp-linking-in-depth

Article content

Registry pattern/articles/registry-pattern

Article content

Magic numbers/articles/magic-numbers

Article content

Image encoding/articles/image-encoding

Article content

Text encoding/articles/text-encoding

Article content

Papers/papers

Research papers and publications

Visual instruction tuning/papers/visual-instruction-tuning

Paper content

Vit object detection/papers/vit-object-detection

Paper content

Yolo/papers/yolo

Paper content

Efficientnet/papers/efficientnet

Paper content

Faster rcnn/papers/faster-rcnn

Paper content

Sam/papers/sam

Paper content

DETR/papers/DETR

Paper content

Blip2/papers/blip2

Paper content

Image worth 16x16/papers/image-worth-16x16

Paper content

Optimizing transformer inference/papers/optimizing-transformer-inference

Paper content

Surf/papers/surf

Paper content

Swin transformer/papers/swin-transformer

Paper content

Clip/papers/clip

Paper content

Deeplearning go brr/papers/deeplearning-go-brr

Paper content

Attention is all you need/papers/attention-is-all-you-need

Paper content

Data movement transformer/papers/data-movement-transformer

Paper content

Deep residual learning/papers/deep-residual-learning

Paper content

Concepts/concepts

Interactive explanations of machine learning concepts

initramfs: The Initial RAM Filesystem Explained/concepts/linux/initramfs-boot-process

Deep dive into initramfs (initial RAM filesystem), understanding the Linux boot process, early userspace, and how the kernel transitions from boot to the real root filesystem.

Linux Kernel Architecture: Core Subsystems Deep Dive/concepts/linux/kernel-architecture

Master the Linux kernel architecture through interactive visualizations. Explore kernel layers, memory management, process scheduling, VFS, and the complete boot process.

Explore the inner workings of RAM through beautiful animations and interactive visualizations. Understand memory cells, addressing, and the memory hierarchy.

Python Bytecode Compilation/concepts/python/bytecode-compilation

How Python compiles source code to bytecode and executes it

High Bandwidth Memory (HBM)/concepts/gpu/hbm-memory

3D-stacked DRAM architecture providing massive bandwidth for GPUs and AI accelerators

GPU Memory Hierarchy & Optimization/concepts/gpu/memory-hierarchy

Master the GPU memory hierarchy from registers to global memory, and understand coalescing patterns, bank conflicts, and optimization strategies for maximum performance

Filesystems: The Digital DNA of Data Storage/concepts/linux/filesystems-overview

Explore how filesystems work through beautiful interactive visualizations. From the VFS abstraction layer to modern CoW filesystems like Btrfs and ZFS. Understand the magic behind storing and retrieving your data.

Python Memory Management/concepts/python/memory-management

How CPython manages memory with PyMalloc, object pools, and reference counting

CUDA Unified Memory/concepts/gpu/unified-memory

Unified virtual address space enabling seamless CPU-GPU memory sharing with automatic page migration

Discover inodes through interactive visualizations—the invisible data structures that track everything about your files except their names. Learn why running out of inodes means no new files, even with free space!

Global Interpreter Lock (GIL)/concepts/python/global-interpreter-lock

Understanding Python's GIL, its impact on multithreading, and workarounds

Page Migration & Fault Handling/concepts/gpu/page-migration

Understanding virtual memory page migration, fault handling, and TLB management in CPU-GPU systems

FUSE: Filesystem in Userspace Explained/concepts/linux/fuse-filesystem

Understand FUSE (Filesystem in Userspace) - the framework that lets you implement filesystems without writing kernel code. Learn how NTFS, SSHFS, and cloud storage work on Linux.

Python Object Model/concepts/python/object-model

Understanding PyObject, type system, and how Python objects work internally

ext4: The Linux Workhorse Filesystem/concepts/linux/ext4-filesystem

Deep dive into ext4 (fourth extended filesystem) - the default filesystem for most Linux distributions. Learn about journaling, extents, and why ext4 remains the reliable choice.

Python Garbage Collection/concepts/python/garbage-collection

How Python handles memory cleanup with reference counting and cyclic GC

NTFS on Linux: Windows Filesystem Support/concepts/linux/ntfs-filesystem

Understand NTFS (New Technology File System) and how Linux supports it through the NTFS-3G FUSE driver. Learn about the MFT, alternate data streams, and cross-platform challenges.

CPU Pipeline Architecture/concepts/performance/cpu-pipeline-detailed

Comprehensive exploration of CPU pipeline stages, hazards, superscalar, and out-of-order execution

Python Optimization Techniques/concepts/python/python-optimization

Performance optimization strategies and CPython optimizations

Contrastive Learning/concepts/embeddings/contrastive-learning

Learn representations by pulling similar samples together and pushing dissimilar ones apart

Btrfs: Modern Copy-on-Write Filesystem/concepts/linux/btrfs-filesystem

Explore Btrfs (B-tree filesystem) - the modern Linux filesystem with built-in snapshots, RAID, compression, and advanced features for data integrity and flexibility.

Cross-Lingual Alignment/concepts/embeddings/cross-lingual-alignment

Align embeddings across languages for multilingual understanding

ZFS: The Ultimate Filesystem/concepts/linux/zfs-filesystem

Deep dive into ZFS (Zettabyte File System) - the most advanced filesystem with unmatched data integrity, pooled storage, snapshots, and enterprise features.

Green Threads vs OS Threads: Understanding Concurrency Models/concepts/python/green-threads-vs-os-threads

Deep dive into the differences between green threads (user-space threads) and OS threads (kernel threads), with interactive visualizations showing scheduling, context switching, and performance implications.

Domain Adaptation/concepts/embeddings/domain-adaptation

Adapt embeddings from source to target domains while preserving knowledge

XFS: High-Performance Filesystem/concepts/linux/xfs-filesystem

Explore XFS - the high-performance 64-bit journaling filesystem optimized for large files and parallel I/O. Learn why it excels at handling massive data workloads.

Python asyncio: Mastering Asynchronous Programming/concepts/python/asyncio-event-loop

Deep dive into Python's asyncio library, understanding event loops, coroutines, tasks, and async/await patterns with interactive visualizations.

Binary Embeddings/concepts/embeddings/binary-embeddings

Ultra-compact 1-bit representations for massive-scale retrieval

FAT32 & exFAT: Universal Filesystems/concepts/linux/fat-filesystems

Understand FAT32 and exFAT - the universal filesystems for cross-platform compatibility. Learn their limitations, use cases, and why they remain essential for removable media.

TLB: How CPUs Translate Virtual to Physical Memory/concepts/memory/tlb-translation-lookaside-buffer

Deep dive into Translation Lookaside Buffers - the critical cache that makes virtual memory fast. Interactive visualizations of address translation, page walks, and TLB management.

Hybrid Retrieval Systems/concepts/embeddings/hybrid-retrieval-systems

Combining sparse and dense retrieval for optimal search performance

Master RAID storage through interactive visualizations. Understand RAID 0, 1, 5, 6, and 10 - how they work, when to use them, and what happens during disk failures.

Memory Controllers: The Brain Behind RAM Management/concepts/memory/memory-controllers

Explore how memory controllers orchestrate data flow between CPU and RAM. Interactive visualizations of channels, ranks, banks, and the complex scheduling that maximizes memory bandwidth.

BM25 Algorithm/concepts/embeddings/bm25-algorithm

Probabilistic ranking function for information retrieval with term frequency saturation

Linux Process Management: Fork, Exec, and Beyond/concepts/linux/process-management

Master Linux process management through interactive visualizations. Understand process lifecycle, fork/exec operations, zombies, orphans, and CPU scheduling.

Explore Linux memory management through interactive visualizations. Understand virtual memory, page tables, TLB, swapping, and memory allocation.

Understand Linux system calls through interactive visualizations. Learn how user programs communicate with the kernel, protection rings, and syscall performance.

Master the Linux networking stack through interactive visualizations. Understand TCP/IP layers, sockets, iptables, routing, and network namespaces.

Linux Boot Process: From Power-On to Login/concepts/linux/boot-process

Understand the Linux boot process through interactive visualizations. Learn about BIOS/UEFI, bootloaders, kernel initialization, and the journey to userspace.

Linux Init Systems: From SysV to systemd/concepts/linux/init-systems

Compare Linux init systems through interactive visualizations. Understand the evolution from SysV Init to systemd, service management, and boot orchestration.

Master Linux kernel modules through interactive visualizations. Learn how to load, unload, develop, and debug kernel modules that extend Linux functionality.

__slots__ Optimization/concepts/python/slots-optimization

Understanding Python's __slots__ for memory optimization and faster attribute access

Calculus for Machine Learning/concepts/math-for-ml/calculus-basics

Essential calculus concepts for understanding gradients, optimization, and backpropagation

Flynn's Classification: Taxonomy of Computer Architectures/concepts/performance/flynns-classification

Explore Flynn's Classification of computer architectures through interactive visualizations of SISD, SIMD, MISD, and MIMD systems.

Master thread safety concepts through interactive visualizations of race conditions, mutexes, atomic operations, and deadlock scenarios.

Binary Search Trees: Self-Balancing Data Structures/concepts/data-structures/binary-search-trees

Master binary search trees through interactive visualizations of insertions, deletions, rotations, and self-balancing algorithms like AVL and Red-Black trees.

Hash Tables: Fast Lookups with Collision Resolution/concepts/data-structures/hash-tables

Master hash tables through interactive visualizations of hash functions, collision resolution strategies, load factors, and performance characteristics.

Convolution Operation: The Foundation of CNNs/concepts/deep-learning/convolution-operation

Master the convolution operation through interactive visualizations of sliding windows, feature detection, and the mathematical mechanics behind convolutional neural networks.

Cross-Entropy Loss: The Foundation of Classification/concepts/deep-learning/cross-entropy-loss

Understand cross-entropy loss through interactive visualizations of probability distributions, gradient flow, and its connection to maximum likelihood estimation.

Dilated Convolutions: Expanding Receptive Fields Efficiently/concepts/deep-learning/dilated-convolutions

Master dilated (atrous) convolutions through interactive visualizations of dilation rates, receptive field expansion, gridding artifacts, and applications in segmentation.

Feature Pyramid Networks: Multi-Scale Feature Fusion/concepts/deep-learning/feature-pyramid-networks

Understand Feature Pyramid Networks (FPN) through interactive visualizations of top-down pathways, lateral connections, and multi-scale object detection.

Receptive Field: Understanding CNN Vision/concepts/deep-learning/receptive-field

Explore how receptive fields grow through CNN layers with interactive visualizations of effective vs theoretical fields, architecture comparisons, and pixel contributions.

VAE Latent Space: Understanding Variational Autoencoders/concepts/deep-learning/vae-latent-space

Explore the latent space of Variational Autoencoders through interactive visualizations of encoding, decoding, interpolation, and the reparameterization trick.

Explore virtual memory management through interactive visualizations of page tables, TLB operations, page faults, and memory mapping.

Explore CPU pipeline stages, instruction-level parallelism, pipeline hazards, and branch prediction through interactive visualizations.

Hazard Detection: Pipeline Dependencies and Solutions/concepts/performance/hazard-detection

Master pipeline hazards through interactive visualizations of data dependencies, control hazards, structural conflicts, and advanced detection mechanisms.

CPU Cache Lines: The Unit of Memory Transfer/concepts/memory/cpu-cache-lines

Explore how CPU cache lines work, understand spatial locality, and see why memory access patterns dramatically impact performance through interactive visualizations.

Memory Access Patterns: Sequential vs Strided/concepts/memory/memory-access-patterns

Understand how different memory access patterns impact cache performance, prefetcher efficiency, and overall application speed through interactive visualizations.

NUMA Architecture: Non-Uniform Memory Access/concepts/memory/numa-architecture

Explore NUMA (Non-Uniform Memory Access) architecture, understanding how modern multi-socket systems manage memory locality and the performance implications of local vs remote memory access.

Understanding CUDA Contexts/concepts/gpu/cuda-context

Explore the concept of CUDA contexts, their role in managing GPU resources, and how they enable parallel execution across multiple CPU threads.

CLS Token in Vision Transformers/concepts/attention/cls-token

Learn how the CLS token acts as a global information aggregator in Vision Transformers, enabling whole-image classification through attention mechanisms.

Hierarchical Attention in Vision Transformers/concepts/attention/hierarchical-attention

Explore how hierarchical attention enables Vision Transformers (ViT) to build multi-scale representations by attending within local regions and merging them across stages.

Multi-Head Attention in Vision Transformers/concepts/attention/multihead-attention

Explore how multi-head attention enables Vision Transformers (ViT) to attend to several representation subspaces in parallel.

Positional Embeddings in Vision Transformers/concepts/attention/positional-embeddings-vit

Explore how positional embeddings enable Vision Transformers (ViT) to process sequences of image patches by encoding each patch's position.

Interactive Look: Self-Attention in Vision Transformers/concepts/attention/self-attention-vit

Interactively explore how self-attention allows Vision Transformers (ViT) to understand images by capturing global context. Click, explore, and see how it differs from CNNs.

Cross-Attention: Bridging Different Modalities/concepts/attention/cross-attention

Understand cross-attention, the mechanism that enables transformers to align and fuse information from different sources, sequences, or modalities.

Masked and Causal Attention/concepts/attention/masked-attention

Learn how masked attention enables autoregressive generation and prevents information leakage in transformers, essential for language models and sequential generation.

Scaled Dot-Product Attention/concepts/attention/scaled-dot-product

Master the fundamental building block of transformers - scaled dot-product attention. Learn why scaling is crucial and how the mechanism enables parallel computation.

ANN Algorithms Comparison/concepts/embeddings/ann-comparison

Compare all approximate nearest neighbor algorithms side-by-side: HNSW, IVF-PQ, LSH, Annoy, and ScaNN. Find the best approach for your use case.

HNSW: Hierarchical Navigable Small World/concepts/embeddings/hnsw-search

Interactive visualization of HNSW - the graph-based algorithm that powers modern vector search with logarithmic complexity.

Vector Index Structures/concepts/embeddings/index-structures

Explore the fundamental data structures powering vector databases: trees, graphs, hash tables, and hybrid approaches for efficient similarity search.

Learn how IVF-PQ combines clustering and compression to enable billion-scale vector search with minimal memory footprint.

LSH: Locality Sensitive Hashing/concepts/embeddings/lsh-search

Explore how LSH uses probabilistic hash functions to find similar vectors in sub-linear time, perfect for streaming and high-dimensional data.

Vector Quantization Techniques/concepts/embeddings/vector-quantization

Master vector compression techniques from scalar to product quantization. Learn how to reduce memory usage by 10-100× while preserving search quality.

Adaptive Tiling: Efficient Visual Token Generation/concepts/deep-learning/adaptive-tiling

Understanding adaptive tiling in vision transformers - a technique that dynamically adjusts image partitioning based on complexity to optimize token usage while preserving detail.

Emergent Abilities: When AI Suddenly "Gets It"/concepts/deep-learning/emergent-abilities

Understanding emergent abilities in large language models - sudden capabilities that appear at scale thresholds, from arithmetic to reasoning and self-reflection.

Prompt Engineering: Guiding AI Through Language/concepts/deep-learning/prompt-engineering

Master the art of prompt engineering - from basic composition to advanced techniques like Chain-of-Thought and Tree-of-Thoughts.

Deep dive into how different prompt components influence model behavior across transformer layers, from surface patterns to abstract reasoning.

Understanding neural scaling laws - the power law relationships between model size, data, compute, and performance that govern AI capabilities and guide development decisions.

Visual Complexity Analysis: Smart Image Processing/concepts/deep-learning/visual-complexity-analysis

Understanding how AI models analyze visual complexity to optimize processing - measuring entropy, edge density, saliency, and texture for intelligent resource allocation.

Cross-Encoder vs Bi-Encoder/concepts/embeddings/cross-encoder-vs-bi-encoder

Understand the fundamental differences between independent and joint encoding architectures for neural retrieval systems.

Dense Embeddings Space Explorer/concepts/embeddings/dense-embeddings

Interactive visualization of high-dimensional vector spaces, word relationships, and semantic arithmetic operations.

Matryoshka Embeddings/concepts/embeddings/matryoshka-embeddings

Learn about nested representations that enable flexible dimension reduction without retraining models.

Multi-Vector Late Interaction/concepts/embeddings/multi-vector-late-interaction

Explore ColBERT and other multi-vector retrieval models that use fine-grained token-level matching for superior search quality.

Quantization Effects Simulator/concepts/embeddings/quantization-effects

Explore memory-accuracy trade-offs in embedding quantization from float32 to binary representations.

Sparse vs Dense Embeddings/concepts/embeddings/sparse-vs-dense

Compare lexical (BM25/TF-IDF) and semantic (BERT) retrieval approaches, understanding their trade-offs and hybrid strategies.

Context Windows: The Memory Limits of LLMs/concepts/llms/context-windows

Interactive visualization of context window mechanisms in LLMs - sliding windows, expanding contexts, and attention patterns that define what models can "remember".

Flash Attention: IO-Aware Exact Attention/concepts/llms/flash-attention

Interactive visualization of Flash Attention - the breakthrough algorithm that makes attention memory-efficient through tiling, recomputation, and kernel fusion.

Interactive visualization of key-value caching in LLMs - how caching transformer attention states enables efficient text generation without quadratic recomputation.

Tokenization: Converting Text to Numbers/concepts/llms/tokenization

Interactive exploration of tokenization methods in LLMs - BPE, SentencePiece, and WordPiece. Understand how text becomes tokens that models can process.

Exploring LoRA, adapters, and other parameter-efficient methods for fine-tuning large vision-language models.

Long Polling: The Patient Connection/concepts/networking/long-polling

Understanding long polling - an efficient approach where the server holds requests open until data is available.

Short Polling: The Impatient Client Pattern/concepts/networking/short-polling

Understanding short polling - a simple but inefficient approach to fetching data at regular intervals.

Eigenvalues & Eigenvectors/concepts/math-for-ml/eigenvalues-eigenvectors

Visualize eigenvalues and eigenvectors - key concepts for PCA, spectral methods, and matrix analysis.

Gradient Descent/concepts/math-for-ml/gradient-descent

Visualize gradient descent optimization - how neural networks learn by following gradients.

Linear Algebra Fundamentals/concepts/math-for-ml/linear-algebra

Essential linear algebra concepts for machine learning with interactive visualizations

Memory Interleaving: Parallel Memory Access/concepts/memory/memory-interleaving

Understand how memory interleaving distributes addresses across multiple banks to enable parallel access, dramatically improving memory bandwidth in modern systems from DDR5 to GPU memory.

The Vision-Language Alignment Problem/concepts/multimodal/alignment-problem

Exploring the challenge of aligning visual and textual representations in multimodal AI systems.

The Modality Gap/concepts/multimodal/modality-gap

Understanding the fundamental separation between visual and textual representations in multimodal models.

Multimodal Scaling Laws/concepts/multimodal/scaling-laws

Understanding how vision-language models scale with data, parameters, and compute following empirical power laws.

Client-Server Communication: Polling vs WebSockets/concepts/networking/client-server-communication

Understanding different client-server communication patterns - from simple polling to real-time WebSocket connections.

Side-by-side comparison of Short Polling, Long Polling, and WebSockets to help you choose the right protocol for your application.

Understanding WebSockets - the protocol that enables full-duplex communication channels over a single TCP connection.

C++ AST & Parsing/concepts/cpp/ast-parsing

Explore how C++ code is parsed into an Abstract Syntax Tree with interactive visualizations.

C++ Compilation Overview/concepts/cpp/compilation

Understand the complete C++ compilation pipeline from source code to object files.

Design Patterns in C++/concepts/cpp/design-patterns

Learn classic design patterns implemented in modern C++. Explore Singleton, Observer, Factory, and Strategy patterns with interactive examples.

C++ Dynamic Linking/concepts/cpp/dynamic-linking

Master dynamic linking and runtime library loading with interactive visualizations.

C++ Linking Overview/concepts/cpp/linking

Understand how object files are linked together to create executables.

C++ Program Loading/concepts/cpp/loading

Understand how C++ programs are loaded and executed by the operating system.

Memory Management & RAII in C++/concepts/cpp/memory-raii

Learn Resource Acquisition Is Initialization (RAII) - the cornerstone of C++ memory management. Understand automatic resource cleanup and exception safety.

Modern C++ Features (C++11 and Beyond)/concepts/cpp/modern-cpp-features

Explore modern C++ features including auto, lambdas, ranges, and coroutines. Learn how C++11/14/17/20 transformed the language.

Object-Oriented Programming in C++/concepts/cpp/oop-inheritance

Master C++ OOP concepts including inheritance, polymorphism, virtual functions, and modern object-oriented design principles with interactive examples.

C++ Compiler Optimization/concepts/cpp/optimization

Discover how compilers optimize your C++ code through various transformation techniques with interactive demos.

Pointers & References in C++/concepts/cpp/pointers-references

Master C++ pointers and references through interactive visualizations. Learn memory addressing, dereferencing, and smart pointers, and how to avoid common pitfalls.

C++ Preprocessor/concepts/cpp/preprocessor

Master the C++ preprocessor with interactive visualizations of macros, includes, and conditional compilation.

Smart Pointers in Modern C++/concepts/cpp/smart-pointers

Master C++11 smart pointers through interactive examples. Learn unique_ptr, shared_ptr, and weak_ptr with reference counting visualizations.

C++ Stack vs Heap/concepts/cpp/stack-heap

Understand stack and heap memory allocation with interactive visualizations.

Templates & STL in C++/concepts/cpp/templates-stl

Master C++ templates and the Standard Template Library. Learn generic programming, template metaprogramming, and STL containers and algorithms.

C++ Symbol Resolution/concepts/cpp/symbol-resolution

Learn how the linker resolves symbols and fixes undefined references with interactive visualizations.

Gradient Flow in Deep Networks/concepts/deep-learning/gradient-flow

Understanding how gradients propagate through deep neural networks and the vanishing/exploding gradient problems.

Vectors & Matrices/concepts/math-for-ml/vectors-matrices

Understand vectors and matrices - the fundamental data structures in machine learning.

Visual Complexity Analysis for Token Allocation/concepts/computer-vision/visual-complexity-analysis

Advanced framework for intelligent token allocation in vision transformers based on visual complexity metrics

Understanding NVIDIA's specialized matrix multiplication hardware for AI workloads

Layer Normalization/concepts/deep-learning/layer-normalization

Understanding layer normalization, a technique that normalizes inputs across features, making it ideal for sequence models and transformers.

Internal Covariate Shift/concepts/deep-learning/internal-covariate-shift

Understanding the distribution shift problem in deep neural networks that batch normalization solves.

Batch Normalization/concepts/deep-learning/batch-normalization

Understanding batch normalization, a technique that normalizes layer inputs to accelerate training and improve neural network performance.

Skip Connections/concepts/deep-learning/skip-connections

Understanding skip connections, residual blocks, and their crucial role in training deep neural networks.

C++ Virtual Tables & Inheritance/concepts/cpp/virtual-tables-inheritance

Deep dive into C++ virtual tables (vtables), virtual dispatch mechanism, inheritance types, and object memory layout

CPU Performance & Optimization/concepts/performance/cpu-optimization

Understanding CPU cycles, memory hierarchy, cache optimization, and performance analysis techniques

Graph Attention Networks (GAT)/concepts/graph/graph-attention-networks

Adaptive attention-based aggregation for graph neural networks - multi-head attention, learned weights, and interpretable graph learning

Graph Centrality & Metrics/concepts/graph/graph-centrality

Understanding node importance through centrality measures, shortest paths, hop distances, clustering coefficients, and fundamental graph metrics

Graph Convolutional Networks (GCN)/concepts/graph/graph-convolutional-networks

Deep dive into Graph Convolutional Networks - spectral graph theory, message passing, aggregation mechanisms, and applications in node classification and graph learning

Graph Embeddings/concepts/graph/graph-embeddings

Learning low-dimensional vector representations of graphs through random walks, DeepWalk, Node2Vec, and skip-gram models

Graph Pooling Methods/concepts/graph/graph-pooling

Hierarchical graph coarsening techniques - TopK, SAGPool, DiffPool, and readout operations for graph-level representations

Mixture of Experts (MoE)/concepts/llm/mixture-of-experts

Understanding sparse mixture of experts models - architecture, routing mechanisms, load balancing, and efficient scaling strategies for large language models

GPU Streaming Multiprocessor (SM)/concepts/gpu/shared-multiprocessor

Deep dive into the fundamental processing unit of modern GPUs - the Streaming Multiprocessor architecture, execution model, and memory hierarchy

Uses/uses

Tools, software, and hardware I use

Resume/resume

My professional experience and qualifications

Bookmarks/bookmarks

A curated collection of articles and resources I find valuable

Consulting/consulting

Services and consulting offerings

Thank You/thank-you

Confirmation page after form submissions

Sitemap/sitemap

Visual representation of the site structure
