HNSW: Hierarchical Navigable Small World

Interactive visualization of HNSW - the graph-based algorithm that powers modern vector search with logarithmic complexity.

HNSW is a state-of-the-art graph-based algorithm for approximate nearest neighbor search that achieves logarithmic search complexity with high recall rates, making it the backbone of many modern vector databases.

Interactive HNSW Explorer

Explore how HNSW builds multi-layer graphs and navigates them efficiently to find nearest neighbors:

[Interactive visualization: HNSW's multi-layer graph structure, with Layer 2 (sparse), Layer 1 (medium), and Layer 0 (dense), showing per-layer node count, edge count, and average degree.]

Why HNSW?

Traditional exact nearest neighbor search has O(N) complexity, making it impractical for large-scale applications. HNSW solves this with:

  • Logarithmic search time: O(log N)
  • High recall: 95-99% accuracy
  • Incremental updates: Add vectors without rebuilding
  • Robust performance: Works well across different data distributions
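
For contrast, exact search is just a linear scan. The short NumPy sketch below (sizes are illustrative) computes the distance from a query to every stored vector, which is exactly the per-query O(N) cost that HNSW's graph navigation avoids.

import numpy as np

def brute_force_knn(vectors, query, k):
    # O(N · d) per query: distance from the query to every stored vector
    dists = np.linalg.norm(vectors - query, axis=1)
    return np.argsort(dists)[:k]

vectors = np.random.rand(100_000, 128).astype(np.float32)
query = np.random.rand(128).astype(np.float32)
print(brute_force_knn(vectors, query, k=10))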

Core Concepts

1. Multi-Layer Architecture

HNSW constructs a hierarchy of proximity graphs. Each node is assigned a maximum layer drawn from an exponentially decaying distribution:

P(layer ≥ l) = e^(-l / m_L)

Here m_L is the layer normalization factor (typically 1/ln(2)); in practice this means:

  • Higher layers are sparser (fewer nodes)
  • Lower layers are denser (more connections)
  • Entry points start from the top layer
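
To see what this distribution looks like in practice, the short sketch below samples layer assignments with the same floor(-ln(U) · m_L) rule used in the construction pseudocode further down, with the typical m_L = 1/ln(2).

import math
import random
from collections import Counter

m_L = 1 / math.log(2)  # typical layer assignment factor

def sample_layer():
    # Exponentially decaying layer assignment used during insertion
    u = 1.0 - random.random()          # uniform in (0, 1]
    return math.floor(-math.log(u) * m_L)

counts = Counter(sample_layer() for _ in range(100_000))
for layer in sorted(counts):
    print(f"layer {layer}: {counts[layer] / 100_000:.3f}")
# With m_L = 1/ln(2): ~50% of nodes stop at layer 0, ~25% reach layer 1, ~12.5% layer 2, ...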

2. Small World Property

The "small world" phenomenon ensures short paths between any two nodes:

d_avg ∝ log N

This is achieved through:

  • Local connections: Connect to nearby neighbors
  • Long-range connections: Occasional distant links
  • Navigable structure: Greedy routing works effectively
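
This behaviour is easy to check on a classic small-world model. The sketch below uses networkx (purely as an illustration, not part of HNSW itself) to mix a ring of local connections with a few random long-range rewirings and reports the average shortest path length.

import networkx as nx

for n in (500, 2_000):
    # Ring of local connections (k neighbors each) with 10% rewired to long-range links
    g = nx.watts_strogatz_graph(n=n, k=10, p=0.1, seed=42)
    print(n, round(nx.average_shortest_path_length(g), 2))
# The average hop count stays small and grows roughly like log N, not like N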

3. Construction Algorithm

def insert_node(new_node, M, M_max, ef_construction):
    # Assign the node's top layer from an exponentially decaying distribution
    layer = floor(-ln(uniform(0, 1)) * m_L)

    # Find nearest neighbors at each layer, descending to layer 0
    ep = entry_point
    for lc in range(layer, -1, -1):
        candidates = search_layer(new_node, ep, ef_construction, lc)

        # Select M neighbors (M_max for layer 0)
        m = M_max if lc == 0 else M
        neighbors = select_neighbors_heuristic(candidates, m)

        # Add bidirectional edges and prune overfull neighbor lists
        for neighbor in neighbors:
            add_edge(new_node, neighbor, lc)
            prune_connections(neighbor, M_max if lc == 0 else M)

        # Use the best candidates as entry points for the next (denser) layer
        ep = nearest(candidates)

4. Search Algorithm

The search proceeds layer by layer, refining candidates:

def search(query, ef, k):
    # Start from the global entry point at the top layer
    ep = entry_point

    # Greedy descent from the top layer down to layer 1 (beam width 1)
    for layer in range(top_layer, 0, -1):
        ep = search_layer(query, ep, 1, layer)

    # Widen the search to ef candidates at the dense bottom layer
    candidates = search_layer(query, ep, ef, 0)
    return top_k(candidates, k)

Key Parameters

Construction Parameters

Parameter       | Typical Value | Effect
----------------|---------------|----------------------------
M               | 16-32         | Connections per node
M_max           | M × 2         | Max connections at layer 0
ef_construction | 200           | Build-time search width
m_L             | 1/ln(2)       | Layer assignment factor

Search Parameters

Parameter | Typical Value | Effect
----------|---------------|-------------------------------
ef        | 50-500        | Search-time beam width
k         | 1-100         | Number of neighbors to return

Performance Characteristics

Complexity Analysis

Operation | Time Complexity | Space Complexity
----------|-----------------|------------------
Build     | O(N log N)      | O(M · N)
Search    | O(log N)        | O(ef)
Insert    | O(log N)        | O(M)
Delete    | Not supported*  | -

*Node deletion requires index rebuild

Recall vs Speed Tradeoff

ef value | Recall | QPS    | Latency
---------|--------|--------|--------
50       | 90%    | 10,000 | 0.1 ms
100      | 95%    | 5,000  | 0.2 ms
200      | 98%    | 2,500  | 0.4 ms
500      | 99.5%  | 1,000  | 1 ms
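
The exact numbers depend on hardware, dimensionality, and data distribution, so treat the table as indicative. Below is a minimal sketch (sizes and parameters are illustrative) of how you might reproduce the recall column with hnswlib, sweeping ef against a brute-force ground truth.

import numpy as np
import hnswlib

N, d, k = 20_000, 128, 10
data = np.random.rand(N, d).astype(np.float32)
queries = np.random.rand(200, d).astype(np.float32)

# Exact brute-force neighbors as ground truth
ground_truth = np.array([
    np.argsort(np.linalg.norm(data - q, axis=1))[:k] for q in queries
])

index = hnswlib.Index(space='l2', dim=d)
index.init_index(max_elements=N, M=16, ef_construction=200)
index.add_items(data)

for ef in (50, 100, 200, 500):
    index.set_ef(ef)
    labels, _ = index.knn_query(queries, k=k)
    recall = np.mean([len(set(l) & set(g)) / k
                      for l, g in zip(labels, ground_truth)])
    print(f"ef={ef:3d}  recall@{k}={recall:.3f}")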

Implementation in Practice

  1. Faiss (Facebook)

    import faiss

    # HNSW index with flat (uncompressed) vector storage
    index = faiss.IndexHNSWFlat(d, M)
    index.hnsw.efConstruction = 200
    index.add(vectors)
  2. HNSWlib

    import hnswlib

    # L2 index; call set_ef() later to control search-time quality
    index = hnswlib.Index(space='l2', dim=d)
    index.init_index(max_elements=N, M=16, ef_construction=200)
    index.add_items(vectors)
  3. Annoy (Spotify) - tree-based alternative, shown for comparison

    from annoy import AnnoyIndex

    # Build a forest of 10 random-projection trees over angular distance
    index = AnnoyIndex(d, 'angular')
    for i, vec in enumerate(vectors):
        index.add_item(i, vec)
    index.build(n_trees=10)
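
Query calls differ slightly between the three libraries. Assuming the indexes built above and a float32 NumPy array of query embeddings (query_vectors / query_vector here are illustrative names), a search looks roughly like this:

# Faiss: distances and ids of the k nearest neighbors (index from snippet 1)
D, I = index.search(query_vectors, 10)

# HNSWlib: ids and distances, in that order (index from snippet 2)
labels, distances = index.knn_query(query_vectors, k=10)

# Annoy: queries one vector at a time (index from snippet 3)
neighbor_ids = index.get_nns_by_vector(query_vector, 10)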

Vector Databases Using HNSW

  • Weaviate: Default ANN algorithm
  • Qdrant: One of the supported indexes
  • Milvus: HNSW index option
  • Vespa: HNSW for dense vectors

Optimization Techniques

1. Hierarchical Pruning

Instead of keeping all M neighbors, use a heuristic to maintain diversity:

def prune_connections(node, candidates, M):
    # Always keep the single closest neighbor
    closest = nearest(candidates)
    pruned = [closest]
    candidates.remove(closest)

    # Greedily add neighbors that are far from the already-selected set
    while len(pruned) < M and candidates:
        next_node = max(candidates, key=lambda c: min_dist_to_set(c, pruned))
        pruned.append(next_node)
        candidates.remove(next_node)

    return pruned

2. Prefetching

Improve cache locality by prefetching node data:

void search_with_prefetch(Node* current) {
    // Prefetch neighbors' data into cache
    for (auto neighbor : current->neighbors) {
        __builtin_prefetch(neighbor, 0, 1);
    }
    // Process the current node while the prefetches are in flight
    process_node(current);
}

3. SIMD Optimization

Use SIMD instructions for distance calculations:

float euclidean_distance_avx(const float* a, const float* b, int d) {
    __m256 sum = _mm256_setzero_ps();
    for (int i = 0; i < d; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        __m256 diff = _mm256_sub_ps(va, vb);
        sum = _mm256_fmadd_ps(diff, diff, sum);   // sum += diff * diff
    }
    // Horizontal sum of the 8 lanes; returns the squared distance,
    // which preserves ranking without a sqrt
    return horizontal_sum_avx(sum);
}

When to Use HNSW

✅ Use HNSW When:

  • You need high recall (>95%)
  • Dataset size is moderate (millions of vectors)
  • Low query latency is critical
  • You need incremental updates
  • Memory is not severely constrained

❌ Avoid HNSW When:

  • Memory is very limited (consider IVF-PQ)
  • Dataset is small (<10K vectors, use exact search)
  • You need to delete vectors frequently
  • Extremely high dimensionality (>1000d)
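
These rules of thumb can be folded into a rough first-pass chooser; the thresholds below simply restate the lists above and are not hard limits.

def pick_index(n_vectors, memory_constrained, frequent_deletes, need_high_recall):
    # Rough heuristic following the guidance above; benchmark on your own workload
    if n_vectors < 10_000:
        return "exact (brute force)"
    if frequent_deletes:
        return "index with cheap deletes (e.g. IVF)"
    if memory_constrained:
        return "IVF-PQ"
    if need_high_recall:
        return "HNSW"
    return "HNSW or IVF, benchmark both"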

Comparison with Other Methods

Algorithm | Build Time | Search Time | Memory | Recall    | Use Case
----------|------------|-------------|--------|-----------|------------------
HNSW      | Slow       | Very Fast   | High   | Excellent | Real-time search
IVF       | Fast       | Fast        | Medium | Good      | Large-scale
LSH       | Very Fast  | Fast        | Low    | Moderate  | Streaming
Annoy     | Medium     | Fast        | Medium | Good      | Static data
ScaNN     | Slow       | Fast        | Medium | Very Good | Google-scale

Advanced Topics

1. Filtered Search

Combining vector search with metadata filtering:

def filtered_search(query, filter_func, k):
    candidates = []
    visited = set()

    # Over-retrieve so enough candidates survive the metadata filter
    while len(candidates) < k * 10:
        next_batch = search_layer(query, ef, 0, visited)
        if not next_batch:
            break  # graph exhausted
        filtered = [c for c in next_batch if filter_func(c.metadata)]
        candidates.extend(filtered)
        visited.update(next_batch)

    return top_k(candidates, k)

2. Hybrid Search

Combining HNSW with keyword search:

def hybrid_search(query_vector, query_text, k, alpha=0.5):
    # Vector search using HNSW
    vector_results = hnsw_index.search(query_vector, k=100)

    # Text search using BM25
    text_results = bm25_index.search(query_text, k=100)

    # Weighted combination of the two score lists
    combined = {}
    for id, score in vector_results:
        combined[id] = alpha * score
    for id, score in text_results:
        combined[id] = combined.get(id, 0) + (1 - alpha) * score

    return sorted(combined.items(), key=lambda x: x[1], reverse=True)[:k]

3. Quantization Integration

Combining HNSW with Product Quantization for memory efficiency:

class HNSW_PQ:
    def __init__(self, d, M, n_subvectors):
        self.hnsw = HNSW(d, M)
        self.pq = ProductQuantizer(d, n_subvectors)

    def add(self, vectors):
        # Store compressed codes instead of full vectors
        self.codes = self.pq.encode(vectors)
        # Build the HNSW graph over the PQ codes
        self.hnsw.add(self.codes)

    def search(self, query, k):
        # Compress the query into the same code space
        query_code = self.pq.encode(query)
        # Over-retrieve in the compressed space, then rerank exactly
        candidates = self.hnsw.search(query_code, k * 10)
        return rerank_with_exact_distance(query, candidates, k)

Best Practices

  1. Parameter Tuning

    • Start with M=16, ef_construction=200
    • Increase ef_construction for better quality
    • Increase M for better recall but more memory
  2. Distance Metrics

    • Use L2 for dense embeddings
    • Use cosine for normalized embeddings
    • Consider IP (inner product) for learned embeddings (a short metric-selection sketch follows this list)
  3. Monitoring

    • Track recall using ground truth
    • Monitor query latency percentiles
    • Watch memory usage growth
  4. Scaling

    • Shard by vector ID for horizontal scaling
    • Use replicas for query throughput
    • Consider quantization for large datasets
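
To make the distance-metric advice in point 2 concrete, here is a minimal sketch (parameters illustrative) using hnswlib, which exposes all three metrics via its space argument; with unit-normalized vectors, inner-product and cosine rankings coincide.

import numpy as np
import hnswlib

d = 128
vectors = np.random.rand(10_000, d).astype(np.float32)
# Normalize rows so that inner product equals cosine similarity
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

index = hnswlib.Index(space='cosine', dim=d)   # alternatives: 'l2', 'ip'
index.init_index(max_elements=len(vectors), M=16, ef_construction=200)
index.add_items(vectors)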
