ANN Algorithms Comparison

Compare the leading approximate nearest neighbor (ANN) algorithms side by side: HNSW, IVF-PQ, LSH, Annoy, ScaNN, and DiskANN. Find the best approach for your use case.

Choosing the right approximate nearest neighbor (ANN) algorithm is crucial for building efficient vector search systems. This comprehensive comparison helps you understand the tradeoffs and select the best approach.

Interactive Algorithm Comparison

Visualize and compare different ANN algorithms in action. In the interactive demo, click on the canvas to place a query point and watch HNSW (a multi-layer graph with long-range connections) return approximate nearest neighbors.

Algorithm Overview

The ANN Landscape

| Algorithm | Type | Key Innovation | Best For |
|---|---|---|---|
| HNSW | Graph | Hierarchical layers | High recall, real-time |
| IVF-PQ | Partition + compress | Clustering + quantization | Billion-scale, memory-limited |
| LSH | Hash | Locality-sensitive functions | Streaming, theoretical guarantees |
| Annoy | Tree | Random projection trees | Static data, moderate scale |
| ScaNN | Learned | Anisotropic quantization | Google-scale, learned metrics |
| DiskANN | Graph | SSD-optimized | Larger-than-memory datasets |

Detailed Comparison

Performance Metrics

| Algorithm | Build Time | Query Time | Memory | Recall@10 | Updates |
|---|---|---|---|---|---|
| HNSW | O(N log N) | O(log N) | High (1.5-2× data) | 95-99% | Incremental |
| IVF-PQ | O(N · K) | O(√N) | Very low (5-10% of data) | 85-95% | Batch |
| LSH | O(N) | O(1) expected | Low (20-50% of data) | 70-90% | Instant |
| Annoy | O(N log N) | O(log N) | Medium (~1× data) | 85-95% | Rebuild |
| ScaNN | O(N log N) | O(√N) | Low (10-30% of data) | 95-98% | Batch |
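
Recall@10 in this table (and in the benchmarks below) is the fraction of each query's true 10 nearest neighbors that the approximate index returns. A small NumPy helper for measuring it against brute-force ground truth (illustrative names, not a library API):

import numpy as np

def recall_at_k(approx_ids, exact_ids, k=10):
    """Average overlap between approximate and exact top-k neighbor sets."""
    hits = [len(set(a[:k]) & set(e[:k])) for a, e in zip(approx_ids, exact_ids)]
    return np.mean(hits) / k

# approx_ids / exact_ids: arrays of shape (num_queries, k) holding neighbor indices,
# e.g. recall_at_k(hnsw_labels, brute_force_labels) -> 0.97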

Scalability Analysis

Suitability by dataset size (number of vectors):

| Algorithm | 10K | 100K | 1M | 10M | 100M | 1B | 10B |
|---|---|---|---|---|---|---|---|
| HNSW | Excellent | Excellent | Excellent | Good | Fair | Poor | Not suitable |
| IVF-PQ | Fair | Good | Excellent | Excellent | Excellent | Excellent | Good |
| LSH | Good | Excellent | Excellent | Excellent | Excellent | Good | Fair |
| Annoy | Excellent | Excellent | Good | Fair | Poor | Not suitable | Not suitable |
| ScaNN | Fair | Good | Excellent | Excellent | Excellent | Excellent | Fair |
| DiskANN | Poor | Fair | Good | Excellent | Excellent | Excellent | Excellent |

Algorithm Deep Dive

HNSW (Hierarchical Navigable Small World)

Strengths:

  • Highest recall rates (95-99%)
  • Logarithmic query complexity
  • Incremental updates
  • Robust to data distribution

Weaknesses:

  • High memory usage
  • Slow index building
  • No native compression
  • Cannot delete efficiently

Implementation:

import hnswlib

index = hnswlib.Index(space='l2', dim=128)
index.init_index(max_elements=1000000, M=16, ef_construction=200)
index.add_items(data, ids)
index.set_ef(50)  # ef for search
labels, distances = index.knn_query(queries, k=10)
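
Because HNSW supports incremental inserts, new vectors can be appended to an existing index without rebuilding. A minimal follow-up sketch, reusing `index` from above and assuming hypothetical `new_data` / `new_ids` arrays:

# Grow capacity first, then append the new vectors in place.
index.resize_index(index.get_max_elements() + len(new_ids))
index.add_items(new_data, new_ids)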

IVF-PQ (Inverted File + Product Quantization)

Strengths:

  • Extreme compression (10-100×)
  • Handles billions of vectors
  • Fast index building
  • Tunable accuracy/speed

Weaknesses:

  • Lower recall than HNSW
  • Batch updates preferred
  • Training phase required
  • Quantization loss

Implementation:

import faiss

index = faiss.index_factory(128, "IVF1024,PQ32", faiss.METRIC_L2)
index.train(training_data)
index.add(data)
index.nprobe = 10  # clusters to search
distances, indices = index.search(queries, k=10)
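
The accuracy/speed knob here is nprobe, the number of inverted lists scanned per query. A rough tuning sweep, assuming a hypothetical `ground_truth` array of exact nearest-neighbor IDs for comparison:

# Higher nprobe scans more clusters: better recall, slower queries.
for nprobe in (1, 8, 32, 128):
    index.nprobe = nprobe
    _, approx_ids = index.search(queries, 10)
    # Fraction of queries whose true nearest neighbor appears in the top 10
    recall = (approx_ids == ground_truth[:, :1]).any(axis=1).mean()
    print(nprobe, recall)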

LSH (Locality Sensitive Hashing)

Strengths:

  • Instant updates
  • Theoretical guarantees
  • Simple implementation
  • Works in streaming

Weaknesses:

  • Lower recall
  • Many parameters to tune
  • Sensitive to data distribution
  • Memory overhead from tables

Implementation:

from datasketch import MinHashLSH

lsh = MinHashLSH(threshold=0.5, num_perm=128)
for key, minhash in data:
    lsh.insert(key, minhash)
result = lsh.query(query_minhash)
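
Note that datasketch's MinHashLSH indexes sets under Jaccard similarity. For dense vectors under cosine similarity, the classic LSH family is random hyperplane (sign random projection) hashing; a minimal NumPy sketch, with all names illustrative rather than a library API:

import numpy as np

def hyperplane_hash(vectors, planes):
    """Bit signature per vector: sign of its projection onto each random hyperplane."""
    return (vectors @ planes.T > 0).astype(np.uint8)

rng = np.random.default_rng(0)
planes = rng.standard_normal((16, 128))   # 16 hyperplanes for 128-dim vectors
buckets = {}
for i, sig in enumerate(hyperplane_hash(data, planes)):
    buckets.setdefault(sig.tobytes(), []).append(i)

# Query: vectors sharing the signature are candidates; rerank them exactly if needed.
candidates = buckets.get(hyperplane_hash(query[None, :], planes)[0].tobytes(), [])

In practice several hash tables with independent hyperplanes are combined to raise recall, which is where the memory overhead noted above comes from.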

Annoy (Approximate Nearest Neighbors Oh Yeah)

Strengths:

  • Simple and fast
  • Memory mapped files
  • Good for static data
  • Easy to use

Weaknesses:

  • No updates (rebuild required)
  • Limited to moderate scale
  • Tree imbalance issues
  • Build time variance

Implementation:

from annoy import AnnoyIndex

index = AnnoyIndex(128, 'angular')
for i, vector in enumerate(data):
    index.add_item(i, vector)
index.build(10)  # 10 trees
neighbors = index.get_nns_by_vector(query, 10)
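
One strength listed above is memory-mapped files: a built Annoy index can be saved to disk and loaded via mmap, so multiple processes share the same pages and startup is nearly instant. A short usage sketch (the filename is illustrative):

# Persist the built index, then memory-map it from another process.
index.save('vectors.ann')

reader = AnnoyIndex(128, 'angular')
reader.load('vectors.ann')   # mmap-backed; pages are faulted in on demand
neighbors = reader.get_nns_by_vector(query, 10)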

ScaNN (Scalable Nearest Neighbors)

Strengths:

  • State-of-the-art recall/speed
  • Learned quantization
  • Optimized for dot product
  • Google-scale tested

Weaknesses:

  • Complex configuration
  • Training required
  • Limited language support
  • Newer, less mature

Implementation:

import scann

searcher = scann.scann_ops_pybind.builder(data, 10, "dot_product").tree(
    num_leaves=2000, num_leaves_to_search=100
).score_ah(2, anisotropic_quantization_threshold=0.2).build()
neighbors, distances = searcher.search_batched(queries)

Benchmark Results

ANN-Benchmarks Results (SIFT-1M Dataset)

Recall@10 at different query throughputs (higher is better):

| Algorithm | 1K QPS | 10K QPS | 100K QPS |
|---|---|---|---|
| HNSW | 99.5% | 97.2% | 92.1% |
| IVF-PQ | 95.3% | 89.7% | 78.2% |
| LSH | 87.1% | 82.3% | 76.5% |
| Annoy | 94.2% | 88.5% | 81.3% |
| ScaNN | 98.7% | 95.1% | 88.9% |

Memory Efficiency (768-dim, 10M vectors)

| Algorithm | Index Size | Compression Ratio | Build Time |
|---|---|---|---|
| Original vectors | 30 GB | - | - |
| HNSW | 45 GB | 0.67× | 2 hours |
| IVF-PQ | 0.3 GB | 100× | 30 min |
| LSH | 6 GB | 5× | 10 min |
| Annoy | 30 GB | 1× | 1 hour |
| ScaNN | 3 GB | 10× | 45 min |
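
The 100× figure for IVF-PQ follows directly from product quantization arithmetic: a PQ32 code stores 32 one-byte subvector codes in place of 768 float32 values. A quick back-of-the-envelope check:

num_vectors = 10_000_000
raw_bytes = num_vectors * 768 * 4   # float32 vectors: ~30.7 GB
pq_bytes = num_vectors * 32         # PQ32 codes: ~0.3 GB
print(raw_bytes / pq_bytes)         # 96.0, roughly the 100x above (ignoring centroid and list overhead)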

Decision Matrix

Choose HNSW When:

if (recall_requirement > 0.95
        and dataset_size < 10_000_000
        and memory_available > dataset_size * 2
        and need_incremental_updates):
    return "HNSW"

Choose IVF-PQ When:

if (dataset_size > 100_000_000
        and memory_constrained
        and recall_requirement > 0.85
        and batch_updates_ok):
    return "IVF-PQ"

Choose LSH When:

if (streaming_data
        or need_theoretical_guarantees
        or very_high_dimensions > 1000
        or need_instant_updates):
    return "LSH"

Choose Annoy When:

if (static_dataset
        and moderate_size < 10_000_000
        and simplicity_important
        and memory_mapped_needed):
    return "Annoy"

Choose ScaNN When:

if (google_scale_data
        and dot_product_similarity
        and can_afford_training
        and need_best_recall_speed_tradeoff):
    return "ScaNN"

Hybrid Approaches

1. Hierarchical Index

Combine multiple algorithms for different scales:

import numpy as np
import faiss

class HybridIndex:
    def __init__(self, threshold=100000):
        self.threshold = threshold
        self.coarse_index = None    # IVF index used for coarse clustering
        self.fine_indices = {}      # one fine index (HNSW or IVF-PQ) per cluster

    def build(self, data):
        # Coarse clustering: assign each vector to its nearest coarse centroid
        self.coarse_index = faiss.IndexIVFFlat(...)
        _, cluster_ids = self.coarse_index.quantizer.search(data, 1)
        cluster_ids = cluster_ids.ravel()
        # Fine indexing per cluster
        for cluster_id in np.unique(cluster_ids):
            cluster_data = data[cluster_ids == cluster_id]
            if len(cluster_data) > self.threshold:
                # Use IVF-PQ for large clusters
                self.fine_indices[cluster_id] = build_ivfpq(cluster_data)
            else:
                # Use HNSW for small clusters
                self.fine_indices[cluster_id] = build_hnsw(cluster_data)

2. Reranking Pipeline

Use fast approximate search then exact reranking:

def search_with_rerank(query, k=10, candidates_multiplier=10):
    # Stage 1: Fast approximate search
    candidates = ivfpq_index.search(query, k * candidates_multiplier)
    # Stage 2: Exact reranking
    exact_distances = compute_exact_distances(query, candidates)
    # Return top-k after reranking
    return top_k(candidates, exact_distances, k)
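
The compute_exact_distances and top_k calls above are placeholders. One concrete way to fill them in with NumPy, assuming `vectors` holds the original float32 vectors and `candidate_ids` is the array of IDs returned by the first stage:

import numpy as np

def rerank_exact(query, candidate_ids, vectors, k=10):
    # Exact L2 distances, computed only for the shortlisted candidates.
    diffs = vectors[candidate_ids] - query
    exact = np.einsum('ij,ij->i', diffs, diffs)
    order = np.argsort(exact)[:k]
    return candidate_ids[order], exact[order]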

3. Filtered Search

Combine vector search with metadata filtering:

class FilteredANN:
    def search(self, vector_query, metadata_filter, k=10):
        # Over-retrieve to account for filtering
        candidates = self.ann_index.search(vector_query, k * 5)
        # Apply metadata filter
        filtered = []
        for candidate_id in candidates:
            if metadata_filter(self.metadata[candidate_id]):
                filtered.append(candidate_id)
                if len(filtered) >= k:
                    break
        return filtered

Production Considerations

1. Index Building Strategy

def build_index_strategy(num_vectors, vector_dim, memory_budget):
    data_size = num_vectors * vector_dim * 4   # float32
    if memory_budget > data_size * 2:
        return "HNSW"      # Can afford full index
    elif memory_budget > data_size * 0.5:
        return "Annoy"     # Moderate memory
    elif memory_budget > data_size * 0.1:
        return "ScaNN"     # Compressed but good recall
    else:
        return "IVF-PQ"    # Maximum compression
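
For example, with 10 million 768-dimensional float32 vectors (about 30.7 GB of raw data) and a 64 GB memory budget, the heuristic picks HNSW because the budget exceeds twice the raw data size:

print(build_index_strategy(10_000_000, 768, 64 * 2**30))   # -> "HNSW"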

2. Serving Architecture

# Microservices approach
services:
  index_builder:
    algorithm: IVF-PQ
    schedule: nightly
    resources: high_memory
  search_service:
    replicas: 10
    cache: redis
    fallback: exact_search
  reranker:
    algorithm: exact
    batch_size: 100
    gpu_enabled: true

3. Monitoring Metrics

metrics_to_track = {
    'recall@k': measure_recall,
    'latency_p50': percentile(latencies, 50),
    'latency_p99': percentile(latencies, 99),
    'qps': queries_per_second,
    'index_size': get_index_memory(),
    'build_time': measure_build_time(),
    'update_latency': measure_update_time(),
}

Cost Analysis

Cloud Deployment Costs (AWS, 10M vectors, 768-dim)

| Algorithm | Instance Type | Storage | Monthly Cost | QPS |
|---|---|---|---|---|
| HNSW | r5.4xlarge | 60 GB EBS | $650 | 10,000 |
| IVF-PQ | c5.2xlarge | 5 GB EBS | $250 | 8,000 |
| LSH | c5.4xlarge | 15 GB EBS | $350 | 15,000 |
| Annoy | r5.2xlarge | 40 GB EBS | $400 | 7,000 |
| ScaNN | c5.4xlarge | 10 GB EBS | $320 | 12,000 |
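
Dividing monthly cost by query volume gives a rough cost per query. As an illustration only (it assumes the QPS column is sustained around the clock, which real workloads rarely do):

seconds_per_month = 30 * 24 * 3600
hnsw_queries = 10_000 * seconds_per_month       # ~26 billion queries per month
print(650 / (hnsw_queries / 1e6))               # ~$0.025 per million queries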

Future Directions

1. Learned Indices

  • Neural network-based indexing
  • Adaptive to data distribution
  • Self-tuning parameters

2. GPU Acceleration

  • Massive parallelization
  • Tensor core utilization
  • Real-time billion-scale search
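
GPU search is already practical in libraries such as Faiss; a minimal sketch of moving a flat index onto a single GPU (assumes the faiss-gpu build and the `data`/`queries` arrays used earlier):

import faiss

res = faiss.StandardGpuResources()
cpu_index = faiss.IndexFlatL2(128)
gpu_index = faiss.index_cpu_to_gpu(res, 0, cpu_index)   # device 0
gpu_index.add(data)
distances, indices = gpu_index.search(queries, 10)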

3. Disk-Based Indices

  • SSD-optimized algorithms
  • Compressed in-memory, full on disk
  • Cost-effective for huge datasets

4. Quantum Algorithms

  • Grover's algorithm adaptation
  • Quadratic speedup potential
  • Still theoretical

Recommendations by Use Case

E-commerce Product Search

Recommendation: IVF-PQ with reranking

  • Billions of products
  • Cost-sensitive
  • Moderate recall acceptable

Facial Recognition

Recommendation: HNSW

  • High accuracy critical
  • Real-time requirements
  • Moderate scale

Document Retrieval

Recommendation: ScaNN or IVF-PQ

  • Large document collections
  • Semantic search
  • Batch updates OK

Social Media Feed

Recommendation: LSH with learned augmentation

  • Streaming content
  • Instant updates needed
  • Personalization important

Scientific Computing

Recommendation: Exact search or HNSW

  • Accuracy paramount
  • Smaller datasets
  • Reproducibility required
