Approximate Nearest Neighbors (ANN)
Approximate Nearest Neighbors (ANN) is a class of algorithms designed to efficiently retrieve vectors that are close to a query vector in high-dimensional space. It’s widely used in vector search when exact precision isn’t critical but speed and scalability are.
Why ANN?
- Exact Nearest Neighbor (NN) search is computationally expensive in high-dimensional spaces (curse of dimensionality).
- ANN trades perfect accuracy for massive speed improvements, enabling real-time applications like semantic search or recommendations.
Core Concepts
- Nearest Neighbor Search (NNS): Find the most similar vector to a given query vector.
- Approximate Nearest Neighbor (ANN): Return one of the closest neighbors, not necessarily the exact closest.
- Top-K Retrieval: Return the top k most similar vectors.
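The exact (non-approximate) version of top-k retrieval is just a brute-force scan: score every database vector against the query and keep the k best. A minimal numpy sketch (the function name `top_k_exact` and the random data are illustrative, not from any particular library):

```python
import numpy as np

rng = np.random.default_rng(0)
db = rng.standard_normal((1000, 64)).astype(np.float32)   # database vectors
query = rng.standard_normal(64).astype(np.float32)

def top_k_exact(db, query, k=5):
    """Exact top-k by cosine similarity: score everything, keep the k best."""
    # Cosine similarity = dot product of L2-normalized vectors.
    db_n = db / np.linalg.norm(db, axis=1, keepdims=True)
    q_n = query / np.linalg.norm(query)
    scores = db_n @ q_n
    # argpartition finds the k largest in O(n); then sort just those k.
    idx = np.argpartition(-scores, k)[:k]
    return idx[np.argsort(-scores[idx])]

print(top_k_exact(db, query))
```

This O(n·d) scan per query is exactly what becomes too slow at scale, which is what motivates the ANN algorithms below.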
Popular ANN Algorithms
- HNSW (Hierarchical Navigable Small World): Graph-based; excellent speed/accuracy trade-off.
- IVF (Inverted File Index): Clusters data into centroids and narrows search to nearby clusters.
- PQ (Product Quantization): Compresses vectors for memory-efficient indexing.
- Annoy: Tree-based (used by Spotify); good for memory-constrained use cases.
- FAISS: Meta's (formerly Facebook's) similarity-search library; supports flat (exact), IVF, PQ, and HNSW indexes.
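To make the IVF idea concrete, here is a toy, numpy-only sketch (all names such as `build_ivf` and the cluster counts are illustrative assumptions, not a real library API): cluster the database with k-means, then at query time scan only the `nprobe` lists whose centroids are closest to the query.

```python
import numpy as np

rng = np.random.default_rng(1)
db = rng.standard_normal((5000, 32)).astype(np.float32)

def build_ivf(db, n_clusters=16, iters=10):
    """Toy IVF index: k-means centroids plus one inverted list per cluster."""
    centroids = db[rng.choice(len(db), n_clusters, replace=False)].copy()
    for _ in range(iters):                    # plain Lloyd's k-means
        assign = np.argmin(((db[:, None] - centroids) ** 2).sum(-1), axis=1)
        for c in range(n_clusters):
            members = db[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    # Final assignment with the final centroids, so lists match them.
    assign = np.argmin(((db[:, None] - centroids) ** 2).sum(-1), axis=1)
    lists = [np.where(assign == c)[0] for c in range(n_clusters)]
    return centroids, lists

def search_ivf(db, centroids, lists, query, k=5, nprobe=4):
    """Approximate search: scan only the nprobe lists nearest the query."""
    order = np.argsort(((centroids - query) ** 2).sum(-1))[:nprobe]
    cand = np.concatenate([lists[c] for c in order])
    d = ((db[cand] - query) ** 2).sum(-1)
    return cand[np.argsort(d)[:k]]
```

The approximation comes entirely from `nprobe`: scanning more lists raises recall and cost; scanning fewer does the opposite. Real IVF implementations (e.g. in FAISS) follow the same shape with far more engineering.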
Use Cases
- Semantic search in vector databases
- Recommendation engines
- Image and video similarity
- RAG pipelines for LLMs
- Fraud/anomaly detection
Key Trade-offs
| Factor | Exact NN | ANN |
|---|---|---|
| Accuracy | 100% | ~95–99% (configurable) |
| Speed | Slow | Fast |
| Memory Usage | Often high | Often optimized |
| Scalability | Limited | Excellent |
Example
Suppose you have 10 million embeddings and need the top 5 most similar vectors to a query within 50 ms. An exact scan must score all 10 million vectors per query; an ANN index answers in milliseconds with minimal loss in precision.
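The "minimal loss in precision" is usually quantified as recall@k: the fraction of the true top-k that the approximate search returns. A small self-contained sketch, using crude int8 scalar quantization as a stand-in for a real ANN index (an assumption for illustration, not one of the algorithms above):

```python
import numpy as np

rng = np.random.default_rng(2)
db = rng.standard_normal((2000, 64)).astype(np.float32)
queries = rng.standard_normal((20, 64)).astype(np.float32)

# Compress vectors to int8 (a toy proxy for an ANN index's approximation).
scale = np.abs(db).max()
db_q = np.round(db / scale * 127).astype(np.int8)
db_approx = db_q.astype(np.float32) * scale / 127   # dequantized copy

def top_k(mat, q, k):
    """Indices of the k nearest vectors in mat by squared L2 distance."""
    d = ((mat - q) ** 2).sum(-1)
    return set(np.argsort(d)[:k].tolist())

# recall@5: overlap between exact top-5 and approximate top-5, per query.
recalls = [len(top_k(db, q, 5) & top_k(db_approx, q, 5)) / 5 for q in queries]
mean_recall = sum(recalls) / len(recalls)
print(mean_recall)
```

The same measurement applies unchanged to any real index: run exact search once offline to get ground truth, then report the averaged overlap for the ANN configuration you deploy.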
Related Notes
- Top-K Retrieval
- Vector Databases