Approximate Nearest Neighbors (ANN)

Approximate Nearest Neighbors (ANN) is a class of algorithms designed to efficiently retrieve vectors that are close to a query vector in high-dimensional space. It’s widely used in vector search when exact precision isn’t critical but speed and scalability are.

Why ANN?

  • Exact Nearest Neighbor (NN) search is computationally expensive in high-dimensional spaces (the curse of dimensionality); see the brute-force sketch after this list.
  • ANN trades perfect accuracy for massive speed improvements, enabling real-time applications like semantic search or recommendations.
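For intuition, here is a minimal brute-force exact search in NumPy (the function name `exact_nn` and the dataset sizes are illustrative). Its per-query cost grows linearly with the number of stored vectors, which is exactly the scan that ANN indexes avoid:

```python
import numpy as np

def exact_nn(database: np.ndarray, query: np.ndarray, k: int = 5):
    """Brute-force exact NN: O(N * d) per query -- fine at small N, slow at scale."""
    dists = np.sum((database - query) ** 2, axis=1)  # squared L2 to every vector
    idx = np.argpartition(dists, k)[:k]              # k smallest, unordered
    idx = idx[np.argsort(dists[idx])]                # order them by distance
    return idx, dists[idx]

rng = np.random.default_rng(0)
db = rng.standard_normal((100_000, 128)).astype("float32")  # 100k vectors, d=128
q = rng.standard_normal(128).astype("float32")
print(exact_nn(db, q))
```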

Core Concepts

  • Nearest Neighbor Search (NNS): Find the most similar vector to a given query vector.
  • Approximate Nearest Neighbor (ANN): Return one of the closest neighbors, not necessarily the exact closest.
  • Top-k Retrieval: Return the k most similar vectors to the query.
  • HNSW (Hierarchical Navigable Small World): Graph-based; excellent speed/accuracy trade-off.
  • IVF (Inverted File Index): Partitions vectors into clusters around learned centroids and searches only the clusters nearest the query.
  • PQ (Product Quantization): Compresses vectors for memory-efficient indexing.
  • Annoy: Tree-based (used by Spotify); good for memory-constrained use cases.
  • FAISS: Facebook’s ANN library; supports flat (exact), IVF, PQ, and HNSW indexes (see the sketch after this list).
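As a concrete sketch of the IVF idea, a minimal FAISS example on synthetic data (parameter choices such as `nlist = 1024` and `nprobe = 16` are illustrative, not tuned):

```python
import faiss                      # pip install faiss-cpu
import numpy as np

d, n = 128, 100_000
rng = np.random.default_rng(0)
xb = rng.standard_normal((n, d)).astype("float32")   # database vectors
xq = rng.standard_normal((1, d)).astype("float32")   # query vector

# IVF: cluster the database into nlist cells, then search only the
# nprobe cells closest to the query instead of scanning everything.
nlist = 1024
quantizer = faiss.IndexFlatL2(d)                 # coarse quantizer for centroids
index = faiss.IndexIVFFlat(quantizer, d, nlist)
index.train(xb)                                  # learn centroids via k-means
index.add(xb)

index.nprobe = 16            # more cells searched -> higher recall, slower queries
dists, ids = index.search(xq, 5)                 # top-5 approximate neighbors
print(ids, dists)
```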

Use Cases

  • Semantic search in vector databases
  • Recommendation engines
  • Image and video similarity
  • RAG pipelines for LLMs
  • Fraud/anomaly detection

Key Trade-offs

Factor       | Exact NN   | ANN
------------ | ---------- | ------------------------------
Accuracy     | 100%       | ~95–99% (configurable)
Speed        | Slow       | Fast
Memory usage | Often high | Often optimized (e.g., via PQ)
Scalability  | Limited    | Excellent
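The accuracy row is a knob, not a constant: IVF exposes `nprobe` and HNSW exposes `efSearch` to trade recall for latency. Here is a small sketch measuring the recall of an IVF index against an exact flat baseline (synthetic data, so the exact numbers will vary):

```python
import faiss
import numpy as np

d, n, k = 64, 50_000, 10
rng = np.random.default_rng(1)
xb = rng.standard_normal((n, d)).astype("float32")
xq = rng.standard_normal((100, d)).astype("float32")

# Exact baseline: a flat index scans everything, so its results are ground truth.
flat = faiss.IndexFlatL2(d)
flat.add(xb)
_, true_ids = flat.search(xq, k)

# ANN index whose recall we tune via nprobe.
quantizer = faiss.IndexFlatL2(d)
ivf = faiss.IndexIVFFlat(quantizer, d, 256)
ivf.train(xb)
ivf.add(xb)

for nprobe in (1, 8, 32):
    ivf.nprobe = nprobe
    _, ann_ids = ivf.search(xq, k)
    # Recall@k: fraction of the true top-k that the ANN search recovered.
    recall = np.mean([len(set(a) & set(t)) / k for a, t in zip(ann_ids, true_ids)])
    print(f"nprobe={nprobe:>2}  recall@{k} = {recall:.3f}")
```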

Example

You have 10 million embeddings and need the top 5 most similar vectors to a query within 50 ms. An exhaustive scan over all 10 million vectors typically blows that latency budget; an ANN index answers within it with minimal loss in precision.
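A hedged sketch of how that scenario might look with an HNSW index in FAISS (a smaller synthetic dataset stands in for the 10 million embeddings, and actual latency depends on hardware and parameters):

```python
import faiss
import numpy as np

d = 128
rng = np.random.default_rng(2)
xb = rng.standard_normal((200_000, d)).astype("float32")  # stand-in for 10M vectors
xq = rng.standard_normal((1, d)).astype("float32")

# HNSW: graph-based index with no training step; M controls graph connectivity.
index = faiss.IndexHNSWFlat(d, 32)
index.hnsw.efConstruction = 200   # build-time effort (higher = better graph)
index.add(xb)                     # one-time build cost

index.hnsw.efSearch = 64          # query-time effort: the recall/latency knob
dists, ids = index.search(xq, 5)  # top-5 approximate neighbors per query
print(ids, dists)
```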