RUNLOCALAIv38
->Will it run?Best GPUCompareTroubleshootStartLearnPulseModelsHardwareToolsBench
Run check
RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DIR
  • Models
  • Hardware
  • Tools
  • Benchmarks
TOOLS
  • Will it run?
  • Compare hardware
  • Cost vs cloud
  • Choose my GPU
  • Prompting kits
  • Quick answers
REF
  • All buyer guides
  • Learn local AI
  • Methodology
  • Glossary
  • Errors KB
  • Trust
EDITOR
  • About
  • Author
  • How we make money
  • Editorial policy
  • Contact
LEGAL
  • Privacy
  • Terms
  • Sitemap
MAIL · MONTHLY DIGEST
Get monthly local AI changes
Monthly recap. No spam.
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.coIndependently operated
RUNLOCALAI · v38
  1. >
  2. Home
  3. /Learn
  4. /Courses
  5. /Vector Database Internals
  6. /Ch. 4
Vector Database Internals

04. IVF: Inverted File Index

Chapter 4 of 18 · 20 min
KEY INSIGHT

IVF trades recall for speed by limiting search scope. If your query lands in the correct cluster region, you get exact results faster. If it lands near cluster boundaries, you miss relevant vectors in neighboring clusters. ### The Inverted Index Structure IVF maintains an inverted list for each cluster. Each inverted list contains the IDs of vectors belonging to that cluster (and optionally their residuals for reconstruction). ```python import numpy as np from sklearn.cluster import MiniBatchKMeans class IVFFlat: """ Simplified IVF index for educational purposes. Production implementations (FAISS, ScaNN) add quantization and pruning. """ def __init__(self, nlist, dim): self.nlist = nlist self.dim = dim self.centroids = None self.inverted_lists = {} # list_id -> list of vector indices def train(self, vectors): """Cluster vectors to build centroids.""" print(f"Training IVF with {self.nlist} clusters...") kmeans = MiniBatchKMeans( n_clusters=self.nlist, batch_size=4096, n_init=3, max_iter=100 ) kmeans.fit(vectors) self.centroids = kmeans.cluster_centers_ # Assign vectors to clusters assignments = kmeans.predict(vectors) self.inverted_lists = {i: [] for i in range(self.nlist)} for idx, cluster_id in enumerate(assignments): self.inverted_lists[cluster_id].append(idx) print(f"Training complete. Cluster sizes: min={min(len(v) for v in self.inverted_lists.values())}, " f"max={max(len(v) for v in self.inverted_lists.values())}") return self def search(self, query, nprobe=1): """Search nprobe closest clusters.""" # Find closest centroids diff = self.centroids - query distances = np.sum(diff ** 2, axis=1) closest_clusters = np.argsort(distances)[:nprobe] # Search vectors in those clusters candidate_indices = [] for cluster_id in closest_clusters: candidate_indices.extend(self.inverted_lists[cluster_id]) # Brute force among candidates candidates = np.array(candidate_indices) diff = self.vectors[candidates] - query dists = np.sum(diff ** 2, axis=1) top_k = np.argsort(dists)[:10] return dists[top_k], candidates[top_k] ``` ### How IVF Reduces Work If you have 1M vectors and 1000 clusters, each cluster has ~1000 vectors. Searching 1 cluster requires 1000 distance computations instead of 1M. With nprobe=10, you search 10k vectors—still 100x faster than brute force. The catch: you only find vectors in the searched clusters. If the true nearest neighbors span multiple clusters, you miss them.

IVF (Inverted File Index) is the workhorse of vector search. Its core idea: partition your vectors into clusters using k-means, then at query time, find which cluster your query belongs to and search only that cluster (plus neighbors).

EXERCISE

Build an IVF index with 100 clusters on 100k vectors. Vary nprobe from 1 to 20 and measure recall (compare against brute force ground truth). Plot recall vs nprobe. Observe how recall saturates—there's a point of diminishing returns.

# Continuing from above...
def evaluate_recall(index, vectors, queries, k=10, nprobe=1):
    """Calculate recall by comparing to brute force."""
    # Ground truth via brute force
    db = index.vectors  # access vectors
    diff = db[:, np.newaxis] - queries[np.newaxis, :, :]
    all_dists = np.sqrt(np.sum(diff ** 2, axis=2))
    gt_indices = np.argsort(all_dists, axis=0)[:k]
    
    # IVF results
    recall_sum = 0
    for i, query in enumerate(queries):
        _, indices = index.search(query, nprobe)
        relevant = set(gt_indices[:, i])
        retrieved = set(indices[:k])
        recall_sum += len(relevant & retrieved) / k
    
    return recall_sum / len(queries)

# Run evaluation
for nprobe in [1, 5, 10, 20, 50]:
    recall = evaluate_recall(index, vectors, queries[:100], k=10, nprobe=nprobe)
    print(f"nprobe={nprobe:2d}: recall={recall:.4f}")
← Chapter 3
Brute Force Search
Chapter 5 →
IVF Training and Search