08. HNSW Construction
Chapter 8 of 18 · 25 min
Building HNSW involves inserting vectors one at a time, finding their position via search at each layer, and connecting to nearby neighbors. The order of insertion affects the final structure.
EXERCISE
Build HNSW indexes with different insertion orders (sequential, random, sorted by L2 norm). Measure search quality for each. Observe how insertion order affects the final graph.
def compare_insertion_orders(vectors, n_queries=100):
"""Compare HNSW built with different insertion orders."""
results = {}
orders = {
'sequential': vectors,
'random': vectors[np.random.permutation(len(vectors))],
'sorted_norm': vectors[np.argsort(np.linalg.norm(vectors, axis=1))]
}
queries = np.random.randn(n_queries, vectors.shape[1])
gt = brute_force_ground_truth(vectors, queries, k=10)
for name, ordered_vectors in orders.items():
hnsw = build_hnsw_batched(ordered_vectors, m=16, ef_construction=100)
recall = measure_hnsw_recall(hnsw, queries, gt, k=10)
results[name] = recall
print(f"{name}: recall@10 = {recall:.4f}")
return results
# Run comparison
results = compare_insertion_orders(vectors[:20000])