10. HNSW Parameters
The Hierarchical Navigable Small World (HNSW) algorithm controls its behavior through a handful of parameters that directly determine memory usage, build time, and query latency. Understanding these knobs is essential for tuning a production vector database.
The Core Parameters
M (max connections per layer) controls the density of the graph. Higher M values create denser graphs with better recall but consume more memory and slow insertion. Typical values range from 8 to 64. At M=16, expect roughly 0.5 bytes per dimension in memory overhead. At M=64, that jumps to 2-3x.
efConstruction governs the dynamic list size during index building. This parameter directly affects build time and index quality. Higher values produce better graphs but extend build time quadratically. A value of 200-400 is common for production workloads.
efSearch determines the search exploration during queries. Lower values mean faster queries but potentially missed results. Set this based on your recall requirements. If you need 95%+ recall, you may need efSearch=200 or higher, even if 50 is sufficient for 90%.
layerAlpha (in some implementations) controls the exponential decay factor for layer assignment. The standard formula assigns element i to layer l with probability exp(-l * layerAlpha). The default 1/ln(dim) balances skip-list structure with graph connectivity.
Layer Distribution Mathematics
With default parameters, the expected number of layers for n elements follows ln(n). For 10 million vectors, expect approximately 16 layers. The top layer contains only a handful of elements (roughly n/M), while the bottom layer holds the full dataset.
import random
import math
def calculate_layer_distribution(n_vectors: int, m: int, layer_alpha: float = 1.0) -> dict:
"""
Predict layer distribution for HNSW parameters.
Returns expected element count per layer.
"""
layers = {}
total_elements = n_vectors
for layer in range(int(math.log(n_vectors)) + 1):
# Probability an element reaches this layer
p_layer = math.exp(-layer * layer_alpha)
expected_count = n_vectors * p_layer
# Top layer has ~1/M elements
if layer == 0:
expected_count = max(1, total_elements // m)
if expected_count >= 1:
layers[layer] = int(expected_count)
return layers
# Example: 10M vectors, M=16
dist = calculate_layer_distribution(10_000_000, m=16)
for layer, count in dist.items():
print(f"Layer {layer}: ~{count:,} elements")
Memory Estimation
Memory(bytes) ≈ n_vectors × (d × 4 + M × 4 + overhead)
≈ n_vectors × (d × 4 + M × 8) # conservative estimate
For 1M vectors at 768 dimensions with M=32: roughly 4-6 GB.
Failure Modes
Setting M too high causes cache thrashing on large datasets. Setting it too low produces disconnected graph regions, visible as sudden recall drops. Always benchmark recall at your target efSearch value before production deployment.
Create a script that benchmarks HNSW index construction time and memory usage across different M values (8, 16, 32, 64) using 100K random vectors at 128 dimensions. Plot the results to identify the inflection point where memory usage becomes problematic for your hardware.
# Target output: plot showing time and memory vs M value
# Expected insight: finding your hardware's practical M limit