Hybrid Retrieval

Hybrid retrieval combines dense (embedding-based) and sparse (lexical, e.g. BM25) retrieval, typically by union-then-rerank or reciprocal rank fusion (RRF). The motivation: dense retrieval captures semantic similarity, while sparse retrieval catches exact-token matches such as rare terms, identifiers, and names; together they cover failure modes that neither handles alone.
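As a rough illustration of reciprocal rank fusion, the sketch below merges two ranked lists of document IDs by summing 1 / (k + rank) for each list in which a document appears. The function name, the sample document IDs, and the two input lists are hypothetical; k = 60 is a commonly used default, not a value taken from this entry.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc IDs: score(d) = sum of 1 / (k + rank of d in each list)."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; documents missing from a list contribute nothing for it.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from a dense index and a sparse index.
dense_hits = ["d3", "d1", "d7"]
sparse_hits = ["d1", "d9", "d3"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['d1', 'd3', 'd9', 'd7']
```

Because RRF uses only ranks, it does not require dense and sparse scores to be on a comparable scale, which is one reason it is a common choice of fusion strategy.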

In practice, hybrid retrieval often improves NDCG@10 by 5–15% over the better of the two individual retrievers on diverse corpora. The cost is operational: you maintain two indexes and need a fusion strategy.

Many RAG frameworks and vector databases (e.g. LlamaIndex, LangChain, Weaviate) offer hybrid retrieval as a built-in option.
