RUNLOCALAI · v38

Operator-grade instrument for local-AI hardware intelligence. Hand-written verdicts. Real benchmarks. Reproducible commands.

OP·Fredoline Eruo
Embeddings & Retrieval · Open-weight · MIT

BGE (BAAI General Embedding)

by BAAI (Beijing Academy of Artificial Intelligence)

BAAI's open-weight embedding family. BGE-M3 is the canonical multilingual embedding model in 2026 (100+ languages, 8K context); BGE-Reranker-v2-m3 is its canonical companion cross-encoder reranker. Together they form the default open-weight RAG retrieval stack.

Best entry point for local use

Start with BGE-M3 via sentence-transformers on any GPU. It is the strongest open-weight multilingual embedding model, generating 1024-dim embeddings that support dense retrieval, sparse (lexical) retrieval, and multi-vector (ColBERT-style) retrieval from a single model. It covers 100+ languages and achieves the highest MTEB retrieval score among open-weight embedding models at its size (568M params). The model is small: FP16 needs ~1.1 GB VRAM, so it runs on any GPU with 2 GB+ VRAM, including integrated graphics. For English-only retrieval, BGE-large-en-v1.5 (335M params, 1024-dim) outperforms BGE-M3 on English MTEB by ~1.5 points at half the size. For re-ranking, BGE-Reranker-v2-m3 is the companion cross-encoder: pair M3 retrieval with M3 re-ranking for a two-stage pipeline. MIT license.
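The dense-retrieval path can be sketched in a few lines. This is a minimal illustration using toy vectors; in a real pipeline the 1024-dim embeddings would come from `SentenceTransformer("BAAI/bge-m3").encode(...)` (shown in a comment), and the corpus vectors would live in a vector index rather than a NumPy array.

```python
import numpy as np

# In practice the vectors come from BGE-M3, e.g.:
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("BAAI/bge-m3")
#   doc_vecs = model.encode(docs, normalize_embeddings=True)
# The toy 4-dim vectors below stand in for real 1024-dim embeddings.

def top_k_dense(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 3):
    """Return indices and cosine scores of the k nearest documents."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                   # cosine similarity per document
    order = np.argsort(-scores)[:k]  # highest-scoring first
    return order.tolist(), scores[order].tolist()

docs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.9, 0.1, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0]])
query = np.array([1.0, 0.05, 0.0, 0.0])
idx, scores = top_k_dense(query, docs, k=2)
print(idx)  # → [0, 1]
```

With `normalize_embeddings=True` the dot product is already the cosine score, which is why most BGE serving setups store pre-normalized vectors.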

Deployment guidance

  • Single-user RAG: sentence-transformers + BGE-M3 FP16 on an RTX 3060 12 GB gives ~200 docs/second encoding throughput with an 8K-token max input.
  • Production serving: Infinity embedding server with BGE-M3 on an L4 24 GB serves ~500 embeddings/second at batch 32 with continuous batching.
  • CPU-only: llama.cpp embedding server with BGE-M3 GGUF Q8_0 on an Apple M3 reaches ~80 embeddings/second via Metal.
  • Multi-stage RAG: deploy BGE-M3 for first-pass dense retrieval, then BGE-Reranker-v2-m3 as a cross-encoder over the top-100 candidates; this two-stage combo gains +12% nDCG@10 over dense-only on BEIR.
  • Sparse retrieval: BGE-M3 outputs sparse token weights natively (no separate BM25 index needed); use the sparse_vector output for hybrid dense+sparse retrieval (another +5% nDCG@10).
  • Tokenization: BGE-M3 uses an XLM-RoBERTa tokenizer; input is truncated at 8192 tokens.
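The two-stage pattern above can be sketched generically: score the whole corpus with the cheap bi-encoder, then apply the expensive cross-encoder only to the shortlist. The toy scorers here are hypothetical stand-ins; in a real pipeline `dense_score` would come from BGE-M3 embeddings and `rerank_score` from bge-reranker-v2-m3 (e.g. FlagEmbedding's `FlagReranker.compute_score`).

```python
from typing import Callable, Sequence

def two_stage_retrieve(
    query: str,
    docs: Sequence[str],
    dense_score: Callable[[str, str], float],
    rerank_score: Callable[[str, str], float],
    first_pass_k: int = 100,
    final_k: int = 10,
) -> list:
    """Dense first pass over all docs, then cross-encoder re-rank of the shortlist."""
    # Stage 1: cheap bi-encoder scores over the whole corpus
    first = sorted(range(len(docs)),
                   key=lambda i: -dense_score(query, docs[i]))[:first_pass_k]
    # Stage 2: expensive cross-encoder only on the candidates
    return sorted(first, key=lambda i: -rerank_score(query, docs[i]))[:final_k]

# Hypothetical toy scorers for illustration only.
docs = ["cats purr", "dogs bark", "cats and dogs"]
dense = lambda q, d: sum(w in d for w in q.split())       # crude word overlap
rerank = lambda q, d: dense(q, d) - 0.5 * len(d.split())  # penalize long docs
print(two_stage_retrieve("cats purr", docs, dense, rerank,
                         first_pass_k=2, final_k=1))  # → [0]
```

The cost asymmetry is the point: the cross-encoder reads every (query, doc) pair jointly, so restricting it to the top-100 keeps latency bounded while capturing most of the nDCG gain.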

Featured models

Models in this family with our verdicts

  • BGE-M3
  • BGE-Reranker-v2-M3
