paraphrase-multilingual-MiniLM-L12-v2
paraphrase-multilingual-MiniLM-L12-v2 is a 118M-parameter multilingual sentence-transformers embedder built on a knowledge-distilled MiniLM-L12, producing 384-dim vectors across 50+ languages. It is the long-standing default for multilingual semantic similarity work and the multilingual companion to all-MiniLM-L6-v2.
The pragmatic pick when 'multilingual embedding' meets 'sub-150M params.' MTEB multilingual quality is mediocre by 2026 standards and the 128-token context limits it to short passages, but the small footprint and 384-dim output keep it useful for short-passage multilingual RAG on storage-constrained hardware. For production-grade multilingual quality, ship Arctic-embed-l-v2 instead.
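Since input past the 128-token limit is typically truncated rather than embedded, longer documents need to be split before encoding. The following is a minimal sketch of that step, not the model's own preprocessing. It counts whitespace-separated words as a rough stand-in for tokens, which is an assumption: the real tokenizer splits words into subword pieces, so the default word budget is set well below 128. The names `chunk_words`, `max_words`, and `overlap` are hypothetical.

```python
def chunk_words(text: str, max_words: int = 80, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of at most max_words words.

    Words are a crude proxy for tokens (assumption); keep max_words
    comfortably under the model's 128-token limit.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks. For exact budgeting, the word count would be replaced by a count from the model's own tokenizer.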
Strengths
- 50+ language coverage at sub-150M params — best edge-tier multilingual embedder
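- 384-dim output, same as all-MiniLM-L6-v2, so English-only pipelines can add languages without changing the index schema (the corpus still has to be re-embedded, because the two models' vector spaces differ)
- Apache-2.0 licensed, with no commercial-use restrictions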
- ONNX and Transformers.js exports widely available
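As a rough illustration of the points above, here is a minimal sketch of cross-lingual similarity with this model. It assumes the `sentence-transformers` Python package and its `SentenceTransformer.encode` call; the model ID is taken from the source URL on this page. `embed`, `cosine`, and `EXPECTED_DIM` are illustrative names, and the dimension check is only a guard for pipelines whose index is fixed at 384 dims.

```python
import math

MODEL_ID = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
EXPECTED_DIM = 384  # output size stated on this page


def embed(texts):
    """Encode a list of texts; requires `pip install sentence-transformers`."""
    # Imported lazily so the helper below works without the dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID)  # downloads weights on first use
    vectors = model.encode(texts, normalize_embeddings=True)
    if vectors.shape[1] != EXPECTED_DIM:
        raise ValueError(f"expected {EXPECTED_DIM} dims, got {vectors.shape[1]}")
    return vectors.tolist()


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

A typical call would be `en, de = embed(["The weather is nice today.", "Das Wetter ist heute schön."])` followed by `cosine(en, de)`; a pair of translations like this should score noticeably higher than unrelated sentences.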
Weaknesses
- 128-token context is the shortest of any embedder we list
- MTEB multilingual score (~50.4) trails Arctic-embed-l-v2 and jina-v3 by 10+ points
- Older paraphrase-objective training — less optimized for asymmetric query/doc retrieval
- No Matryoshka support, so vectors cannot be truncated and all 384 dims must always be stored
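To put the storage point in numbers, the sketch below estimates raw vector storage for a corpus at the full 384 dims. It assumes float32 values (4 bytes each) by default and ignores any index or metadata overhead, both of which are assumptions rather than properties of a particular vector store. `raw_index_bytes` is a hypothetical helper name.

```python
DIMS = 384  # fixed output size; no Matryoshka truncation available


def raw_index_bytes(num_vectors: int, bytes_per_value: int = 4, dims: int = DIMS) -> int:
    """Raw bytes needed to store num_vectors embeddings, excluding index overhead."""
    if num_vectors < 0 or bytes_per_value <= 0 or dims <= 0:
        raise ValueError("arguments must be non-negative counts and positive sizes")
    return num_vectors * dims * bytes_per_value


# One million float32 vectors: 1,000,000 * 384 * 4 bytes.
print(raw_index_bytes(1_000_000))     # 1536000000 (about 1.5 GB)
# The same corpus with 1-byte (int8) values, if the store supports it.
print(raw_index_bytes(1_000_000, 1))  # 384000000
```

Each float32 vector costs 1,536 bytes, so the only levers for shrinking an index are fewer vectors or a smaller value type, not fewer dimensions.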
Quantization variants
Each quantization trades some model quality for a smaller file and lower VRAM use. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 0.1 GB | 1 GB |
Get the model
HuggingFace: original weights. This is the source repository, so you need to quantize the weights yourself.
Frequently asked
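What's the minimum VRAM to run paraphrase-multilingual-MiniLM-L12-v2?
About 1 GB, the requirement listed for the Q4_K_M quantization.
Can I use paraphrase-multilingual-MiniLM-L12-v2 commercially?
Yes. The model is released under Apache-2.0.
What's the context length of paraphrase-multilingual-MiniLM-L12-v2?
128 tokens.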
Source: huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.