other

0.56B parameters

Commercial OK

Reviewed May 2026

Multilingual E5 Large Instruct

Multilingual E5 Large Instruct is a 560M-parameter XLM-RoBERTa-large encoder fine-tuned by Microsoft researchers (published under the intfloat Hugging Face account) with task instructions prepended to queries, producing 1024-dim embeddings across 100 languages. It scores ~64.4 on the multilingual MTEB and remains the MIT-licensed default for cross-lingual retrieval at sub-1B parameters.

License: MIT · Context: 514 tokens

Our verdict

OP · Fredoline Eruo | Verified May 29, 2026

unrated

Still the workhorse for multilingual retrieval when you need MIT licensing and aren't constrained by the 514-token context. Its instruction-prefix design has been widely adopted by later embedders. For long-document multilingual work in 2026, prefer Arctic-embed-l-v2; otherwise this is the proven default.

Overview

Strengths

1024-dim multilingual embeddings covering 100 languages with MTEB-Multi ~64.4
MIT license — no commercial restrictions, unlike jina-v3
Instruction-conditioned queries enable task switching without retraining
Mature ecosystem: shipped in sentence-transformers, Elasticsearch, OpenSearch, Vespa, llama.cpp
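
Task switching works by changing the instruction text in front of the query, not the model. A minimal sketch of the query template, assuming the "Instruct: ... / Query: ..." format shown on the model's Hugging Face card; the function name and the example task string are illustrative:

```python
def get_detailed_instruct(task_description: str, query: str) -> str:
    """Build the query-side string: one-line task instruction, then the query."""
    return f"Instruct: {task_description}\nQuery: {query}"


# Illustrative task description; swap it to retarget the same model.
task = "Given a web search query, retrieve relevant passages that answer the query"
query_text = get_detailed_instruct(task, "how much protein should a female eat")

# Documents are embedded as plain text, with no instruction prefix.
document_text = "As a general guideline, adult women need about 46 grams of protein per day."
```

Only the query string changes between tasks; the document side stays untouched, so an existing index does not need to be re-embedded when the instruction changes.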

Weaknesses

Only 514-token context (XLM-RoBERTa position limit); most multi-paragraph documents need chunking
Trails Arctic-embed-l-v2 and jina-v3 on long-document multilingual retrieval due to context limit
1024 dims with no Matryoshka — full vector required at storage tier
Instruction prefix discipline is mandatory on the query side; retrieval quality degrades if the instruction is omitted
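
To put the storage cost of full 1024-dim vectors in numbers, here is a rough back-of-the-envelope sketch. It assumes uncompressed float32 values (4 bytes each) and ignores any index overhead; the function name and the one-million-vector corpus are hypothetical:

```python
def index_size_bytes(num_vectors: int, dims: int = 1024, bytes_per_value: int = 4) -> int:
    """Raw vector storage only: vectors x dimensions x bytes per value."""
    return num_vectors * dims * bytes_per_value


# Hypothetical corpus of one million chunks stored as float32.
size = index_size_bytes(1_000_000)
print(f"{size / 1e9:.3f} GB")  # 4.096 GB
```

Because the model offers no Matryoshka truncation, the main lever for shrinking this figure is storing values at lower precision in the vector store, which trades some accuracy for space.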

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

Quantization	File size	VRAM required
Q4_K_M	0.3 GB	1 GB
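
The file size in the table can be sanity-checked with simple arithmetic. This is a rough sketch, not the exact GGUF accounting: the 4.5 bits-per-weight figure is an assumed average for a 4-bit mixed quantization, and the function name is hypothetical.

```python
def estimate_file_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight file size: parameters x bits per weight, in GB (1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# 0.56B parameters at an assumed ~4.5 bits per weight.
q4_size = estimate_file_size_gb(0.56e9, 4.5)

# Same parameters at 16 bits per weight, for comparison with unquantized half precision.
fp16_size = estimate_file_size_gb(0.56e9, 16)
```

Under these assumptions the 4-bit estimate lands near the 0.3 GB listed above, and the half-precision weights come to roughly 1.1 GB. The VRAM figure is higher than the file size because activations and runtime buffers also need memory.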

Get the model

HuggingFace

Original weights

huggingface.co/intfloat/multilingual-e5-large-instruct

Source repository only; you need to quantize the weights yourself.

Hardware that runs this

Cards with enough VRAM for at least one quantization of Multilingual E5 Large Instruct.

NVIDIA B300 (Blackwell Ultra)

Frequently asked

What's the minimum VRAM to run Multilingual E5 Large Instruct?

1 GB of VRAM is enough to run Multilingual E5 Large Instruct at the Q4_K_M quantization (file size 0.3 GB). Higher-quality quantizations need more.

Can I use Multilingual E5 Large Instruct commercially?

Yes. Multilingual E5 Large Instruct ships under the MIT license, which permits commercial use. Always read the license text before deployment.

What's the context length of Multilingual E5 Large Instruct?

Multilingual E5 Large Instruct supports a context window of 514 tokens (about 0.5K).
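
Longer documents have to be split before embedding. The sketch below shows one common approach, overlapping fixed-size windows over token IDs. The function name, the 510-token default (leaving two positions for special tokens), and the 32-token overlap are all assumptions for illustration, and the token IDs would come from whichever tokenizer you run the model with.

```python
from typing import List, Sequence


def chunk_token_ids(token_ids: Sequence[int], max_tokens: int = 510,
                    overlap: int = 32) -> List[List[int]]:
    """Split a token-id sequence into overlapping windows of at most max_tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    step = max_tokens - overlap  # how far each window advances
    chunks: List[List[int]] = []
    for start in range(0, len(token_ids), step):
        chunks.append(list(token_ids[start:start + max_tokens]))
        if start + max_tokens >= len(token_ids):
            break  # this window already reached the end of the document
    return chunks


# Stand-in for real tokenizer output: a 1000-token document.
example_chunks = chunk_token_ids(list(range(1000)))
```

Each chunk is embedded separately as a document (no instruction prefix), and the overlap reduces the chance that a relevant sentence is cut in half at a window boundary.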

Source: huggingface.co/intfloat/multilingual-e5-large-instruct

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.