other
13B parameters
Commercial OK
Reviewed May 2026

mGPT 13B

mGPT-13B is a 13B-parameter GPT-3-style model pretrained on 600 GB of deduplicated text spanning 61 languages across 25 language families, sourced from mC4 and Wikipedia. It is a base model — no instruction tuning, no RLHF. MIT-licensed and commercially usable.

License: MIT · Context: 2,048 tokens

Our verdict

OP · Fredoline Eruo | Verified May 28, 2026
9.2/10

If you need a commercially clean base model with real Russian and broader post-Soviet language coverage, mGPT-13B is one of the few honest options at this size. Do not deploy it raw expecting chat or instruction-following behavior; it will disappoint. The 2,048-token context is a genuine operational constraint worth planning around. It is worth the VRAM only if you intend to fine-tune or have a clear completion-style use case.

Why this rating

Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.15/10. License is explicitly MIT on the HF card and correctly flagged commercial-ok. Parameter count, vendor, family (gpt2/gpt3-style), and multilingual scope all check out against the card. Context length of 2048 is standard for this GPT-2/3 architecture lineage and is a reasonable default though not explicitly stated in the excerpt — a minor hedge but defensible. The description is honest, concrete, and operator-voiced; weaknesses correctly flag the tight context, lack of instruction tuning, thin community, and unverified low-resource quality. Best use case is sharp (Russian/Turkic/Slavic base for fine-tuning) rather than generic. Brand fit is solid but slightly narrow — this is a fine-tuning substrate, not something a typical local-AI operator runs raw, which the verdict honestly acknowledges.

Flags:

  • contextLength 2048 is not explicitly confirmed in the README excerpt; it is inferred from the GPT-2/3 architecture lineage and should be verified against config.json
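
One way to carry out the config.json check flagged above is to read the context length directly from the file. The sketch below assumes a GPT-2-style config, where Hugging Face Transformers stores the value under `n_positions`; the fallback key `max_position_embeddings` and the helper name `context_length_from_config` are illustrative assumptions, and the sample dictionary is made up for demonstration rather than copied from the mGPT-13B repository.

```python
import json
from typing import Optional

# Keys that commonly hold the context length in Hugging Face config.json files.
# GPT-2-style configs use "n_positions"; many other families use
# "max_position_embeddings". Which one mGPT-13B uses is an assumption to verify.
CONTEXT_KEYS = ("n_positions", "max_position_embeddings")


def context_length_from_config(config: dict) -> Optional[int]:
    """Return the first context-length value found, or None if absent."""
    for key in CONTEXT_KEYS:
        value = config.get(key)
        if isinstance(value, int):
            return value
    return None


# Hypothetical config fragment for illustration only (not the real file).
sample = json.loads('{"model_type": "gpt2", "n_positions": 2048}')
found = context_length_from_config(sample)
print("matches listed 2048:", found == 2048)
```

To use it on the real model, you would load the downloaded config.json with `json.load` and pass the resulting dictionary to the helper.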

Strengths

  • Genuine multilingual coverage: 61 languages, 25 families, including Slavic, Turkic, and Dravidian groups
  • Trained on 600 GB of deduplicated data — not a small or hastily assembled corpus
  • MIT license: no commercial restrictions
  • One of the few open 13B base models with serious Russian-language pretraining

Weaknesses

  • 2048-token context window is tight by current standards — expect hard cutoffs on longer documents
  • No instruction tuning: raw completions only, prompt engineering required for any task-shaped output
  • 1,624 HF downloads suggest thin community support, so debugging is largely on you
  • English and high-resource languages likely dominate the corpus; low-resource language quality is unverified beyond perplexity numbers
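
Because a base model only continues text, a task usually has to be phrased as a pattern for it to complete. A minimal sketch of a few-shot prompt builder follows; the function name, the `Text:` and `Translation:` labels, and the example pairs are illustrative assumptions, not a prompt format documented for mGPT-13B.

```python
def build_few_shot_prompt(examples, query, in_label="Text", out_label="Translation"):
    """Format solved examples followed by one unsolved example.

    The prompt ends right after the output label so that the model's
    continuation is the answer itself.
    """
    blocks = [f"{in_label}: {src}\n{out_label}: {dst}" for src, dst in examples]
    blocks.append(f"{in_label}: {query}\n{out_label}:")
    return "\n\n".join(blocks)


# Hypothetical English-to-Russian pairs for illustration.
examples = [("Good morning", "Доброе утро"), ("Thank you", "Спасибо")]
prompt = build_few_shot_prompt(examples, "See you tomorrow")
print(prompt)
```

In practice you would probably also stop generation at the first blank line, since a base model tends to keep producing further examples after the answer.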

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

Quantization    File size    VRAM required
Q4_K_M          7.2 GB       10 GB
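
The listed figures can be sanity-checked with simple arithmetic. The sketch below derives the implied bits per weight and the headroom between file size and the listed VRAM figure. It assumes exactly 13 billion parameters and decimal gigabytes (10^9 bytes), neither of which the page states, and it treats the headroom as a rough allowance for the KV cache and runtime buffers rather than a measured value.

```python
PARAMS = 13e9        # assumption: "13B" taken as exactly 13 billion parameters
FILE_GB = 7.2        # Q4_K_M file size as listed
VRAM_GB = 10.0       # VRAM requirement as listed
BYTES_PER_GB = 1e9   # assumption: decimal gigabytes


def bits_per_weight(file_gb: float, params: float) -> float:
    """Average number of bits stored per parameter for a given file size."""
    return file_gb * BYTES_PER_GB * 8 / params


def headroom_gb(vram_gb: float, file_gb: float) -> float:
    """VRAM left over after the weights are loaded."""
    return vram_gb - file_gb


print(f"implied bits per weight: {bits_per_weight(FILE_GB, PARAMS):.2f}")
print(f"headroom beyond weights: {headroom_gb(VRAM_GB, FILE_GB):.1f} GB")
```

Under the same assumptions, unquantized 16-bit weights would take about 26 GB (13e9 parameters times 2 bytes), which is why a 4-bit variant is the practical entry point on consumer cards.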

Get the model

HuggingFace

Original weights

huggingface.co/ai-forever/mGPT-13B

Source repository. No pre-quantized files are linked here, so you need to quantize the weights yourself.

Frequently asked

What's the minimum VRAM to run mGPT 13B?

10 GB of VRAM is enough to run mGPT 13B at the Q4_K_M quantization (file size 7.2 GB). Higher-quality quantizations need more.

Can I use mGPT 13B commercially?

Yes. mGPT 13B ships under the MIT license, which permits commercial use. Always read the license text before deployment.

What's the context length of mGPT 13B?

mGPT 13B supports a context window of 2,048 tokens (about 2K).
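
For inputs longer than the window, one common workaround is to split the token sequence into overlapping chunks that each leave room for generated tokens. This is a generic sketch operating on a plain list of token IDs; the function name and the default `reserve` and `overlap` values are illustrative assumptions, and it does not depend on any particular tokenizer.

```python
from typing import List

MAX_CONTEXT = 2048  # context window listed on this page


def chunk_tokens(token_ids: List[int], reserve: int = 256, overlap: int = 64) -> List[List[int]]:
    """Split token_ids into windows of at most MAX_CONTEXT - reserve tokens.

    reserve: tokens kept free for the model's generated continuation.
    overlap: tokens repeated between consecutive chunks for continuity.
    """
    window = MAX_CONTEXT - reserve
    if window <= overlap:
        raise ValueError("reserve and overlap leave no room for new tokens")
    step = window - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # the current chunk already reaches the end of the input
    return chunks


# Stand-in token IDs; a real pipeline would get these from the tokenizer.
chunks = chunk_tokens(list(range(5000)))
print(len(chunks), [len(c) for c in chunks])
```

Chunking keeps each request within the limit, but the model still cannot attend across chunk boundaries beyond the overlap, so tasks that need whole-document context remain a poor fit.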

Source: huggingface.co/ai-forever/mGPT-13B

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.
