
Codestral 22B

Mistral's coding specialist. Strong fill-in-the-middle for IDE autocompletion. Personal/research use only.

License: Mistral Non-Production License · Released May 29, 2024 · Context: 32,768 tokens
Our verdict
By Fredoline Eruo · Last verified May 6, 2026
7.9/10
Positioning

Codestral 22B is Mistral's dedicated coding model. The headline feature was strong fill-in-the-middle on a wide language matrix (80+ languages) — still relevant, especially for non-mainstream language work where Qwen 2.5 Coder's Python/JS bias shows.

Strengths
  • Mistral Non-Production License is more permissive than the Qwen license for non-commercial, research, and internal use.
  • Strong on long-tail languages — Lua, Erlang, Haskell, Clojure handled better than Qwen 2.5 Coder.
  • Low VRAM for a coder — 13 GB at Q4_K_M fits comfortably on 16 GB cards.
Limitations
  • License is not Apache — commercial deployment requires a separate Mistral commercial license.
  • Qwen 2.5 Coder 32B is materially stronger on the mainstream language pairs (Python, JS, TS, Go).
  • No FIM-via-chat ergonomics — works best with editor plugins that issue raw FIM requests.
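In practice, "raw FIM requests" means the editor sends the text before and after the cursor as separate fields instead of a chat message. A minimal sketch of the request body a plugin might send to Ollama's /api/generate endpoint — the suffix field is what triggers fill-in-the-middle on FIM-capable models; field names follow Ollama's API documentation, but verify against your installed version:

```python
import json

# Text around the cursor position in the editor buffer.
prefix = "def fib(n):\n    "
suffix = "\n    return a"

# Raw FIM request body for Ollama's /api/generate endpoint.
# "prompt" is the text before the cursor, "suffix" the text after it.
payload = {
    "model": "codestral:22b-v0.1-q4_K_M",
    "prompt": prefix,
    "suffix": suffix,
    "stream": False,
    "options": {"num_predict": 64, "temperature": 0.2},
}

body = json.dumps(payload)
print(body)
```

The model's completion is expected to splice between prefix and suffix; a chat-style "complete this code" prompt gives noticeably worse infill results, which is why editor-plugin support matters here.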
Real-world performance on RTX 4090
  • Q4_K_M (13 GB): 70–88 tok/s decode, TTFT ~110 ms — full GPU
  • Q5_K_M (15.4 GB): 60–74 tok/s
  • Q8_0 (23.3 GB): 40–50 tok/s
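Those throughput figures translate directly into wall-clock latency for a completion. A back-of-envelope sketch using the Q4_K_M numbers above (TTFT and the decode rate are the measured values from this review; the 200-token completion length is an illustrative choice, not a benchmark):

```python
# Wall-clock estimate: time to first token + decode time.
ttft_s = 0.110           # ~110 ms TTFT (Q4_K_M, RTX 4090)
decode_tok_per_s = 70    # lower bound of the 70-88 tok/s range
completion_tokens = 200  # illustrative completion length

total_s = ttft_s + completion_tokens / decode_tok_per_s
print(f"{total_s:.2f} s")  # ~2.97 s at 70 tok/s
```

At the upper bound of 88 tok/s the same completion lands in roughly 2.4 s, which is why Q4_K_M is the sweet spot for interactive autocompletion on this card.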
Should you run this locally?

Yes, for non-commercial coding work in long-tail languages, or for 16 GB GPU owners who want a dedicated coder without partial offload. No, for mainstream Python/JS coding, where Qwen 2.5 Coder 32B is materially stronger, or for commercial deployment without the separate Mistral commercial license.

How it compares
  • vs Qwen 2.5 Coder 32B → Qwen wins on capability for Python/JS/TS; Codestral wins on long-tail language coverage and license simplicity for non-commercial use.
  • vs DeepSeek Coder V2 Lite (16B) → Codestral 22B is stronger in absolute capability; DeepSeek Coder V2 Lite uses less VRAM.
  • vs CodeGemma 7B → Codestral 22B is much more capable; CodeGemma is the right pick under 8 GB VRAM.
Run this yourself
ollama pull codestral:22b-v0.1-q4_K_M
ollama run codestral:22b-v0.1-q4_K_M
Settings: Q4_K_M GGUF, 32,768 ctx, full GPU on RTX 4080 / 4090
Editor: Continue.dev with FIM endpoint enabled
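To wire the model into Continue.dev as a tab-autocomplete backend, a config entry along these lines should work — key names follow Continue's config.json schema as of this writing, so verify them against the current Continue documentation:

```json
{
  "tabAutocompleteModel": {
    "title": "Codestral 22B",
    "provider": "ollama",
    "model": "codestral:22b-v0.1-q4_K_M"
  }
}
```

With the Ollama provider, Continue issues raw FIM requests under the hood, which is the mode Codestral performs best in.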
Why this rating

7.9/10 — Mistral's coding specialist. Excellent fill-in-the-middle, and the cleanest license in the dedicated-coder space for non-commercial use, but Qwen 2.5 Coder 32B has decisively overtaken it on raw capability. It loses points for being one tier behind on quality.

Strengths

  • Fast FIM completion
  • 80+ languages

Weaknesses

  • Not for commercial use
  • Behind Qwen 2.5 Coder on most benchmarks

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

Quantization | File size | VRAM required
Q4_K_M       | 13.0 GB   | 16 GB
Q8_0         | 23.0 GB   | 26 GB
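The VRAM column is roughly file size plus headroom for the KV cache and runtime buffers. A rule-of-thumb sketch — the flat 3 GB overhead reproduces the table's figures but is a simplification, and real usage grows with context length:

```python
# Rough VRAM rule of thumb: model weights (file size) plus a flat
# allowance for KV cache, activations and runtime buffers.
# The 3 GB overhead is an illustrative assumption, not a measurement.
def vram_estimate_gb(file_size_gb: float, overhead_gb: float = 3.0) -> float:
    return file_size_gb + overhead_gb

for quant, size_gb in {"Q4_K_M": 13.0, "Q8_0": 23.0}.items():
    print(quant, vram_estimate_gb(size_gb), "GB")
```

Running the model at the full 32,768-token context pushes the KV cache well past a flat allowance, so treat the table as a floor rather than a guarantee.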

Get the model

Ollama

One-line install

ollama run codestral:22b
Read our Ollama review →

HuggingFace

Original weights

huggingface.co/mistralai/Codestral-22B-v0.1

Source repository; you will need to quantize the weights yourself (e.g. to GGUF).


Frequently asked

What's the minimum VRAM to run Codestral 22B?

16 GB of VRAM is enough to run Codestral 22B at the Q4_K_M quantization (file size 13.0 GB). Higher-quality quantizations need more.

Can I use Codestral 22B commercially?

Codestral 22B is released under the Mistral Non-Production License, which has restrictions for commercial use. Review the license terms before using it in a product.

What's the context length of Codestral 22B?

Codestral 22B supports a context window of 32,768 tokens (32K).

How do I install Codestral 22B with Ollama?

Run `ollama pull codestral:22b` to download, then `ollama run codestral:22b` to start a chat session. The default quantization is Q4_K_M.

Source: huggingface.co/mistralai/Codestral-22B-v0.1

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.