exaone
32B parameters
Restricted
Reviewed May 2026

EXAONE 4.0.1 32B

EXAONE 4.0.1 is a 32B model from LG AI Research with a 131K context window and a hybrid sliding-window/full-attention architecture. It runs in either standard chat mode or an explicit reasoning mode, and handles English, Korean, and Spanish. Tool-use for agentic pipelines is built in.

License: other·Context: 131,072 tokens
BLK · VERDICT

Our verdict

OP · Fredoline Eruo|VERIFIED MAY 28, 2026
9.2/10

If you need strong Korean language handling plus a reasoning mode in one 32B package, EXAONE 4.0.1 is technically interesting. The hybrid attention and 131K context are real advantages for long-document work. That said, the commercial restriction is a hard blocker for most operators — this is research / internal tooling territory only. Hedge: worth testing if you're running non-commercial Korean pipelines on vLLM, but don't build a product on it until you have written clearance from LG.

Why this rating

Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.15/10. License is explicit 'exaone' custom license, correctly flagged non-commercial with honest weakness about needing written clearance. All metadata (32B, 131K context, hybrid attention, Korean/English/Spanish) is directly verifiable from the card. Editorial voice is operator-grade — names the patch nature, calls out narrow inference support, and hedges appropriately. Use case is specific (Korean-English bilingual reasoning + agentic, non-commercial). Brand fit is slightly weaker since the commercial restriction limits the runlocalai audience to internal/research users, but the row handles this honestly rather than hiding it.

Overview

EXAONE 4.0.1 is a 32B model from LG AI Research with a 131K context window and a hybrid sliding-window/full-attention architecture. It runs in either standard chat mode or an explicit reasoning mode, and handles English, Korean, and Spanish. Tool-use for agentic pipelines is built in.

Strengths

  • Switchable reasoning mode — no separate model needed for chain-of-thought tasks
  • 131K token context window via hybrid attention (sliding window + full attention)
  • Native Korean support alongside English and Spanish
  • Agentic tool-use built into the base model

Weaknesses

  • Not commercially usable without explicit permission from LG AI Research — check the EXAONE license before any production deployment
  • Inference engine support is narrow: vLLM and TensorRT-LLM confirmed, others untested
  • Low community adoption so far (6.6K downloads, 27 likes) — limited real-world reports to draw on
  • 4.0.1 is a patch release specifically to reduce unintended outputs; the underlying issue is mitigated, not fully resolved

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

QuantizationFile sizeVRAM required
Q4_K_M17.6 GB23 GB

Get the model

HuggingFace

Original weights

huggingface.co/LGAI-EXAONE/EXAONE-4.0.1-32B

Source repository — direct quantization required.

Hardware that runs this

Cards with enough VRAM for at least one quantization of EXAONE 4.0.1 32B.

Compare alternatives

Models worth comparing

Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.

Frequently asked

What's the minimum VRAM to run EXAONE 4.0.1 32B?

23GB of VRAM is enough to run EXAONE 4.0.1 32B at the Q4_K_M quantization (file size 17.6 GB). Higher-quality quantizations need more.

Can I use EXAONE 4.0.1 32B commercially?

EXAONE 4.0.1 32B is released under the other, which has restrictions for commercial use. Review the license terms before using it in a product.

What's the context length of EXAONE 4.0.1 32B?

EXAONE 4.0.1 32B supports a context window of 131,072 tokens (about 131K).

Source: huggingface.co/LGAI-EXAONE/EXAONE-4.0.1-32B

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.

Related — keep moving

Before you buy

Verify EXAONE 4.0.1 32B runs on your specific hardware before committing money.