EXAONE 4.0.1 32B

EXAONE 4.0.1 is a 32B model from LG AI Research with a 131K context window and a hybrid sliding-window/full-attention architecture. It runs in either standard chat mode or an explicit reasoning mode, and handles English, Korean, and Spanish. Tool-use for agentic pipelines is built in.

License: other·Context: 131,072 tokens

BLK · VERDICT

Our verdict

OP · Fredoline Eruo|VERIFIED MAY 28, 2026

9.2/10

If you need strong Korean language handling plus a reasoning mode in one 32B package, EXAONE 4.0.1 is technically interesting. The hybrid attention and 131K context are real advantages for long-document work. That said, the commercial restriction is a hard blocker for most operators — this is research / internal tooling territory only. Hedge: worth testing if you're running non-commercial Korean pipelines on vLLM, but don't build a product on it until you have written clearance from LG.

›Why this rating

Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.15/10. License is explicit 'exaone' custom license, correctly flagged non-commercial with honest weakness about needing written clearance. All metadata (32B, 131K context, hybrid attention, Korean/English/Spanish) is directly verifiable from the card. Editorial voice is operator-grade — names the patch nature, calls out narrow inference support, and hedges appropriately. Use case is specific (Korean-English bilingual reasoning + agentic, non-commercial). Brand fit is slightly weaker since the commercial restriction limits the runlocalai audience to internal/research users, but the row handles this honestly rather than hiding it.

Overview

Strengths

Switchable reasoning mode — no separate model needed for chain-of-thought tasks
131K token context window via hybrid attention (sliding window + full attention)
Native Korean support alongside English and Spanish
Agentic tool-use built into the base model

Weaknesses

Not commercially usable without explicit permission from LG AI Research — check the EXAONE license before any production deployment
Inference engine support is narrow: vLLM and TensorRT-LLM confirmed, others untested
Low community adoption so far (6.6K downloads, 27 likes) — limited real-world reports to draw on
4.0.1 is a patch release specifically to reduce unintended outputs; the underlying issue is mitigated, not fully resolved