GPT-NeoX 20B
GPT-NeoX-20B is a 20B-parameter English autoregressive model from EleutherAI, trained on the 825 GiB Pile dataset. It uses a GPT-3-style transformer architecture and ships under Apache 2.0. There is no instruction tuning or chat fine-tuning — this is a raw base model.
If you're in the Korean hub looking for a production-ready or conversational model, skip this one: it has no Korean language support and no instruction tuning. Its 2048-token context and its age (released in 2022) mean newer alternatives outperform it while using less VRAM. It still has value as a fine-tuning base if you specifically need an Apache-licensed 20B English foundation, but that is a narrow use case. It is only worth the ~40 GB of VRAM that the unquantized FP16 weights need if you have a specific fine-tuning experiment in mind.
Why this rating
Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.10/10. License (Apache 2.0) is explicitly verified in the HF card with commercial use permitted. Metadata (20B params, 2048 context, EleutherAI vendor, GPT-NeoX architecture) all match the model card precisely. The description and verdict are honest, operator-voiced, and correctly flag this as a base model with no Korean support and outdated context length. The useCases tag listing 'korean' is contradictory with the row's own honest assessment that it has zero Korean capability — this is a real concern but the verdict explicitly warns Korean-hub readers away, which is the right editorial call. Brand fit is moderate: a 2022-era 40GB English base model is niche for local-AI builders, but the row honestly scopes it to fine-tuning research.
Flags:
- useCases includes 'korean', which directly contradicts the row's own weakness ('English-only — no Korean language capability'); it should be removed
- Narrow practical audience: most runlocalai readers won't fine-tune a 20B base model; the verdict appropriately hedges this
Strengths
- 20B parameters with broad English knowledge from the Pile dataset
- Apache 2.0 license: commercial use permitted, subject to the license's attribution and notice requirements
- Reasonable zero-shot NLP benchmark performance for a 2022-era open model
- Well-documented and widely tested; 582k+ HF downloads
Weaknesses
- English-only — no Korean language capability
- 2048-token context is short by current standards
- No instruction following or chat alignment — requires prompting expertise or fine-tuning
- Largely superseded by newer open models at similar or smaller parameter counts
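Because a base model only continues text, the task usually has to be shown through worked examples rather than stated as a command. A minimal sketch of a few-shot prompt builder follows; the function name `build_few_shot_prompt`, the `Input`/`Output` labels, and the sample sentiment task are illustrative assumptions, not something taken from the model card.

```python
def build_few_shot_prompt(examples, query, input_label="Input", output_label="Output"):
    """Format (input, output) pairs followed by an open-ended query.

    A base model has no notion of instructions, so the pattern itself
    has to show what kind of continuation is wanted.
    """
    blocks = [
        f"{input_label}: {source}\n{output_label}: {target}"
        for source, target in examples
    ]
    # Leave the final output slot empty so the model completes it.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


# Hypothetical sentiment-labelling task, for illustration only.
examples = [
    ("The battery died after an hour.", "negative"),
    ("Setup took two minutes and it just worked.", "positive"),
]
prompt = build_few_shot_prompt(examples, "The screen is sharp but the fan is loud.")
print(prompt)
```

The prompt and the generated tokens together have to fit in the 2048-token context, so the number of examples is limited. Stopping generation at the first blank line is a common way to keep the model from inventing further examples.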
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 11.0 GB | 14 GB |
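The figures in the table can be sanity-checked with simple arithmetic: weight size is roughly parameter count times bits per weight, divided by eight. The sketch below assumes a nominal 20e9 parameters (the exact count differs slightly) and decimal gigabytes; the helper names are hypothetical.

```python
def weight_size_gb(num_params, bits_per_weight):
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


def implied_bits_per_weight(file_size_gb, num_params):
    """Average bits per weight implied by a given file size."""
    return file_size_gb * 1e9 * 8 / num_params


NUM_PARAMS = 20e9  # nominal "20B"; an assumption for this estimate

fp16_gb = weight_size_gb(NUM_PARAMS, 16)
q4_bits = implied_bits_per_weight(11.0, NUM_PARAMS)  # file size from the table
overhead_gb = 14.0 - 11.0  # VRAM required minus file size, from the table

print(f"FP16 weights: ~{fp16_gb:.0f} GB")
print(f"Q4_K_M row implies ~{q4_bits:.1f} bits per weight")
print(f"Headroom for KV cache and buffers: {overhead_gb:.0f} GB")
```

The gap between file size and required VRAM covers the KV cache and runtime buffers. It grows with the context actually in use, so treat the table's VRAM figure as a starting estimate rather than a hard limit.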
Get the model
HuggingFace
Original weights
Source repository — direct quantization required.
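Since the repository holds only the original weights, one way to try the model before quantizing is to load it in FP16 with Hugging Face `transformers`. This is a sketch rather than a tested recipe for this page: it assumes `torch`, `transformers`, and `accelerate` are installed and that roughly 40 GB of combined GPU and CPU memory is available. The function names `load_fp16` and `complete` are hypothetical.

```python
MODEL_ID = "EleutherAI/gpt-neox-20b"  # repository named in the Source line


def load_fp16(model_id=MODEL_ID):
    """Load the tokenizer and FP16 weights; needs roughly 40 GB of memory."""
    # Imported lazily so this file can be read or imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # halves memory relative to FP32
        device_map="auto",          # requires the `accelerate` package
    )
    return tokenizer, model


def complete(tokenizer, model, prompt, max_new_tokens=64):
    """Continue `prompt`; a base model completes text, it does not follow instructions."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `load_fp16()` downloads the full checkpoint on first use, so check disk space as well as memory before running it.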
Frequently asked
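What's the minimum VRAM to run GPT-NeoX 20B?
About 14 GB for the Q4_K_M quantization listed above; the unquantized FP16 weights need roughly 40 GB.
Can I use GPT-NeoX 20B commercially?
Yes. It ships under Apache 2.0, which permits commercial use.
What's the context length of GPT-NeoX 20B?
2048 tokens.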
Source: huggingface.co/EleutherAI/gpt-neox-20b
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.