gpt2-base-french
A 124M-parameter GPT-2 base model trained on French Wikipedia (wiki40b/fr) and a CC-100/fr subset, with a 50,000-token BPE vocabulary. It generates French text but has no instruction-following capability. Context window is capped at 1024 tokens.
This is a bare GPT-2 base model: useful as a research baseline or a starting point for fine-tuning, not as a drop-in solution for any user-facing task. At 124M parameters it runs on almost any hardware, but output quality reflects that size. The CC-BY-SA-4.0 license permits commercial use, but its share-alike clause requires derivatives to be released under the same license, which may complicate some production deployments. Skip unless you specifically need a lightweight French-language base model to fine-tune on your own data.
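Because the model only continues text, a prompt works best when written as the opening of a document rather than as an instruction. The sketch below assumes the Hugging Face `transformers` library and the repository ID from the Source line; the helper names `continuation_prompt` and `generate_french` and the sampling settings are illustrative, not taken from the model card.

```python
MODEL_ID = "ClassCat/gpt2-base-french"  # repository named in the Source line


def continuation_prompt(title: str, opening: str) -> str:
    """Frame a request as the start of a document for a base model to continue."""
    return f"{title}\n\n{opening}"


def generate_french(prompt: str, max_new_tokens: int = 40) -> str:
    """Continue `prompt` with sampled French text (downloads weights on first use)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    # Sampling settings are an assumption; tune them for your use case.
    result = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True, top_p=0.95)
    return result[0]["generated_text"]


if __name__ == "__main__":
    print(generate_french(continuation_prompt("Paris", "Paris est la capitale de")))
```

Output varies between runs because sampling is enabled.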
Why this rating
Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.00/10. The row is technically clean: license matches HF metadata, params and context are correct for GPT-2 base, and the description flags the license restriction and the lack of instruction-tuning. Editorial voice is appropriately skeptical and the verdict tells readers to skip unless they specifically need a French base to fine-tune. However, brand fit is weak: a 124M raw GPT-2 base with no instruction tuning, a CC-BY-SA share-alike requirement, and only 7k downloads is a research artifact, not something a local-AI operator would deploy. Modern French alternatives (Mistral, Croissant, etc.) dominate this niche. The overall sits just under the 9.0 bar primarily on brand fit.
Flags:
- Marginal brand fit: raw GPT-2 base with no instruction tuning has limited utility for runlocalai's operator audience
- CC-BY-SA-4.0 is share-alike, not strictly non-commercial; 'commercial use prohibited' is slightly overstated, since commercial use is allowed if derivatives are also CC-BY-SA
- Low community validation (7k downloads, 6 likes) for a model that has existed for years
Strengths
- Tiny footprint at 124M params — runs on virtually any hardware
- Trained on two distinct French corpora (wiki40b/fr + CC-100/fr)
- 50k BPE vocabulary sized for French, not adapted from an English tokenizer
- CC-BY-SA-4.0 license is clear and permits commercial use under share-alike terms
Weaknesses
- Not instruction-tuned — will not follow prompts or answer questions reliably
- 1024-token context is short by current standards
- CC-BY-SA-4.0 share-alike clause requires derivatives to carry the same license, which can constrain commercial use
- 7k downloads and 6 likes suggest minimal community validation or real-world testing
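The 1024-token limit covers the prompt and the generated tokens together, so long inputs need trimming before generation. The following is a minimal sketch of one way to budget for that, assuming the prompt is already tokenized into a list of IDs and that keeping the most recent tokens is acceptable; `fit_to_context` is a hypothetical helper, not part of any library.

```python
CONTEXT_WINDOW = 1024  # context length stated for this model


def fit_to_context(
    token_ids: list[int],
    max_new_tokens: int,
    context_window: int = CONTEXT_WINDOW,
) -> list[int]:
    """Trim a tokenized prompt so prompt + generated tokens fit in the window."""
    if not 0 < max_new_tokens < context_window:
        raise ValueError("max_new_tokens must be between 1 and context_window - 1")
    budget = context_window - max_new_tokens
    # Keep the tail of the prompt: the most recent tokens matter most
    # when the model is continuing a text.
    return token_ids[-budget:]
```

For example, with `max_new_tokens=24` a 2000-token prompt is cut to its last 1000 tokens, while a prompt shorter than the budget passes through unchanged.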
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 0.1 GB | 1 GB |
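The file-size figure can be sanity-checked with simple arithmetic: parameters times bits per weight, divided by eight. The sketch below assumes roughly 4.85 bits per weight for Q4_K_M, which is an approximate average and not a figure from this page; small models with large embedding tables can deviate from it.

```python
def estimated_file_size_gb(n_params: int, bits_per_weight: float) -> float:
    """Rough weight-file size in GB (1 GB = 1e9 bytes), ignoring metadata."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 124_000_000       # parameter count stated for this model
Q4_K_M_BITS = 4.85           # assumed average bits per weight, approximate

if __name__ == "__main__":
    print(f"Q4_K_M: {estimated_file_size_gb(N_PARAMS, Q4_K_M_BITS):.3f} GB")
    print(f"FP16:   {estimated_file_size_gb(N_PARAMS, 16):.3f} GB")
```

Under these assumptions the Q4_K_M estimate is about 0.075 GB, which rounds to the 0.1 GB shown in the table. The 1 GB VRAM figure is much larger than the weights alone, presumably to leave headroom for runtime overhead.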
Get the model
HuggingFace: original weights (source repository). No pre-quantized files are linked, so you need to quantize the weights yourself.
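Quantizing from the original weights could look like the two-step sketch below. It assumes the llama.cpp tooling (`convert_hf_to_gguf.py` and `llama-quantize`), which this page does not name, and that the converter accepts this checkpoint and its tokenizer, which has not been verified here. The directory and file names are hypothetical, and the function only builds the commands without running them.

```python
from pathlib import Path


def quantization_commands(model_dir: str, out_dir: str, quant: str = "Q4_K_M") -> list[list[str]]:
    """Build (but do not run) the convert and quantize commands for llama.cpp."""
    out = Path(out_dir)
    f16_path = out / "gpt2-base-french-f16.gguf"           # hypothetical file name
    quant_path = out / f"gpt2-base-french-{quant}.gguf"    # hypothetical file name
    # Step 1: convert the Hugging Face checkpoint to a 16-bit GGUF file.
    convert = [
        "python", "convert_hf_to_gguf.py", str(model_dir),
        "--outfile", str(f16_path), "--outtype", "f16",
    ]
    # Step 2: quantize the GGUF file to the requested type.
    quantize = ["llama-quantize", str(f16_path), str(quant_path), quant]
    return [convert, quantize]


if __name__ == "__main__":
    for cmd in quantization_commands("./gpt2-base-french", "./out"):
        print(" ".join(cmd))
```

Each command can be passed to `subprocess.run(cmd, check=True)` from inside a llama.cpp checkout once the paths match your setup.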
Frequently asked
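What's the minimum VRAM to run gpt2-base-french?
About 1 GB, per the Q4_K_M row in the quantization table.
Can I use gpt2-base-french commercially?
Yes, with conditions: CC-BY-SA-4.0 permits commercial use provided you give attribution and release derivatives under the same license.
What's the context length of gpt2-base-french?
1024 tokens.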
Source: huggingface.co/ClassCat/gpt2-base-french
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.