Whisper Base
74M-parameter Whisper variant — roughly 2x the params of tiny for ~25-30% relative WER reduction. The standard pick for CPU realtime transcription with acceptable quality.
The pragmatic default. If you don't know which Whisper to ship, ship base — then benchmark against distil-large-v3 if you have a GPU budget.
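To make the "relative WER reduction" figure concrete, here is a minimal arithmetic sketch. The baseline of 10% is a hypothetical number chosen for easy reading, not a measured WER for Whisper tiny, and `apply_relative_reduction` is an illustrative helper name.

```python
def apply_relative_reduction(baseline_wer: float, relative_reduction: float) -> float:
    """Return the WER that remains after cutting baseline_wer by a relative fraction."""
    return baseline_wer * (1.0 - relative_reduction)


# Hypothetical baseline for illustration only, NOT a benchmark result for tiny.
tiny_wer = 0.10

for reduction in (0.25, 0.30):
    base_wer = apply_relative_reduction(tiny_wer, reduction)
    print(f"{reduction:.0%} relative reduction: {tiny_wer:.1%} -> {base_wer:.1%}")
```

A relative reduction scales the baseline, so the absolute gain shrinks as the baseline gets better: the same 25-30% is worth fewer percentage points on clean audio than on noisy audio.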
Strengths
- Best size/accuracy trade-off in the Whisper family for CPU inference
- Same 99-language coverage as larger Whisper variants
- Apache-2.0, drop-in for whisper.cpp and faster-whisper
- Quantizes well to int8 with minimal WER loss
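As a rough illustration of why int8 tends to cost little accuracy, the sketch below round-trips a few weights through symmetric per-tensor int8 quantization. This is a generic textbook scheme, not the exact method used by whisper.cpp or faster-whisper, and the weight values are made up.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float weights from int8 values and the shared scale."""
    return [v * scale for v in quantized]


# Made-up weights for illustration; real tensors hold thousands to millions of values.
weights = [0.82, -0.41, 0.05, -1.27, 0.33]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Rounding to the nearest step bounds the error by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, f"scale={scale:.4f}", f"max_err={max_err:.5f}")
```

Each weight moves by at most half a step (`scale / 2`), which is small relative to the largest weight in the tensor. Whether that translates into negligible WER loss still has to be measured on your own audio.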
Weaknesses
- Still hallucinates on silence/music — needs VAD pre-filter
- Non-English WER lags large variants significantly
- Works on fixed 30 s windows, so long-form audio must be chunked
- Slower than Distil-Whisper at equivalent accuracy targets
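The first and third weaknesses can be sketched together: split audio into 30 s windows and skip windows that contain no signal before they reach the model. This is a simplified stand-in under stated assumptions. The energy gate below is much cruder than a trained VAD, the `threshold` value is arbitrary, and fixed boundaries can cut a word in half, so treat `chunk_bounds` and `has_speech_energy` as hypothetical helpers rather than a production pipeline.

```python
import math
from typing import List, Sequence, Tuple

SAMPLE_RATE = 16_000   # Whisper models expect 16 kHz mono audio
CHUNK_SECONDS = 30     # Whisper's fixed input window


def chunk_bounds(num_samples: int, chunk_seconds: int = CHUNK_SECONDS,
                 sample_rate: int = SAMPLE_RATE) -> List[Tuple[int, int]]:
    """Split [0, num_samples) into consecutive windows of at most chunk_seconds."""
    step = chunk_seconds * sample_rate
    return [(start, min(start + step, num_samples))
            for start in range(0, num_samples, step)]


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame (0.0 for an empty frame)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame)) if frame else 0.0


def has_speech_energy(samples: Sequence[float], threshold: float = 0.01,
                      frame_ms: int = 30, sample_rate: int = SAMPLE_RATE) -> bool:
    """Crude energy gate: True if any frame's RMS exceeds the threshold."""
    frame_len = sample_rate * frame_ms // 1000
    return any(rms(samples[i:i + frame_len]) > threshold
               for i in range(0, len(samples), frame_len))


# Synthetic 70 s clip: 30 s of silence, then 40 s of a 440 Hz tone standing in for speech.
audio = [0.0] * (30 * SAMPLE_RATE) + [
    0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
    for n in range(40 * SAMPLE_RATE)
]

# Only windows that pass the gate would be sent to the model.
to_transcribe = [(s, e) for s, e in chunk_bounds(len(audio))
                 if has_speech_energy(audio[s:e])]
print(to_transcribe)
```

The silent first window is dropped and the two remaining windows are kept. An energy gate will still pass music, which is one reason a trained VAD is the usual choice in practice.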
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | < 0.1 GB | 1 GB |
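For a model this small, sizes are easier to read in megabytes than gigabytes. The sketch below estimates raw weight storage from the 74M parameter count quoted above. It uses nominal bit widths and ignores per-block scales, metadata, and any tensors a runtime keeps at higher precision, so real files will be somewhat larger.

```python
PARAMS = 74_000_000  # parameter count quoted for Whisper Base


def weight_size_mb(params: int, bits_per_weight: float) -> float:
    """Raw weight storage in megabytes (10^6 bytes), ignoring format overhead."""
    return params * bits_per_weight / 8 / 1e6


# Nominal bit widths; actual quantization formats add overhead on top of these.
for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    mb = weight_size_mb(PARAMS, bits)
    print(f"{label:>5}: {mb:6.1f} MB, shown as {mb / 1000:.1f} GB at one decimal")
```

At 4 bits the raw weights come to about 37 MB, which displays as zero when rounded to one decimal place in GB.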
Get the model
HuggingFace: original weights (huggingface.co/openai/whisper-base). This is the source repository, so you need to quantize the weights yourself.
Hardware that runs this
Cards with enough VRAM for at least one quantization of Whisper Base.
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Frequently asked
What's the minimum VRAM to run Whisper Base?
Can I use Whisper Base commercially?
What's the context length of Whisper Base?
Source: huggingface.co/openai/whisper-base
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.