Distil-Whisper Large v3
756M-param distilled Whisper-large-v3 with the decoder shrunk from 32 to 2 layers. ~6.3x faster than the teacher at near-parity WER on long-form English (1% absolute gap on out-of-distribution sets per the model card).
Best accuracy-to-throughput trade-off for English ASR in the open-source Whisper lineage. A sensible default for English-only batch jobs.
Strengths
- ~6.3x faster than whisper-large-v3, within about 1% absolute WER on long-form English
- Strong long-form transcription via chunked/sequential algorithms
- First-class support in transformers, faster-whisper, whisper.cpp, and MLX
- MIT license, fully commercial
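As an illustration of the chunked long-form path through transformers, here is a minimal sketch. It assumes the `transformers` library (plus PyTorch and ffmpeg) is installed; the chunk length and batch size are illustrative assumptions rather than tuned values, and `build_pipeline_kwargs` and `transcribe` are hypothetical helper names.

```python
from typing import Any, Dict

MODEL_ID = "distil-whisper/distil-large-v3"  # repo named in the Source line


def build_pipeline_kwargs(use_gpu: bool) -> Dict[str, Any]:
    """Collect arguments for transformers.pipeline (values are assumptions)."""
    return {
        "task": "automatic-speech-recognition",
        "model": MODEL_ID,
        "device": 0 if use_gpu else -1,  # 0 = first CUDA device, -1 = CPU
        "chunk_length_s": 25,  # assumed chunk length for chunked decoding
        "batch_size": 8,       # assumed; tune to the VRAM you have
    }


def transcribe(audio_path: str, use_gpu: bool = False) -> str:
    """Transcribe one audio file; the weights download on the first call."""
    from transformers import pipeline  # lazy import: heavy optional dependency

    asr = pipeline(**build_pipeline_kwargs(use_gpu))
    return asr(audio_path)["text"]
```

Calling `transcribe("meeting.wav", use_gpu=True)` (the file name is a placeholder) would return the transcript as a single string. Keeping the arguments in a plain dictionary makes it easy to adjust batch size per machine without touching the transcription call.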
Weaknesses
- English-only: accuracy on non-English audio degrades sharply
- Shallow 2-layer decoder, so rare-word recovery is slightly worse than the teacher's
- Still inherits Whisper's silence-hallucination behavior
- Needs a GPU or Apple Silicon for true real-time transcription; CPU latency lags behind Parakeet
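One common way to limit silence hallucination is to skip near-silent audio before it reaches the model. The sketch below uses a simple RMS energy gate as a stand-in for a proper voice-activity detector; it is not something the model card prescribes, the 0.01 threshold is an assumption, and `rms` and `drop_silent_chunks` are hypothetical helper names.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of samples scaled to [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def drop_silent_chunks(
    chunks: List[Sequence[float]], threshold: float = 0.01
) -> List[Sequence[float]]:
    """Keep only chunks whose RMS energy reaches the (assumed) threshold."""
    return [c for c in chunks if rms(c) >= threshold]


# Toy input: 10 ms of silence, 10 ms of a 440 Hz tone, 10 ms of silence (16 kHz).
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
kept = drop_silent_chunks([silence, tone, silence])
```

Only the tone chunk survives the gate here. Real recordings have background noise, so the threshold would need tuning per source, and a trained VAD is usually the more robust choice.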
Quantization variants
Each quantization level trades some model quality for a smaller file and lower VRAM use. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 0.4 GB | 1 GB |
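The file size in the table can be sanity-checked from the parameter count. This is a back-of-the-envelope sketch: the 4.5 bits per weight figure is an assumption for a 4-bit scheme with per-block overhead, file metadata is ignored, and `estimated_file_size_gb` is a hypothetical helper.

```python
PARAMS = 756_000_000  # parameter count quoted at the top of the page


def estimated_file_size_gb(params: int, bits_per_weight: float) -> float:
    """Approximate weight-file size in decimal gigabytes (metadata ignored)."""
    return params * bits_per_weight / 8 / 1e9


fp16_gb = estimated_file_size_gb(PARAMS, 16.0)
q4_gb = estimated_file_size_gb(PARAMS, 4.5)  # 4.5 bits/weight is an assumption
print(f"fp16: {fp16_gb:.2f} GB, ~4.5-bit: {q4_gb:.2f} GB")
```

Under these assumptions the 4-bit estimate lands near 0.43 GB, in line with the 0.4 GB listed above, against roughly 1.5 GB for unquantized fp16 weights. The VRAM column is higher than the file size because activations and runtime buffers also need memory.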
Get the model
Hugging Face
Original weights
Source repository. You need to quantize these weights yourself.
Hardware that runs this
Cards with enough VRAM for at least one quantization of Distil-Whisper Large v3.
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Frequently asked
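What's the minimum VRAM to run Distil-Whisper Large v3?
About 1 GB, using the Q4_K_M quantization (0.4 GB file) listed in the quantization table.
Can I use Distil-Whisper Large v3 commercially?
Yes. The model is released under the MIT license, which permits commercial use.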
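What's the context length of Distil-Whisper Large v3?
Whisper-family models take audio in 30-second windows rather than a token context in the LLM sense; longer recordings are handled by the chunked or sequential long-form algorithms.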
Source: huggingface.co/distil-whisper/distil-large-v3
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.