Piper
VITS-based neural TTS optimized for Raspberry Pi-class hardware. Ships as ONNX checkpoints with ~100 voices across 30+ languages. Powers Home Assistant's local voice stack and is the de facto open TTS for embedded devices.
Not the prettiest voice, but unbeaten on footprint and language breadth at the edge. If it runs on a Pi and needs to speak, it should probably be Piper.
Strengths
- Runs realtime on Raspberry Pi 4 / Pi Zero 2 — true edge deployment
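- Around 100 voices across 30+ languages, published under an MIT license; individual voices may also carry terms from their training datasets, so check each voice's model card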
- Models run on ONNX Runtime, which can be called from C/C++/Rust/Python apps, so embedding is straightforward; text is phonemized with espeak-ng before inference
- Battle-tested in Home Assistant and offline accessibility deployments
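To make the "realtime" claim above checkable on your own board, here is a minimal Python sketch that runs the Piper command-line tool once and reports the real-time factor (synthesis time divided by audio length; below 1.0 means faster than realtime). It assumes a `piper` executable is on the PATH and accepts `--model` and `--output_file` with text on stdin; the helper names and the voice filename in the usage note are illustrative, not part of Piper.

```python
import shutil
import subprocess
import time
import wave


def build_piper_command(model_path: str, output_wav: str) -> list[str]:
    """Return the argv for a one-shot Piper run that reads text from stdin.

    Assumption: the installed CLI accepts --model and --output_file.
    """
    return ["piper", "--model", model_path, "--output_file", output_wav]


def wav_duration_seconds(wav_path: str) -> float:
    """Length of the audio stored in a WAV file, in seconds."""
    with wave.open(wav_path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Synthesis time divided by audio length; below 1.0 is faster than realtime."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def synthesize_and_time(text: str, model_path: str, output_wav: str) -> float:
    """Run Piper once and return the measured real-time factor.

    The timing includes process start-up and model loading, so short
    inputs will look slower than steady-state synthesis really is.
    """
    if shutil.which("piper") is None:
        raise FileNotFoundError("piper executable not found on PATH")
    start = time.perf_counter()
    subprocess.run(
        build_piper_command(model_path, output_wav),
        input=text.encode("utf-8"),
        check=True,
    )
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, wav_duration_seconds(output_wav))
```

With a voice file downloaded, a call such as `synthesize_and_time("Hello from the edge.", "en_US-lessac-medium.onnx", "out.wav")` returns the factor for that run. Use a paragraph-length input for a fairer measurement, since model loading dominates very short runs.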
Weaknesses
- Per-voice models are small VITS networks — quality lags Kokoro/XTTS noticeably
- No voice cloning; voices are baked at training time
- Limited prosody and emotion control
- Robotic-sounding on long sentences and complex punctuation
Quantization variants
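GGUF-style quantization labels such as Q4_K_M do not apply to Piper. Each voice ships as an ONNX model with a matching JSON config, in one of four quality tiers that trade audio quality for file size and inference speed. Inference runs on the CPU, so VRAM is not the limiting resource; voice files are typically tens of megabytes.
| Quality tier | Sample rate | Approx. parameters |
|---|---|---|
| x_low | 16 kHz | 5-7M |
| low | 16 kHz | 15-20M |
| medium | 22.05 kHz | 15-20M |
| high | 22.05 kHz | 28-32M |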
Get the model
HuggingFace (huggingface.co/rhasspy/piper-voices): original weights in the source repository. The ONNX voice files run as distributed; no quantization step is needed.
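As a rough guide to fetching a voice, the sketch below builds download URLs for a voice's model and config files. It assumes the rhasspy/piper-voices repository is laid out as `<language>/<locale>/<name>/<quality>/<locale>-<name>-<quality>.onnx` with a sibling `.onnx.json` config, and that files resolve under `/resolve/main`. Both are assumptions to confirm against the repository's file listing, and `VoiceId` and `voice_urls` are hypothetical helper names.

```python
from dataclasses import dataclass

# Assumption: files are served from the main branch of the voices repository.
BASE_URL = "https://huggingface.co/rhasspy/piper-voices/resolve/main"


@dataclass(frozen=True)
class VoiceId:
    locale: str   # e.g. "en_US"
    name: str     # e.g. "lessac"
    quality: str  # e.g. "medium"

    @property
    def stem(self) -> str:
        """File stem shared by the model and its config."""
        return f"{self.locale}-{self.name}-{self.quality}"


def voice_urls(voice: VoiceId) -> tuple[str, str]:
    """Return (model_url, config_url) under the assumed repository layout."""
    language = voice.locale.split("_")[0]  # "en_US" -> "en"
    folder = f"{BASE_URL}/{language}/{voice.locale}/{voice.name}/{voice.quality}"
    model_url = f"{folder}/{voice.stem}.onnx"
    return model_url, f"{model_url}.json"
```

Both files are needed at runtime: the `.onnx` file holds the network and the `.onnx.json` file holds the voice's settings. Keeping them side by side with matching names is the safest arrangement.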
Source: huggingface.co/rhasspy/piper-voices
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.