Qwen 3 0.6B
Qwen3-0.6B is the smallest dense model in Alibaba's Qwen3 generation, supporting a 40K-token context and dual-mode operation that toggles between explicit reasoning ('think') and fast direct response. It is post-trained for instruction following, tool calling, and multilingual chat across 100+ languages.
The new default for 'I need a chatbot that fits in a browser tab.' Qwen3-0.6B is the most downloaded SLM on Hugging Face for a reason: Apache-2.0, 40K context, working tool-call template, and a real reasoning toggle in a 1.2GB footprint.
Overview
Qwen3-0.6B is the smallest dense model in Alibaba's Qwen3 generation, supporting a 40K-token context and dual-mode operation that toggles between explicit reasoning ('think') and fast direct response. It is post-trained for instruction following, tool calling, and multilingual chat across 100+ languages.
Strengths
- Apache-2.0 license clears all commercial deployment
- 40K native context is unusually large for a sub-1B model
- Hybrid thinking/non-thinking mode lets you trade latency for reasoning quality
- Massive HF adoption (~19M downloads) means broad GGUF/MLX/ONNX coverage
Weaknesses
- 0.6B parameters caps factual recall and complex reasoning vs. 2-3B peers
- Thinking-mode traces can blow your token budget on edge devices
- No vision or audio modality
- Tokenizer is heavy (151K vocab) for such a small model
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 0.3 GB | 1 GB |
Get the model
HuggingFace
Original weights
Source repository — direct quantization required.
Hardware that runs this
Cards with enough VRAM for at least one quantization of Qwen 3 0.6B.
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Frequently asked
What's the minimum VRAM to run Qwen 3 0.6B?
Can I use Qwen 3 0.6B commercially?
What's the context length of Qwen 3 0.6B?
Source: huggingface.co/Qwen/Qwen3-0.6B
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.
Related — keep moving
Verify Qwen 3 0.6B runs on your specific hardware before committing money.