SmolLM2 360M Instruct
SmolLM2-360M-Instruct is the middle tier of the SmolLM2 instruct family, a 360M-parameter Llama-architecture model with an 8K context. It is shipped with ONNX and Transformers.js artifacts and aimed at on-device assistants that need more capability than the 135M can deliver.
A sensible step up from 135M when you have a bit more silicon. The most defensible choice when you need to fine-tune a tiny model on private data and ship it on hardware without a GPU.
Overview
SmolLM2-360M-Instruct is the middle tier of the SmolLM2 instruct family, a 360M-parameter Llama-architecture model with an 8K context. It is shipped with ONNX and Transformers.js artifacts and aimed at on-device assistants that need more capability than the 135M can deliver.
Strengths
- Roughly 2-3x more useful than the 135M for the same deployment class
- Apache-2.0, fully open training pipeline
- Multiple quantizations (q4f16, q8, bnb4) prebuilt in the repo
- Tiny enough to fit on a Raspberry Pi 5 with headroom
Weaknesses
- Still trails Qwen3-0.6B on most benchmarks despite similar size
- 8K context, no GQA optimizations beyond stock Llama
- Limited community fine-tunes compared to Qwen/Llama tiers
- No native tool-use template
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 0.2 GB | 1 GB |
Get the model
HuggingFace
Original weights
Source repository — direct quantization required.
Hardware that runs this
Cards with enough VRAM for at least one quantization of SmolLM2 360M Instruct.
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Frequently asked
What's the minimum VRAM to run SmolLM2 360M Instruct?
Can I use SmolLM2 360M Instruct commercially?
What's the context length of SmolLM2 360M Instruct?
Source: huggingface.co/HuggingFaceTB/SmolLM2-360M-Instruct
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.
Related — keep moving
Verify SmolLM2 360M Instruct runs on your specific hardware before committing money.