The frontier of open-weight model releases
Open-weight model releases tracked by RunLocalAI: recent additions, rising families, distillation chains, and multimodal and reasoning waves. Each card links into the catalog and carries authority badges (L1.25 enriched · benchmark-backed · verdict), so you can scan editorial coverage at a glance.
Filtered results (35)
Qwen 3.6 35B-A3B (MTP)
high-throughput MoE inference at workstation tier
Qwen 3.6 27B (MTP)
dense workstation model with throughput acceleration
Qwen 3.5 235B-A17B (MoE)
frontier-tier reasoning + multilingual serving on multi-machine clusters
Qwen 3 Coder 32B
coding-specialized agent workloads
Qwen 3 7B
consumer-tier reasoning on 8GB+ GPUs
Qwen 3 Embedding 8B
permissively-licensed embeddings at 8B
Qwen 3 235B-A22B
Qwen 3 MoE flagship — pre-3.5 baseline
Qwen 3 32B
general-purpose reasoning + chat with toggle-style reasoning emission
Qwen 3 30B-A3B
workstation MoE — 3B active, 30B total
Qwen 3 14B
16GB-VRAM reasoning workloads with thinking-mode toggle
Qwen 3 8B
consumer-tier reasoning toggle
Qwen 3 4B
edge-tier Qwen 3 — Apple Silicon laptop friendly
QwQ 32B Preview
workstation-tier reasoning — Qwen team alternative to R1
Qwen 2.5 Coder 32B Instruct
single-user autonomous coding agents on RTX 4090 / 5090 / dual-A100 hardware
Qwen 2.5 Coder 14B Instruct
16GB-VRAM coding
Qwen 2.5 Coder 7B Instruct
consumer-tier coding at 8GB VRAM
Qwen 2.5 Coder 3B
Apple Silicon laptop coding autocomplete
Qwen 2.5 Coder 1.5B
IDE autocomplete on integrated GPUs
Qwen 2.5 72B Instruct
production multilingual serving at 70B class
Qwen 2.5 Math 72B
datacenter-tier math specialist
Qwen 2.5 32B Instruct
workstation-tier multilingual general chat
Qwen 2.5 14B Instruct
16GB-VRAM general chat with multilingual depth
Qwen 2.5 7B Instruct
consumer-tier multilingual chat
Qwen 2.5 Math 7B
consumer-tier math problem solving
Qwen 2.5 3B Instruct
edge-tier Qwen 2.5 chat
Qwen 2.5 1.5B Instruct
edge-tier Apache 2.0 chat
Qwen 2.5 0.5B Instruct
phone-tier Qwen baseline
CodeQwen 1.5 7B
historical reference — Qwen 2.5 Coder 7B is the modern pick
Qwen3 Swallow 32B RL v0.2
Japanese-English reasoning tasks: math, coding, structured analysis
Qwen3.5 9B Thai Law Base
foundation for fine-tuning Thai legal NLP tools
Qwen2-VL 2B Instruct
lightweight document and chart understanding on a consumer GPU
Qwen 3.5 2B Turkish SFT
Qwen 3 1.7B
edge laptop assistant with reasoning that fits in 2GB VRAM
Qwen3 0.6B Hindi Instruct v1 GGUF
simple Hindi instruction following on CPU-only devices
Qwen 3 0.6B
sub-1B on-device chat and tool-calling agent on phones
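The VRAM tiers named in the cards above (2GB, 8GB, 16GB) can be sanity-checked with a back-of-the-envelope estimate of weight size. The following is a minimal sketch, assuming weights dominate memory use and ignoring KV cache, activations, and runtime overhead; the function name and the chosen bit widths are illustrative and do not come from the catalog.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for model weights only, in GiB.

    Assumption: every parameter is stored at the same bit width.
    Real quantized files mix widths and add metadata, so treat this
    as a lower-bound estimate, not a measurement.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # For an MoE model, all expert weights typically need to be resident,
    # so the total parameter count (not the active count) drives memory.
    examples = [("7B dense", 7.0), ("14B dense", 14.0), ("30B-total MoE", 30.0)]
    for name, params in examples:
        four_bit = weight_memory_gib(params, 4)
        sixteen_bit = weight_memory_gib(params, 16)
        print(f"{name}: ~{four_bit:.1f} GiB at 4-bit, ~{sixteen_bit:.1f} GiB at 16-bit")
```

Actual headroom depends on context length and the runtime, so a model whose weights fit on paper may still need a smaller quantization or a shorter context in practice.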
Going deeper
- Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
- Execution stacks — recipes that combine models with runtimes + hardware.
- Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
- Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.
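One way to picture a benchmark entry keyed by a hardware/model/runtime triple is sketched below. This is a hypothetical record shape, not the catalog's actual schema: the class name, field names, the `gpu_count` stand-in for a topology field, and the sample values are all placeholders chosen for illustration rather than real measurements.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BenchmarkRecord:
    """Hypothetical benchmark row: one hardware/model/runtime triple."""
    hardware: str
    model: str
    runtime: str
    tokens_per_second: float
    gpu_count: int = 1  # assumed stand-in for a "topology" field


def best_runtime(
    records: Iterable[BenchmarkRecord], hardware: str, model: str
) -> Optional[BenchmarkRecord]:
    """Return the fastest record for a hardware/model pair, or None if absent."""
    matches = [r for r in records if r.hardware == hardware and r.model == model]
    return max(matches, key=lambda r: r.tokens_per_second, default=None)


# Placeholder rows with synthetic numbers, only to exercise the lookup.
records = [
    BenchmarkRecord("gpu-a", "model-x", "runtime-1", 10.0),
    BenchmarkRecord("gpu-a", "model-x", "runtime-2", 20.0),
    BenchmarkRecord("gpu-b", "model-x", "runtime-1", 15.0, gpu_count=2),
]

if __name__ == "__main__":
    top = best_runtime(records, "gpu-a", "model-x")
    print(top.runtime if top else "no data")
```

Keeping the triple as explicit fields makes it straightforward to compare runtimes on fixed hardware, or hardware on a fixed runtime, without reshaping the data.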