The frontier of open-weight model releases
Open-weight model releases tracked by RunLocalAI: recent additions, rising families, distillation chains, and multimodal and reasoning waves. Each card links into the catalog and carries authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.
Filtered results (8)
- Qwen 3.6 35B-A3B (MTP): high-throughput MoE inference at workstation tier
- Qwen 3.6 27B (MTP): dense workstation model with throughput-acceleration
- Qwen 3 Coder 32B: coding-specialized agent workloads
- Qwen 3 32B: general-purpose reasoning + chat with toggle-style reasoning emission
- Qwen 3 30B-A3B: workstation MoE (3B active, 30B total)
- QwQ 32B Preview: workstation-tier reasoning; Qwen team alternative to R1
- Qwen 2.5 Coder 32B Instruct: single-user autonomous coding agents on RTX 4090 / 5090 / dual-A100 hardware
- Qwen 2.5 32B Instruct: workstation-tier multilingual general chat
Going deeper
- Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
- Execution stacks — recipes that combine models with runtimes + hardware.
- Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
- Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.