The frontier of open-weight model releases
RunLocalAI tracks open-weight model releases: recent additions, rising families, distill chains, and multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.
Filtered results (22)
Qwen 3.6 35B-A3B (MTP)
high-throughput MoE inference at workstation tier
Qwen 3.6 27B (MTP)
dense workstation model with throughput acceleration
OLMo 2 32B
fully-open AI2 OLMo 2 — research provenance flagship
Gemma 4 26B MoE
Gemma 4 MoE — workstation efficiency variant
DeepSeek Coder V3
workstation coding alternative to Qwen 2.5 Coder
Nemotron 3 Super 49B
32GB-VRAM enterprise deployments
Magistral 32B
research / non-commercial reasoning at 32B scale
Qwen 3 Coder 32B
coding-specialized agent workloads
DeepSeek R1 Distill Qwen 3 32B
workstation reasoning with Qwen 3 base improvements
EXAONE 3.5 32B
Korean / Japanese / CJK workloads
MedGemma 27B
medical-domain fine-tune of Gemma 3 27B
Qwen 3 30B-A3B
workstation MoE — 3B active, 30B total
QwQ 32B Preview
workstation-tier reasoning — Qwen team alternative to R1
Aya Expanse 32B
research / non-commercial multilingual workflows
Qwen 2.5 32B Instruct
workstation-tier multilingual general chat
Jamba 1.5 Mini
workstation long-context with hybrid SSM throughput
Codestral 22B
workstation coding at 22B class
Aya 23 35B
multilingual research at workstation tier
Yi 1.5 34B
workstation-tier multilingual
Command R 35B
workstation-tier RAG-tuned
Mixtral 8x7B Instruct
workstation MoE — 13B active, 47B total
Phind CodeLlama 34B v2
historical reference for Llama 2 coder lineage
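Several entries above describe mixture-of-experts models by an active/total parameter split (for example "3B active, 30B total" for Qwen 3 30B-A3B). The sketch below shows one way to read that split out of a model name and compute the share of parameters active per token. It assumes the name follows the "<total>B-A<active>B" pattern seen in the Qwen entries; the helper names `parse_moe_label` and `active_fraction` are hypothetical and are not part of any RunLocalAI tooling.

```python
import re
from typing import Optional, Tuple

# Assumed naming pattern: "<total>B-A<active>B", e.g. "30B-A3B".
_MOE_LABEL = re.compile(r"(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B")


def parse_moe_label(name: str) -> Optional[Tuple[float, float]]:
    """Return (total_b, active_b) in billions, or None if no such label is present."""
    match = _MOE_LABEL.search(name)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's parameters that are active for each token."""
    return active_b / total_b


if __name__ == "__main__":
    # Names taken from the list above; the last one is dense and has no MoE label.
    for name in ("Qwen 3 30B-A3B", "Qwen 3.6 35B-A3B (MTP)", "Qwen 2.5 32B Instruct"):
        parsed = parse_moe_label(name)
        if parsed is None:
            print(f"{name}: no active/total label in the name")
        else:
            total_b, active_b = parsed
            print(f"{name}: {active_fraction(total_b, active_b):.1%} of parameters active")

    # Mixtral 8x7B Instruct states its split in the tagline instead of the name.
    print(f"Mixtral 8x7B Instruct: {active_fraction(47.0, 13.0):.1%} of parameters active")
```

A model with a low active fraction still has to hold all of its weights in memory, so the total figure is the one to compare against available VRAM, while the active figure is a rough guide to per-token compute.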
Going deeper
- Ecosystem maps: structured landscape views (memory frameworks, inference runtimes, MCP, coding agents).
- Execution stacks: recipes that combine models with runtimes and hardware.
- Frontier index: a broader view of ecosystem momentum across coding agents, inference runtimes, memory systems, and MCP.
- Benchmarks: measured tokens per second and topology fields for each hardware/model/runtime triple.