The frontier of open-weight model releases

Open-weight model releases tracked by RunLocalAI — recent additions, rising families, distill chains, multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.

By Fredoline Eruo · Refreshed continuously from catalog seed

Filtered results (8)

Qwen 3.6 35B-A3B (MTP)

Alibaba / Qwen team · 2026-05-11

35B / 3B active · workstation

high-throughput MoE inference at workstation tier

Verdict

Qwen 3.6 27B (MTP)

Alibaba / Qwen team · 2026-05-11

27B · workstation

dense workstation model with throughput acceleration

Verdict

Qwen 3 Coder 32B

Alibaba · 2025-11-20

32B · workstation

coding-specialized agent workloads

Verdict

Qwen 3 32B

Alibaba · 2025-04-29

32B · workstation

general-purpose reasoning and chat, with reasoning output that can be toggled on or off

L1.25 enriched · Verdict

Qwen 3 30B-A3B

Alibaba · 2025-04-29

30B · workstation

workstation MoE — 3B active, 30B total

Verdict

QwQ 32B Preview

Alibaba · 2024-11-27

32B · workstation

workstation-tier reasoning — Qwen team alternative to R1

Verdict

Qwen 2.5 Coder 32B Instruct

Alibaba · 2024-11-12

32B · workstation

single-user autonomous coding agents on RTX 4090 / 5090 / dual-A100 hardware

L1.25 enriched · Verdict

Qwen 2.5 32B Instruct

Alibaba · 2024-09-19

32B · workstation

workstation-tier multilingual general chat

Verdict
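
Several cards above list two sizes for MoE models (for example, 3B active and 30B total). As a rough sketch of why both numbers matter: weight storage scales with total parameters, while per-token compute scales with active parameters. The function names, the 4-bit quantization level, and the "2 FLOPs per active parameter" rule of thumb below are illustrative assumptions, not figures from the catalog, and the estimate ignores KV cache and runtime overhead.

```python
def weight_memory_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal GB for a model with
    total_params_b billion parameters (weights only, no KV cache)."""
    return total_params_b * bits_per_weight / 8


def gflops_per_token(active_params_b: float) -> float:
    """Rule-of-thumb forward-pass cost per token in GFLOPs,
    assuming about 2 FLOPs per active parameter."""
    return 2 * active_params_b


# Hypothetical comparison at an assumed 4 bits per weight.
models = {
    "MoE, 30B total / 3B active": (30.0, 3.0),
    "Dense, 32B": (32.0, 32.0),
}

for name, (total_b, active_b) in models.items():
    mem = weight_memory_gb(total_b, bits_per_weight=4)
    cost = gflops_per_token(active_b)
    print(f"{name}: ~{mem:.0f} GB weights, ~{cost:.0f} GFLOPs/token")
```

Under these assumptions the two models need similar weight memory, but the MoE model does roughly a tenth of the per-token compute, which is consistent with MoE entries being described in terms of throughput.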

Going deeper

Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
Execution stacks — recipes that combine models with runtimes + hardware.
Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.