The frontier of open-weight model releases

Open-weight model releases tracked by RunLocalAI — recent additions, rising families, distill chains, multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.

By Fredoline Eruo · Refreshed continuously from catalog seed

Filtered results (9)

Qwen 3.5 235B-A17B (MoE)

Alibaba · 2026-05-01

397B/17B-A · frontier

frontier-tier reasoning + multilingual serving on multi-machine clusters

L1.25 enriched · Verdict

Qwen 3 32B

Alibaba · 2025-04-29

32B · workstation

general-purpose reasoning + chat with toggle-style reasoning emission

L1.25 enriched · Verdict

Qwen 3 14B

Alibaba · 2025-04-29

14B · consumer

16GB-VRAM reasoning workloads with thinking-mode toggle

L1.25 enriched · Benchmark · Verdict

Qwen 2.5 Coder 32B Instruct

Alibaba · 2024-11-12

32B · workstation

single-user autonomous coding agents on RTX 4090 / 5090 / dual-A100 hardware

L1.25 enriched · Verdict

Qwen2-VL 2B Instruct

Alibaba

Lightweight document and chart understanding on a consumer GPU

L1.25 enriched · Verdict

Qwen 3.5 2B Turkish SFT

Tuguberk

L1.25 enriched

Qwen 3 1.7B

Alibaba

1.7B

Edge laptop assistant with reasoning that fits in 2GB VRAM

L1.25 enriched · Verdict

Qwen3 0.6B Hindi Instruct v1 GGUF

pankajpandey-dev

0.6B

Simple Hindi instruction following on CPU-only devices

L1.25 enriched · Verdict

Qwen 3 0.6B

Alibaba

0.6B

Sub-1B on-device chat and tool-calling agent on phones

L1.25 enriched · Verdict
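
Several cards above pair a parameter count with a memory budget (14B for 16GB-VRAM workloads, 1.7B in 2GB VRAM). As a rough sanity check, and not RunLocalAI's own sizing method, weight memory is approximately parameter count times bits per weight divided by 8. The sketch below assumes 4-bit quantized weights and a hypothetical 20% allowance for KV cache and runtime overhead; the function names and the overhead figure are illustrative assumptions, and real usage varies with context length and runtime.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    params_billion * 1e9 weights * (bits_per_weight / 8) bytes / 1e9 bytes per GB.
    """
    return params_billion * bits_per_weight / 8


def fits_in_vram(
    params_billion: float,
    bits_per_weight: float,
    vram_gb: float,
    overhead_fraction: float = 0.2,  # assumed allowance, not a measured value
) -> bool:
    """Return True if the weights plus the assumed overhead fit in vram_gb."""
    needed = weight_memory_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= vram_gb


if __name__ == "__main__":
    # Parameter counts and VRAM budgets taken from the cards above;
    # the 4-bit quantization level is an assumption.
    for name, params, vram in [
        ("Qwen 3 14B", 14, 16),
        ("Qwen 3 1.7B", 1.7, 2),
    ]:
        gb = weight_memory_gb(params, 4)
        print(f"{name}: ~{gb:.2f} GB of 4-bit weights, fits in {vram} GB: "
              f"{fits_in_vram(params, 4, vram)}")
```

Under these assumptions both models fit their stated budgets at 4-bit, while the same 14B model at 16-bit weights would not fit in 16 GB. Treat the result as a first-pass filter; the measured Benchmarks entries are the better guide.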

Going deeper

Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
Execution stacks — recipes that combine models with runtimes + hardware.
Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.