RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Frontier zone · Model releases

The frontier of open-weight model releases

Open-weight model releases tracked by RunLocalAI — recent additions, rising families, distill chains, multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.

By Fredoline Eruo · Refreshed continuously from catalog seed
Filter
  • Family: Any, Qwen, Llama, DeepSeek, Mistral, Gemma, Phi, GLM, OLMo
  • Deployment: Any, Edge, Consumer, Workstation, Datacenter, Frontier
  • Modality: Any, Multimodal, Text-only
  • Coverage: Any, L1.25 enriched, Needs L1.25, Needs benchmark
What this surface tracks
  • Recent releases — sorted by release date (newest first). Catalog entries lacking release dates are excluded.
  • Reasoning models — DeepSeek R1 family, QwQ, Kimi, Magistral, Qwen 3 reasoning-toggle.
  • Coding models — Qwen Coder, DeepSeek Coder, Codestral, Devstral, OpenCoder, CodeLlama, Yi Coder.
  • Multimodal — Llama Vision, Qwen-VL, Pixtral, Janus, Phi multimodal, MiniCPM-V, Moondream, Gemma 3 IT.
  • MoE — DeepSeek V3/V4, Qwen 3 MoE, Llama 4 Maverick, Mixtral, Hunyuan, Step-3.
  • Edge / phone tier — sub-4B models for embedded / edge / phone deployments.
  • Enrichment gaps — published catalog entries with no L1.25 / verdict / benchmark; the OPERATOR queue.
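The "Recent releases" rule above (newest first, undated entries excluded) can be sketched in a few lines. This is only an illustration: the `CatalogEntry` record, its field names, and the `recent_releases` helper are hypothetical and are not the site's actual schema or code. The default limit of 12 mirrors the "12 newest" list on this page.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CatalogEntry:
    # Hypothetical record shape; these field names are assumptions.
    name: str
    release_date: Optional[date]  # None when the catalog has no release date


def recent_releases(entries: List[CatalogEntry], limit: int = 12) -> List[CatalogEntry]:
    """Return entries sorted newest first, skipping entries without a date."""
    dated = [e for e in entries if e.release_date is not None]
    return sorted(dated, key=lambda e: e.release_date, reverse=True)[:limit]


entries = [
    CatalogEntry("Qwen 3.6 27B (MTP)", date(2026, 5, 11)),
    CatalogEntry("Undated example", None),  # placeholder entry with no date
    CatalogEntry("Ring-2.6-1T", date(2026, 5, 14)),
    CatalogEntry("Llama 4 Scout", date(2026, 4, 5)),
]

print([e.name for e in recent_releases(entries, limit=2)])
# ['Ring-2.6-1T', 'Qwen 3.6 27B (MTP)']
```

Filtering before sorting keeps the sort key simple, since `None` cannot be compared with a `date`.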

Recent releases (12 newest)

Catalog entries with the most recent release dates. Use the authority badges to spot which have full editorial coverage (L1.25 enriched + benchmark) and which are catalog-only.

Ring-2.6-1T

InclusionAI / Ant Group · 2026-05-14
1000B/32B-A · frontier

frontier reasoning at MoE serving cost

Verdict

Qwen 3.6 35B-A3B (MTP)

Alibaba / Qwen team · 2026-05-11
35B/3B-A · workstation

high-throughput MoE inference at workstation tier

Verdict

Qwen 3.6 27B (MTP)

Alibaba / Qwen team · 2026-05-11
27B · workstation

dense workstation model with throughput-acceleration

Verdict

Qwen 3.5 235B-A17B (MoE)

Alibaba · 2026-05-01
397B/17B-A · frontier

frontier-tier reasoning + multilingual serving on multi-machine clusters

L1.25 enriched · Verdict

Mistral Medium 3.5 (675B MoE)

Mistral AI · 2026-04-29
675B/41B-A · frontier

frontier MoE — Mistral's response to the open MoE wave

Verdict

Mistral Medium 3 24B (dense)

Mistral AI · 2026-04-29
24B · consumer

research / non-commercial workstation deployments

Verdict

DeepSeek V4 Pro (1.6T MoE)

DeepSeek · 2026-04-24
1600B/49B-A · frontier

frontier-tier coding + reasoning serving — currently the open-weight ceiling

L1.25 enriched · Verdict

DeepSeek V4 Flash (284B MoE)

DeepSeek · 2026-04-24
284B/13B-A · datacenter

datacenter MoE — V4 efficiency variant

Verdict

OLMo 2 32B

AI2 (Allen AI) · 2026-04-12
32B · workstation

fully-open AI2 OLMo 2 — research provenance flagship

Verdict

Phi-4 Reasoning Mini 4B

Microsoft · 2026-04-08
3.8B · edge

edge-tier reasoning

Verdict

Llama 4 Maverick

Meta · 2026-04-05
400B · frontier

frontier-tier multimodal serving on multi-machine clusters

L1.25 enriched · Verdict · Multimodal

Llama 4 Scout

Meta · 2026-04-05
109B · datacenter

production multimodal serving — image + text at workstation-cluster scale

L1.25 enriched · Verdict

New reasoning models

Models with explicit thinking-block emission: DeepSeek R1 family, QwQ, Kimi, Magistral, Qwen 3 reasoning-mode. See /stacks/local-reasoning-model for the canonical deployment recipe.

Kimi K2.6

Moonshot AI · 2026-03-10
1000B · frontier

Moonshot frontier MoE — long-context specialist

Verdict

Magistral 32B

Mistral AI · 2025-12-15
32B · workstation

research / non-commercial reasoning at 32B scale

Verdict

Kimi K1.5

Moonshot AI · 2025-12-01
200B · datacenter

deep math + reasoning research

Verdict

Qwen 3 Coder 32B

Alibaba · 2025-11-20
32B · workstation

coding-specialized agent workloads

Verdict

DeepSeek R1 Distill Qwen 3 32B

DeepSeek AI · 2025-11-15
32B · workstation

workstation reasoning with Qwen 3 base improvements

Verdict

Qwen 3 235B-A22B

Alibaba · 2025-04-29
235B · frontier

Qwen 3 MoE flagship — pre-3.5 baseline

Verdict

New coding models

Coding-specialized fine-tunes. The Qwen Coder lineage is the current open-weight benchmark leader; DeepSeek Coder V3, Codestral, Devstral, and OpenCoder are the credible alternatives. See /stacks/local-coding-agent for the canonical deployment recipe.

DeepSeek Coder V3

DeepSeek AI · 2026-02-08
33B · workstation

workstation coding alternative to Qwen 2.5 Coder

Verdict

Devstral Small 2 24B

Mistral AI · 2025-09-25
24B · consumer

Apache 2.0 coding alternative to Qwen 2.5 Coder

Verdict

Yi Coder 9B

01.AI · 2025-09-20
9B · consumer

8GB-VRAM coding

Verdict

Qwen 2.5 Coder 32B Instruct

Alibaba · 2024-11-12
32B · workstation

single-user autonomous coding agents on RTX 4090 / 5090 / dual-A100 hardware

L1.25 enriched · Verdict

Qwen 2.5 Coder 14B Instruct

Alibaba · 2024-11-12
14B · consumer

16GB-VRAM coding

Benchmark · Verdict

Qwen 2.5 Coder 7B Instruct

Alibaba · 2024-11-12
7B · consumer

consumer-tier coding at 8GB VRAM

Verdict

New multimodal models

Vision-language models. The 2025-2026 wave: Llama 4 Scout / Maverick, Qwen 2.5-VL, Pixtral, Janus-Pro, Phi-4 Multimodal. See /stacks/local-vision-model for the canonical deployment recipe.

Llama 4 Maverick

Meta · 2026-04-05
400B · frontier

frontier-tier multimodal serving on multi-machine clusters

L1.25 enriched · Verdict · Multimodal

Gemma 4 31B Dense

Google · 2026-04-02
31B · workstation

workstation-tier multilingual chat with permissive license

L1.25 enriched · Verdict · Multimodal

Gemma 4 26B MoE

Google · 2026-04-02
26B · workstation

Gemma 4 MoE — workstation efficiency variant

Verdict · Multimodal

Gemma 4 E4B (Effective 4B)

Google · 2026-04-02
4B · edge

edge-tier Gemma 4 — laptop friendly

Benchmark · Verdict · Multimodal

Gemma 4 E2B (Effective 2B)

Google · 2026-04-02
2B · edge

phone-tier Gemma 4

Benchmark · Verdict · Multimodal

Phi-4 Multimodal

Microsoft · 2026-02-25
14B · consumer

16GB-consumer multimodal Q&A

Verdict · Multimodal

New MoE models

Mixture-of-Experts releases. Active-parameter efficiency shapes the deployment economics. See /systems/distributed-inference for the architectural depth.
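To make the active-parameter point concrete, here is a minimal sketch that reads the size labels used on the cards (for example `1000B/32B-A`, total over active) and computes the share of parameters that are active. The function names are hypothetical, and treating a label without an `-A` part as fully dense is an assumption: some cards may simply omit the active count.

```python
import re
from typing import Tuple

# Matches labels like "1000B/32B-A" (total/active) or "27B" (total only).
_SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)B(?:/(\d+(?:\.\d+)?)B-A)?")


def parse_size_label(label: str) -> Tuple[float, float]:
    """Return (total_b, active_b) in billions of parameters.

    Assumption: a label with no active part is treated as dense,
    so active equals total.
    """
    m = _SIZE_RE.fullmatch(label.strip())
    if m is None:
        raise ValueError(f"unrecognized size label: {label!r}")
    total = float(m.group(1))
    active = float(m.group(2)) if m.group(2) else total
    return total, active


def active_fraction(label: str) -> float:
    """Share of parameters active per token, under the assumption above."""
    total, active = parse_size_label(label)
    return active / total


for label in ("1000B/32B-A", "35B/3B-A", "1600B/49B-A", "27B"):
    total, active = parse_size_label(label)
    print(f"{label}: {active:g}B of {total:g}B active ({active_fraction(label):.1%})")
```

The total count is a rough proxy for how much memory the weights need, while the active count is a rough proxy for per-token compute. The labels in the loop are taken from the cards on this page.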

Ring-2.6-1T

InclusionAI / Ant Group · 2026-05-14
1000B/32B-A · frontier

frontier reasoning at MoE serving cost

Verdict

Qwen 3.6 35B-A3B (MTP)

Alibaba / Qwen team · 2026-05-11
35B/3B-A · workstation

high-throughput MoE inference at workstation tier

Verdict

Qwen 3.5 235B-A17B (MoE)

Alibaba · 2026-05-01
397B/17B-A · frontier

frontier-tier reasoning + multilingual serving on multi-machine clusters

L1.25 enriched · Verdict

Mistral Medium 3.5 (675B MoE)

Mistral AI · 2026-04-29
675B/41B-A · frontier

frontier MoE — Mistral's response to the open MoE wave

Verdict

DeepSeek V4 Pro (1.6T MoE)

DeepSeek · 2026-04-24
1600B/49B-A · frontier

frontier-tier coding + reasoning serving — currently the open-weight ceiling

L1.25 enriched · Verdict

DeepSeek V4 Flash (284B MoE)

DeepSeek · 2026-04-24
284B/13B-A · datacenter

datacenter MoE — V4 efficiency variant

Verdict

New edge / phone-tier models

Sub-4B models for phone / Pi / embedded deployment. Examples include Phi-4 Mini, Gemma 3 1B, MiniCPM 3 4B, SmolLM 3, Hermes 3 3B, Dolphin 3 3B, and RWKV 7 Goose 1.5B.

Phi-4 Reasoning Mini 4B

Microsoft · 2026-04-08
3.8B · edge

edge-tier reasoning

Verdict

Gemma 4 E4B (Effective 4B)

Google · 2026-04-02
4B · edge

edge-tier Gemma 4 — laptop friendly

Benchmark · Verdict · Multimodal

Gemma 4 E2B (Effective 2B)

Google · 2026-04-02
2B · edge

phone-tier Gemma 4

Benchmark · Verdict · Multimodal

Phi-4 Mini 4B

Microsoft · 2026-02-25
3.8B · edge

edge / embedded reasoning

Verdict

SmolLM 3 3B

HuggingFace · 2025-11-04
3B · edge

edge-tier reasoning

Verdict

Qwen 3 4B

Alibaba · 2025-04-29
4B · edge

edge-tier Qwen 3 — Apple Silicon laptop friendly

Benchmark · Verdict

Enrichment gaps — OPERATOR queue

High-relevance catalog entries (7B-100B) that lack all three coverage signals: L1.25 enrichment, a verdict, and a benchmark. Their pages currently render with noindex, and they form the next sprint's editorial queue. Surfacing them here keeps the gap visible.
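The selection rule for this queue can be sketched as a simple filter. The `Coverage` record, its field names, and the inclusive 7B-100B bounds are assumptions made for illustration, not the catalog's actual schema. An entry qualifies only when it lacks all three coverage signals.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Coverage:
    # Hypothetical record shape; field names are assumptions.
    name: str
    params_b: float               # total parameters, in billions
    l125_enriched: bool = False
    has_verdict: bool = False
    has_benchmark: bool = False


def enrichment_gaps(entries: List[Coverage],
                    min_b: float = 7.0,
                    max_b: float = 100.0) -> List[Coverage]:
    """Entries in the size band with no enrichment, no verdict, and no benchmark."""
    return [
        e for e in entries
        if min_b <= e.params_b <= max_b
        and not (e.l125_enriched or e.has_verdict or e.has_benchmark)
    ]


entries = [
    Coverage("Turkish Gemma 9B T1", 9),
    Coverage("Qwen 2.5 Coder 14B Instruct", 14, has_verdict=True, has_benchmark=True),
    Coverage("Gemma 4 E2B (Effective 2B)", 2),  # outside the 7B-100B band
]

print([e.name for e in enrichment_gaps(entries)])
# ['Turkish Gemma 9B T1']
```

An entry with any one of the three signals drops out of the queue, which matches the "AND" wording of the rule.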

Turkish Gemma 9B T1

ytu-ce-cosmos
9B

Turkish Llama 8B Instruct v0.1

ytu-ce-cosmos
8B

Trendyol LLM 7B Chat v0.1

Trendyol
7B

Omni 31B Turkish Reasoning

community
31B

Cosmos Llama 3 8B Turkish

ytu-ce-cosmos
8B

Gemma 4 Turkish 26B (4B active)

esokullu
26B

Turkish Mistral 7B Instruct v0.2

ytu-ce-cosmos
7B

Mihenk LLM v2 35B (Turkish Financial)

emircansevdi
35B

Going deeper

  • Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
  • Execution stacks — recipes that combine models with runtimes + hardware.
  • Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
  • Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.