Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

Frontier zone · Model releases

The frontier of open-weight model releases

Open-weight model releases tracked by RunLocalAI — recent additions, rising families, distill chains, multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.

By Fredoline Eruo · Refreshed continuously from catalog seed

Filter

Family: Any, Qwen, Llama, DeepSeek, Mistral, Gemma, Phi, GLM, OLMo
Deployment: Any, Edge, Consumer, Workstation, Datacenter, Frontier
Modality: Any, Multimodal, Text-only
Coverage: Any, L1.25 enriched, Needs L1.25, Needs benchmark

What this surface tracks

Recent releases — sorted by release date (newest first). Catalog entries lacking release dates are excluded.
Reasoning models — DeepSeek R1 family, QwQ, Kimi, Magistral, Qwen 3 reasoning-toggle.
Coding models — Qwen Coder, DeepSeek Coder, Codestral, Devstral, OpenCoder, CodeLlama, Yi Coder.
Multimodal — Llama Vision, Qwen-VL, Pixtral, Janus, Phi multimodal, MiniCPM-V, Moondream, Gemma 3 IT.
MoE — DeepSeek V3/V4, Qwen 3 MoE, Llama 4 Maverick, Mixtral, Hunyuan, Step-3.
Edge / phone tier — sub-4B models for embedded / edge / phone deployments.
Enrichment gaps — published catalog entries with no L1.25 / verdict / benchmark; the OPERATOR queue.

Recent releases (12 newest)

Catalog entries with the most recent release dates. Use the authority badges to spot which have full editorial coverage (L1.25 enriched + benchmark) and which are catalog-only.
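The selection rule described for this list (newest first, undated entries excluded, 12 shown) can be sketched as below. This is a minimal illustration, not the site's actual code: the `CatalogEntry` type, the `recent_releases` helper, and the "Undated entry" row are hypothetical; the dated rows reuse names and dates from the cards on this page.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CatalogEntry:
    # Hypothetical record shape; the real catalog schema is not shown in the source.
    name: str
    release_date: Optional[date]  # None when the catalog has no release date


def recent_releases(entries: List[CatalogEntry], limit: int = 12) -> List[CatalogEntry]:
    """Drop undated entries, sort newest first, keep the first `limit`."""
    dated = [e for e in entries if e.release_date is not None]
    return sorted(dated, key=lambda e: e.release_date, reverse=True)[:limit]


entries = [
    CatalogEntry("Qwen 3.6 27B (MTP)", date(2026, 5, 11)),
    CatalogEntry("Ring-2.6-1T", date(2026, 5, 14)),
    CatalogEntry("Undated entry", None),  # hypothetical row, excluded by the rule
    CatalogEntry("OLMo 2 32B", date(2026, 4, 12)),
]

names = [e.name for e in recent_releases(entries)]
print(names)
```

Because Python's sort is stable, entries sharing a release date keep their original catalog order.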

Ring-2.6-1T

InclusionAI / Ant Group · 2026-05-14

1000B/32B-A · frontier

frontier reasoning at MoE serving cost

Qwen 3.6 35B-A3B (MTP)

Alibaba / Qwen team · 2026-05-11

35B/3B-A · workstation

high-throughput MoE inference at workstation tier

Qwen 3.6 27B (MTP)

Alibaba / Qwen team · 2026-05-11

dense workstation model with throughput-acceleration

Qwen 3.5 235B-A17B (MoE)

Alibaba · 2026-05-01

397B/17B-A · frontier

frontier-tier reasoning + multilingual serving on multi-machine clusters

L1.25 enriched · Verdict

Mistral Medium 3.5 (675B MoE)

Mistral AI · 2026-04-29

675B/41B-A · frontier

frontier MoE — Mistral's response to the open MoE wave

Mistral Medium 3 24B (dense)

Mistral AI · 2026-04-29

research / non-commercial workstation deployments

DeepSeek V4 Pro (1.6T MoE)

DeepSeek · 2026-04-24

1600B/49B-A · frontier

frontier-tier coding + reasoning serving — currently the open-weight ceiling

L1.25 enriched · Verdict

DeepSeek V4 Flash (284B MoE)

DeepSeek · 2026-04-24

284B/13B-A · datacenter

datacenter MoE — V4 efficiency variant

OLMo 2 32B

AI2 (Allen AI) · 2026-04-12

fully-open AI2 OLMo 2 — research provenance flagship

Phi-4 Reasoning Mini 4B

Microsoft · 2026-04-08

edge-tier reasoning

Llama 4 Maverick

Meta · 2026-04-05

frontier-tier multimodal serving on multi-machine clusters

L1.25 enriched · Verdict · Multimodal

Llama 4 Scout

Meta · 2026-04-05

production multimodal serving — image + text at workstation-cluster scale

L1.25 enriched · Verdict

New reasoning models

Models with explicit thinking-block emission: DeepSeek R1 family, QwQ, Kimi, Magistral, Qwen 3 reasoning-mode. See /stacks/local-reasoning-model for the canonical deployment recipe.

Kimi K2.6

Moonshot AI · 2026-03-10

Moonshot frontier MoE — long-context specialist

Magistral 32B

Mistral AI · 2025-12-15

research / non-commercial reasoning at 32B scale

Kimi K1.5

Moonshot AI · 2025-12-01

deep math + reasoning research

Qwen 3 Coder 32B

Alibaba · 2025-11-20

coding-specialized agent workloads

DeepSeek R1 Distill Qwen 3 32B

DeepSeek AI · 2025-11-15

workstation reasoning with Qwen 3 base improvements

Qwen 3 235B-A22B

Alibaba · 2025-04-29

Qwen 3 MoE flagship — pre-3.5 baseline

New coding models

Coding-specialized fine-tunes. The Qwen Coder lineage is the current open-weight benchmark leader; DeepSeek Coder V3, Codestral, Devstral, and OpenCoder are the credible alternatives. See /stacks/local-coding-agent for the canonical deployment recipe.

DeepSeek Coder V3

DeepSeek AI · 2026-02-08

workstation coding alternative to Qwen 2.5 Coder

Devstral Small 2 24B

Mistral AI · 2025-09-25

Apache 2.0 coding alternative to Qwen 2.5 Coder

Yi Coder 9B

01.AI · 2025-09-20

8GB-VRAM coding

Qwen 2.5 Coder 32B Instruct

Alibaba · 2024-11-12

single-user autonomous coding agents on RTX 4090 / 5090 / dual-A100 hardware

L1.25 enriched · Verdict

Qwen 2.5 Coder 14B Instruct

Alibaba · 2024-11-12

16GB-VRAM coding

Benchmark · Verdict

Qwen 2.5 Coder 7B Instruct

Alibaba · 2024-11-12

consumer-tier coding at 8GB VRAM

New multimodal models

Vision-language models. The 2025-2026 wave includes Llama 4 Scout / Maverick, Qwen 2.5-VL, Pixtral, Janus-Pro, and Phi-4 Multimodal. See /stacks/local-vision-model for the canonical deployment recipe.

Llama 4 Maverick

Meta · 2026-04-05

frontier-tier multimodal serving on multi-machine clusters

L1.25 enriched · Verdict · Multimodal

Gemma 4 31B Dense

Google · 2026-04-02

workstation-tier multilingual chat with permissive license

L1.25 enriched · Verdict · Multimodal

Gemma 4 26B MoE

Google · 2026-04-02

Gemma 4 MoE — workstation efficiency variant

Verdict · Multimodal

Gemma 4 E4B (Effective 4B)

Google · 2026-04-02

edge-tier Gemma 4 — laptop friendly

Benchmark · Verdict · Multimodal

Gemma 4 E2B (Effective 2B)

Google · 2026-04-02

phone-tier Gemma 4

Benchmark · Verdict · Multimodal

Phi-4 Multimodal

Microsoft · 2026-02-25

16GB-consumer multimodal Q&A

Verdict · Multimodal

New MoE models

Mixture-of-Experts releases. Active-parameter efficiency shapes the deployment economics. See /systems/distributed-inference for the architectural depth.
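As a rough illustration of why active-parameter counts matter, the sketch below computes the active fraction and a weights-only memory estimate from the total/active figures shown on the cards in this section. The helper names are hypothetical, and 4 bits per weight is an assumed quantization level, not a figure from the catalog. The estimate ignores KV cache, activations, and quantization overhead; it also assumes all experts stay resident in memory, which is the typical case unless experts are offloaded.

```python
def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters used per token (both arguments in billions)."""
    return active_b / total_b


def weight_memory_gb(total_b: float, bits_per_weight: float) -> float:
    """Weights-only footprint in GB (10^9 bytes): params * bits / 8."""
    return total_b * bits_per_weight / 8


# (total, active) in billions, copied from the card badges on this page.
models = {
    "Ring-2.6-1T": (1000, 32),
    "Qwen 3.6 35B-A3B (MTP)": (35, 3),
    "DeepSeek V4 Pro (1.6T MoE)": (1600, 49),
    "DeepSeek V4 Flash (284B MoE)": (284, 13),
}

BITS = 4  # assumed quantization level, for illustration only

for name, (total, active) in models.items():
    frac = active_fraction(total, active)
    mem = weight_memory_gb(total, BITS)
    print(f"{name}: {frac:.1%} active, ~{mem:.1f} GB weights at {BITS}-bit")
```

Under these assumptions, per-token compute tracks the active count while memory tracks the total, which is why a model with a small active fraction can still need cluster-scale memory.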

Ring-2.6-1T

InclusionAI / Ant Group · 2026-05-14

1000B/32B-A · frontier

frontier reasoning at MoE serving cost

Qwen 3.6 35B-A3B (MTP)

Alibaba / Qwen team · 2026-05-11

35B/3B-A · workstation

high-throughput MoE inference at workstation tier

Qwen 3.5 235B-A17B (MoE)

Alibaba · 2026-05-01

397B/17B-A · frontier

frontier-tier reasoning + multilingual serving on multi-machine clusters

L1.25 enriched · Verdict

Mistral Medium 3.5 (675B MoE)

Mistral AI · 2026-04-29

675B/41B-A · frontier

frontier MoE — Mistral's response to the open MoE wave

DeepSeek V4 Pro (1.6T MoE)

DeepSeek · 2026-04-24

1600B/49B-A · frontier

frontier-tier coding + reasoning serving — currently the open-weight ceiling

L1.25 enriched · Verdict

DeepSeek V4 Flash (284B MoE)

DeepSeek · 2026-04-24

284B/13B-A · datacenter

datacenter MoE — V4 efficiency variant

New edge / phone-tier models

Sub-4B models for phone / Pi / embedded deployment: Phi-4 Mini, Gemma 3 1B, MiniCPM 3 4B, SmolLM 3, Hermes 3 3B, Dolphin 3 3B, RWKV 7 Goose 1.5B.

Phi-4 Reasoning Mini 4B

Microsoft · 2026-04-08

edge-tier reasoning

Gemma 4 E4B (Effective 4B)

Google · 2026-04-02

edge-tier Gemma 4 — laptop friendly

Benchmark · Verdict · Multimodal

Gemma 4 E2B (Effective 2B)

Google · 2026-04-02

phone-tier Gemma 4

Benchmark · Verdict · Multimodal

Phi-4 Mini 4B

Microsoft · 2026-02-25

edge / embedded reasoning

SmolLM 3 3B

HuggingFace · 2025-11-04

edge-tier reasoning

Qwen 3 4B

Alibaba · 2025-04-29

edge-tier Qwen 3 — Apple Silicon laptop friendly

Benchmark · Verdict

Enrichment gaps — OPERATOR queue

High-relevance catalog entries (7B-100B) that lack L1.25 enrichment, a verdict, and a benchmark. These pages render noindex today and make up the next sprint's editorial queue. Surfacing them here keeps the gap visible.
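The gap rule stated here (7B-100B, with no L1.25 enrichment, no verdict, and no benchmark) could be expressed as a simple predicate. This is a sketch under assumptions: the `Entry` type and `is_enrichment_gap` helper are hypothetical, the size bounds are treated as inclusive, and parameter counts are read from the model names. The badge flags for the sample rows follow the cards shown on this page.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    # Hypothetical record shape; the real catalog schema is not shown in the source.
    name: str
    params_b: float          # parameter count in billions
    enriched: bool = False   # has L1.25 enrichment
    verdict: bool = False
    benchmark: bool = False


def is_enrichment_gap(e: Entry, lo: float = 7, hi: float = 100) -> bool:
    """True when the entry is in the size band and has none of the three badges."""
    in_band = lo <= e.params_b <= hi  # inclusive bounds are an assumption
    return in_band and not (e.enriched or e.verdict or e.benchmark)


catalog = [
    Entry("Turkish Gemma 9B T1", 9),
    Entry("Qwen 2.5 Coder 14B Instruct", 14, verdict=True, benchmark=True),
    Entry("SmolLM 3 3B", 3),  # no badges, but below the 7B floor
]

queue = [e.name for e in catalog if is_enrichment_gap(e)]
print(queue)
```

An entry with even one badge drops out of the queue, matching the "AND" in the rule: only entries missing all three qualify.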

Turkish Gemma 9B T1

Turkish Llama 8B Instruct v0.1

Trendyol LLM 7B Chat v0.1

Omni 31B Turkish Reasoning

Cosmos Llama 3 8B Turkish

Gemma 4 Turkish 26B (4B active)

Turkish Mistral 7B Instruct v0.2

Mihenk LLM v2 35B (Turkish Financial)

Going deeper

Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
Execution stacks — recipes that combine models with runtimes + hardware.
Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.