Frontier zone · Model releases

The frontier of open-weight model releases

Open-weight model releases tracked by RunLocalAI — recent additions, rising families, distill chains, multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.

By Fredoline Eruo · Refreshed continuously from catalog seed

Filtered results (46)

Mistral Medium 3 24B (dense)

Mistral AI · 2026-04-29

research / non-commercial workstation deployments

Granite 3.3 8B

IBM · 2026-03-12

enterprise tool-calling on IBM stacks

Mistral Small 3.2 24B

Mistral AI · 2026-03-08

consumer-tier multilingual instruction-following

Nemotron 3 Nano 9B

NVIDIA · 2026-01-22

NVIDIA-stack tool-calling agents

Nemotron 3 Nano (30B-A3B)

NVIDIA · 2026-01-15

NVIDIA-tuned consumer-tier general

DeepSeek V3 Lite (16B MoE)

DeepSeek AI · 2026-01-10

16B/2.4B-A · consumer

consumer-tier MoE inference

EXAONE 3.5 8B

LG AI Research · 2025-11-10

consumer-tier Korean workloads

InternLM 3 8B

Shanghai AI Lab · 2025-10-05

Chinese-language consumer workloads

Devstral Small 2 24B

Mistral AI · 2025-09-25

Apache 2.0 coding alternative to Qwen 2.5 Coder

Yi Coder 9B

01.AI · 2025-09-20

8GB-VRAM coding

Qwen 3 7B

Alibaba · 2025-09-15

consumer-tier reasoning on 8GB+ GPUs

Qwen 3 Embedding 8B

Alibaba · 2025-06-05

permissively-licensed embeddings at 8B

Qwen 3 8B

Alibaba · 2025-04-29

consumer-tier reasoning toggle

Granite 3 MoE (3B active)

IBM · 2025-04-15

16B/3B-A · consumer

consumer-tier enterprise MoE

Llama 3.3 8B Instruct

Meta · 2025-04-12

consumer-tier chat — drop-in 3.1 8B replacement

Llama 3.1 Nemotron Nano 8B

NVIDIA · 2025-04-08

consumer-tier Nemotron-Llama

DeepSeek R1 Distill Mistral 24B

Community (DeepSeek-derived) · 2025-03-18

consumer-tier reasoning with Mistral instruction lineage

Granite 3.2 8B

IBM · 2025-02-25

enterprise tool-calling on IBM stacks

Mistral Saba 24B

Mistral AI · 2025-02-17

Arabic / South-Asian multilingual

Mistral Small 3 24B

Mistral AI · 2025-01-30

consumer-tier multilingual instruction-following — Mistral's instruction-tuned baseline at 24B

L1.25 enriched · Verdict

Dolphin 3.0 Mistral 24B

Cognitive Computations · 2025-01-30

consumer-tier creative / less-restricted generation

DeepSeek R1 Distill Qwen 14B

DeepSeek · 2025-01-20

consumer-tier reasoning at 14B

DeepSeek R1 Distill Llama 8B

DeepSeek AI · 2025-01-20

consumer-tier reasoning on 8GB+ GPUs

Falcon 3 10B

TII (Abu Dhabi) · 2024-12-17

Arabic-language workloads

Falcon 3 7B Instruct

TII (UAE) · 2024-12-17

consumer-tier multilingual

Phi-4 14B

Microsoft · 2024-12-12

16 GB VRAM tier reasoning + chat — the right pick when 32B-class doesn't fit

L1.25 enriched · Verdict

OLMo 2 13B

AI2 (Allen Institute) · 2024-11-26

reproducibility / academic research

Tulu 3 8B

Allen Institute (AI2) · 2024-11-21

fully-open instruction-following research baseline

Qwen 2.5 Coder 7B Instruct

Alibaba · 2024-11-12

consumer-tier coding at 8GB VRAM

OpenCoder 8B

INFLY AI · 2024-11-09

academic / reproducibility-sensitive coding research

Baichuan 4 13B

Baichuan AI · 2024-10-30

Chinese-language consumer workloads — alternative to GLM

Granite 3.0 8B Instruct

IBM · 2024-10-21

enterprise-friendly Apache 2.0 baseline

Ministral 8B Instruct

Mistral AI · 2024-10-16

consumer-tier long-context — research only

Qwen 2.5 14B Instruct

Alibaba · 2024-09-19

16GB-VRAM general chat with multilingual depth

Qwen 2.5 Math 7B

Alibaba · 2024-09-19

consumer-tier math problem solving

NV-Embed v2

NVIDIA · 2024-09-09

research-grade embeddings

Falcon Mamba 7B

TII (Abu Dhabi) · 2024-08-12

long-context inference where memory matters

Codestral Mamba 7B

Mistral AI · 2024-07-16

long-context coding workloads where memory matters

InternLM 2.5 7B Chat

Shanghai AI Lab · 2024-07-03

permissively-licensed long-context chat

GLM-4 9B

Zhipu AI · 2024-06-15

Chinese tool-calling agents

Aya 23 8B

Cohere For AI · 2024-05-23

multilingual research at consumer tier

CodeQwen 1.5 7B

Alibaba · 2024-04-16

historical reference — Qwen 2.5 Coder 7B is the modern pick

Stable LM 2 12B

Stability AI · 2024-04-08

12B-class deployments tolerating Stability membership

StarCoder 2 15B

BigCode · 2024-02-28

permissively-licensed coding at 16GB-VRAM

StarCoder 2 7B

BigCode · 2024-02-28

consumer-tier code completion at 8GB

DeepSeek MoE 16B Base

DeepSeek AI · 2024-01-15

16B/2.4B-A · consumer

research / lineage reference

Going deeper

Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
Execution stacks — recipes that combine models with runtimes + hardware.
Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.