The frontier of open-weight model releases
Open-weight model releases tracked by RunLocalAI — recent additions, rising families, distill chains, multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.
Filtered results (29)
Llama 4 70B
production self-hosted serving on 2x A100 / H100
EVA Llama 3.3 70B
datacenter-tier creative / narrative generation
Llama 3.3 8B Instruct
consumer-tier chat — drop-in 3.1 8B replacement
Llama 3.1 Nemotron Nano 8B
consumer-tier Nemotron-Llama
Llama 3.3 70B Instruct
production self-hosted serving at the 70B class, for general-purpose capability above the 32B class when a frontier-tier model isn't needed
Llama 3.1 Nemotron 70B Instruct
NVIDIA-fine-tuned Llama 3.1 70B
Llama 3.2 90B Vision
datacenter-tier multimodal serving
Llama 3.2 90B Vision Instruct
datacenter vision-language Llama at 70B-class
Llama 3.2 11B Vision
consumer-tier multimodal — Llama-ecosystem migration path for vision workflows
Llama 3.1 70B Instruct
production self-hosted serving at the 70B class
Phind CodeLlama 34B v2
historical reference for Llama 2 coder lineage
Hermes 4 70B FP8
English/multilingual STEM reasoning and structured data extraction
ALIA 40b instruct 2601
Spanish-region enterprise document processing and multilingual Iberian assistant apps
OpenThaiGPT 1.0.0 Beta 13B Chat
Basic Thai-language instruction following and Q&A
Bielik 11B v3.0 Instruct GGUF
Polish-language instruction following and document Q&A
Bielik-11B v3.0 Instruct FP8 Dynamic
Polish-language instruction following and chat on constrained GPU hardware
SOLAR 10.7B v1.0
Foundation for custom fine-tuning pipelines
Saiga Llama3 8B GGUF
Russian conversational assistant on consumer hardware
Cosmos Llama 3 8B Turkish
LLM-jp 4 8B Instruct
Japanese/English bilingual document summarization and extraction
LLM-jp 4 8B Thinking
Internal Japanese-English reasoning and document processing pipelines
Turkish Llama 8B Instruct v0.1
Gervásio 8B PTPT
European Portuguese text tasks where occasional Brazilian Portuguese (PT-BR) output is acceptable
Trendyol LLM 7B Chat v0.1
Swallow 7B
Japanese-language fine-tuning base or research
Trendyol LLM 7B Base v0.1
OpenThaiGPT 7B 1.0.0 Chat
Thai-language instruction-following and chat
Salamandra 7B Instruct
Spanish and European multilingual chat prototyping
Salamandra 7B
Spanish/Catalan fine-tuning base for custom NLP pipelines
Going deeper
- Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
- Execution stacks — recipes that combine models with runtimes + hardware.
- Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
- Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.