The frontier of open-weight model releases
Open-weight model releases tracked by RunLocalAI — recent additions, rising families, distill chains, multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.
Filtered results (5)
Llama 4 Scout
Production multimodal serving: image + text at workstation-cluster scale
Llama 3.3 70B Instruct
Production self-hosted serving at the 70B class, for when you need general-purpose capability above the 32B class but not at the frontier tier
Salamandra 2B
Fine-tuning base for Spanish, Catalan, Galician, or Basque NLP tasks
Salamandra 2B Instruct
Spanish and European multilingual instruction following on low-VRAM hardware
TinyLlama 1.1B Chat v1.0
Reproducible SLM research baseline and legacy llama.cpp deployments
Going deeper
- Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
- Execution stacks — recipes that combine models with runtimes + hardware.
- Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
- Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.