The frontier of open-weight model releases
Open-weight model releases tracked by RunLocalAI — recent additions, rising families, distill chains, multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.
Filtered results (20)
Gemma 4 31B Dense
workstation-tier multilingual chat with permissive license
Gemma 4 26B MoE
Gemma 4 MoE — workstation efficiency variant
Gemma 4 E4B (Effective 4B)
edge-tier Gemma 4 — laptop friendly
Gemma 4 E2B (Effective 2B)
phone-tier Gemma 4
MedGemma 27B
medical-domain fine-tune of Gemma 3 27B
Gemma 3 27B
Google's open-weight workstation-tier multilingual flagship — pre-Gemma-4 baseline
Gemma 3 12B
consumer-tier multilingual chat with vision support in the 'it' (instruction-tuned) variant
Gemma 3 4B
edge-tier chat — Apple Silicon laptop friendly
Gemma 3 1B
phone-tier Gemma — smallest practical Gemma 3
PaliGemma 2 10B
VLM fine-tuning at 24GB VRAM
PaliGemma 2 3B
task-specific VLM fine-tuning base
Gemma 2 9B Instruct
consumer-tier Gemma — pre-Gemma-3 baseline
CodeGemma 7B
Gemma-derived coding model
Gemma 4 Turkish 26B (4B active)
Trendyol LLM Asure 12B
Turkish business workflow assistants
YTU Turkish Gemma 9B v0.1
Turkish instruction following on 16GB GPUs
Turkish Gemma 9B T1
ColPali v1.3
Visual-document retrieval for multi-page PDFs with charts, tables, and scans where OCR pipelines fail
Gemma 2 2B Instruct
Consumer-GPU local chat with strong safety defaults
Gemma 3 270M
Fine-tuning base for sub-1W on-device classifiers and routers
Going deeper
- Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
- Execution stacks — recipes that combine models with runtimes + hardware.
- Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
- Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.