BLK · MODEL RELEASE TRACKER

What just dropped.

Newest local AI models, date-sorted. Every row carries quick fit verdicts for the four VRAM classes operators ask about — so you know in one glance whether to bother downloading a model before it starts loading. 60 models indexed.

For broader ecosystem news see /pulse. For the recommendation engine see /choose-my-gpu.

60 models shown · newest first

Added	Model	Params	Modality	8GB	16GB	24GB	48GB	96GB+
2d ago	Kimi K2.7-Code moonshot · released 2026-06-12	1000B	TEXT	✗	✗	✗	✗	✗
2d ago	VibeThinker-3B qwen · released 2026-06-12	3B	TEXT	✓	✓	✓	✓	✓
2d ago	Nemotron 3 Ultra (550B-A55B) other · released 2026-06-04	550B	TEXT	✗	✗	✗	✗	✗
2d ago	MiniMax-M3 other · released 2026-06-02	428B	TEXT	✗	✗	✗	✗	✗
2d ago	GLM-5.2 glm · released 2026-06-16	753B	TEXT	✗	✗	✗	✗	✗
1mo ago	paraphrase-multilingual-MiniLM-L12-v2 other	118M	TEXT	✓	✓	✓	✓	✓
1mo ago	all-mpnet-base-v2 other	109M	TEXT	✓	✓	✓	✓	✓
1mo ago	all-MiniLM-L6-v2 other	22M	TEXT	✓	✓	✓	✓	✓
1mo ago	Command R7B (12-2024) command-r	8B	TEXT	~	✓	✓	✓	✓
1mo ago	OpenELM 3B Instruct other	3B	TEXT	✓	✓	✓	✓	✓
1mo ago	SmolVLM Instruct other	2.25B	VLM	✓	✓	✓	✓	✓
1mo ago	DeepSeek V2 Lite Chat deepseek	15.7B	TEXT	✗	✓	✓	✓	✓
1mo ago	OLMo 2 1B Instruct olmo	1B	TEXT	✓	✓	✓	✓	✓
1mo ago	Falcon 3 3B Instruct falcon	3B	TEXT	✓	✓	✓	✓	✓
1mo ago	Granite 3.1 2B Instruct granite	2B	TEXT	✓	✓	✓	✓	✓
1mo ago	Qwen2-VL 2B Instruct qwen	2B	VLM	✓	✓	✓	✓	✓
1mo ago	TinyLlama 1.1B Chat v1.0 llama	1.1B	TEXT	✓	✓	✓	✓	✓
1mo ago	SmolLM2 360M Instruct other	360M	TEXT	✓	✓	✓	✓	✓
1mo ago	SmolLM2 135M Instruct other	135M	TEXT	✓	✓	✓	✓	✓
1mo ago	Gemma 2 2B Instruct gemma	2B	TEXT	✓	✓	✓	✓	✓
1mo ago	Gemma 3 270M gemma	270M	TEXT	✓	✓	✓	✓	✓
1mo ago	Qwen 3 1.7B qwen	1.7B	TEXT	✓	✓	✓	✓	✓
1mo ago	Qwen 3 0.6B qwen	600M	TEXT	✓	✓	✓	✓	✓
1mo ago	GOT-OCR 2.0 stepfun	580M	VLM	✓	✓	✓	✓	✓
1mo ago	Florence-2 Large other	770M	VLM	✓	✓	✓	✓	✓
1mo ago	ColPali v1.3 gemma	3B	VLM	✓	✓	✓	✓	✓
1mo ago	SigLIP SO400M (patch14-384) other	428M	VLM	✓	✓	✓	✓	✓
1mo ago	Stable Diffusion 3.5 Medium other	2.5B	TEXT	✓	✓	✓	✓	✓
1mo ago	SDXL Turbo other	2.6B	TEXT	✓	✓	✓	✓	✓
1mo ago	FLUX.1 [schnell] other	12B	TEXT	△	✓	✓	✓	✓
1mo ago	FLUX.1 [dev] other	12B	TEXT	△	✓	✓	✓	✓
1mo ago	mxbai-rerank-large-v2 other	1.54B	TEXT	✓	✓	✓	✓	✓
1mo ago	Jina Reranker v2 Base Multilingual other	278M	TEXT	✓	✓	✓	✓	✓
1mo ago	Multilingual E5 Large Instruct other	560M	TEXT	✓	✓	✓	✓	✓
1mo ago	E5 Mistral 7B Instruct other	7.11B	TEXT	~	✓	✓	✓	✓
1mo ago	GTE ModernBERT Base other	149M	TEXT	✓	✓	✓	✓	✓
1mo ago	Snowflake Arctic Embed L v2.0 other	568M	TEXT	✓	✓	✓	✓	✓
1mo ago	Jina Embeddings v3 other	572M	TEXT	✓	✓	✓	✓	✓
1mo ago	mxbai-embed-large-v1 other	335M	TEXT	✓	✓	✓	✓	✓
1mo ago	BGE Large EN v1.5 other	335M	TEXT	✓	✓	✓	✓	✓
1mo ago	Nomic Embed Text v1.5 other	137M	TEXT	✓	✓	✓	✓	✓
1mo ago	Piper other	25M	AUDIO	✓	✓	✓	✓	✓
1mo ago	Orpheus 3B 0.1 FT other	3B	AUDIO	✓	✓	✓	✓	✓
1mo ago	F5-TTS other	336M	AUDIO	✓	✓	✓	✓	✓
1mo ago	XTTS v2 other	460M	AUDIO	✓	✓	✓	✓	✓
1mo ago	Kokoro 82M other	82M	AUDIO	✓	✓	✓	✓	✓
1mo ago	Parakeet TDT 0.6B v2 other	600M	AUDIO	✓	✓	✓	✓	✓
1mo ago	Distil-Whisper Large v3 other	756M	AUDIO	✓	✓	✓	✓	✓
1mo ago	Whisper Small other	244M	AUDIO	✓	✓	✓	✓	✓
1mo ago	Whisper Base other	74M	AUDIO	✓	✓	✓	✓	✓
1mo ago	Whisper Tiny other	39M	AUDIO	✓	✓	✓	✓	✓
1mo ago	Sarvam 30B other	30B	TEXT	✗	△	~	✓	✓
1mo ago	Typhoon S ThaiLLM 8B Instruct Research Preview other	8B	TEXT	~	✓	✓	✓	✓
1mo ago	OpenThaiGPT 1.5 7B Instruct other	7B	TEXT	✓	✓	✓	✓	✓
1mo ago	PhoGPT 4B other	3.7B	TEXT	✓	✓	✓	✓	✓
1mo ago	PhoGPT 4B Chat other	3.7B	TEXT	✓	✓	✓	✓	✓
1mo ago	Vikhr Qwen 2.5 0.5B Instruct other	500M	TEXT	✓	✓	✓	✓	✓
1mo ago	Dostoevsky Doesn't Write It GPT2 other	175M	TEXT	✓	✓	✓	✓	✓
1mo ago	mGPT 13B other	13B	TEXT	△	✓	✓	✓	✓
1mo ago	mGPT 1.3B Mongol other	1.3B	TEXT	✓	✓	✓	✓	✓

✓

Comfortable

Q4 fits with KV headroom

Tight

Q4 fits, small context

△

Marginal

IQ3 only, expect degradation

✗

Doesn't fit

Won't run usefully

HOW FIT VERDICTS ARE DERIVED

Quick footprint estimate at Q4_K_M: params × 0.6 GB + 1.5 GB runtime overhead. Comfortable means the rig has ≥1.4× headroom for KV cache and multi-turn context. Tight means it fits but you'll bump the ceiling on long context. Marginal means only aggressive (IQ3 / IQ2) quants work, with quality degradation. Doesn't fit means weights alone won't load without RAM offload, which crushes tok/s. Frontier-class models (400B+) render as no-fit on every single-rig VRAM class — they need multi-GPU or cloud.

> What just dropped.

What just dropped.