Kumru 2B
Kumru 2B is a compact Turkish text-generation model from VNGRS. The Hugging Face config reports a Mistral-family architecture with an 8K context window, and the public Ollama build makes it a practical choice for fast Turkish inference on edge and consumer hardware.
Strengths
- Very high local throughput on consumer GPUs (443.7 tok/s on an RTX 5080 and 174.2 tok/s on a mobile RTX 3080 at Q4_K_M)
- Apache-2.0 license with commercial use allowed
- Small enough for low-VRAM Turkish assistants
Weaknesses
- 2B-class quality ceiling on complex reasoning
- 8K context, shorter than the 32K offered by current Turkish Mistral derivatives
- Best suited to short Turkish interactions, not deep research
Quantization variants
Each quantization level trades some model quality for a smaller file and lower VRAM use. Q4_K_M is the most popular starting point and is the only variant listed here.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 1.5 GB | 2 GB |
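The table can also be used as a quick fit check against a specific card. The sketch below is a minimal illustration: the `Variant` class, the `variants_that_fit` helper, and the `headroom_gb` safety margin are hypothetical names introduced here, and only the Q4_K_M figures come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    file_size_gb: float
    vram_required_gb: float


# Figures copied from the quantization table; extend the list if more
# variants are published.
VARIANTS = [Variant("Q4_K_M", 1.5, 2.0)]


def variants_that_fit(free_vram_gb: float, headroom_gb: float = 0.0) -> list:
    """Return the names of variants whose VRAM requirement fits.

    headroom_gb is an assumed safety margin (for the desktop, other apps,
    or a longer context); it is not a figure from the table.
    """
    usable = free_vram_gb - headroom_gb
    return [v.name for v in VARIANTS if v.vram_required_gb <= usable]


print(variants_that_fit(4.0))                   # a 4 GB card
print(variants_that_fit(2.0, headroom_gb=0.5))  # a 2 GB card with margin
```

With a 4 GB card the list contains Q4_K_M; with a 2 GB card and a 0.5 GB margin it comes back empty, which is a hint to close other GPU workloads or accept a tighter fit.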
Get the model
Ollama
One-line install
ollama run alibayram/kumru:latest
HuggingFace
Original weights
Source repository with the original weights. You need to quantize them yourself before running them in a quantized format.
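Once the Ollama command has pulled the model, it can also be called from a script through Ollama's local HTTP API. The following is a minimal sketch using only the Python standard library. It assumes the Ollama server is running on its default port (11434) and that the model tag matches the install command; the helper names `build_payload` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint
MODEL_TAG = "alibayram/kumru:latest"                # tag from the install command


def build_payload(prompt: str, model: str = MODEL_TAG) -> dict:
    """Build a non-streaming generate request body."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str, url: str = OLLAMA_URL, timeout: float = 60.0) -> dict:
    """POST the prompt to the local Ollama server and return the parsed JSON reply."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `generate("Türkiye'nin başkenti neresidir?")["response"]` returns the generated text. Keeping `stream` set to `False` returns the whole reply in one JSON object, which is simpler for short Turkish interactions than handling a token stream.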
Benchmarks
Real measurements on real hardware. Numbers ship with the runner version, quant, and date.
| Hardware | Provenance | Quant | Ctx | Tokens / sec | VRAM | TTFT | Date |
|---|---|---|---|---|---|---|---|
| NVIDIA GeForce RTX 5080 | EditorialM | Q4_K_M | 2K | 443.7 | — | — | May 28, 26 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | EditorialM | Q4_K_M | 4K | 174.2 | — | 129 ms | Jun 2, 26 |
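To compare your own hardware against the table, one option is to derive tokens per second from the timing fields that Ollama includes in a non-streaming generate response (`eval_count` and `eval_duration`, the latter in nanoseconds). The page does not state how its figures were measured, so treat this as one plausible method rather than the methodology behind the table. The function names and the sample response values are hypothetical.

```python
NS_PER_SECOND = 1_000_000_000


def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation throughput from Ollama's eval_count and eval_duration fields."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    return eval_count / (eval_duration_ns / NS_PER_SECOND)


def seconds_for_reply(n_tokens: int, tok_per_s: float, ttft_ms: float = 0.0) -> float:
    """Rough wall-clock estimate: time to first token plus generation time."""
    return ttft_ms / 1000.0 + n_tokens / tok_per_s


# Hypothetical response fragment, not a real measurement.
sample = {"eval_count": 200, "eval_duration": 1_000_000_000}
print(tokens_per_second(sample["eval_count"], sample["eval_duration"]))  # 200.0

# Using the mobile RTX 3080 row: 174.2 tok/s with a 129 ms TTFT.
print(round(seconds_for_reply(300, 174.2, ttft_ms=129), 2))  # 1.85
```

The second call shows why throughput at this level suits short interactions: a 300-token reply on the mobile RTX 3080 row works out to under two seconds.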
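Frequently asked
What's the minimum VRAM to run Kumru 2B?
About 2 GB for the Q4_K_M build, which is a 1.5 GB file.
Can I use Kumru 2B commercially?
Yes. The model is released under the Apache-2.0 license, which allows commercial use.
What's the context length of Kumru 2B?
8K tokens, according to the Hugging Face config.
How do I install Kumru 2B with Ollama?
Run ollama run alibayram/kumru:latest, which downloads the model on first use and then starts an interactive session.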
Source: huggingface.co/vngrs-ai/Kumru-2B
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.