llama

8B parameters

Commercial OK

Reviewed May 2026

Turkish Llama 8B Instruct v0.1

Llama 3 8B with continued pre-training on Turkish corpora, then instruction-tuned for Turkish chat. It is the YTU CE COSMOS group's most-downloaded Llama variant. GGUF builds are available and load directly into Ollama.

License: Llama-3-Community · Context: 8,192 tokens

Overview

Strengths

Llama 3 base — proven architecture with broad tooling support
GGUF builds shipped alongside transformers checkpoint
Solid Turkish output quality at the 8B size class

Weaknesses

Continued pre-training can hurt English capability; use a different model if you need bilingual output
8K context inherited from Llama 3 base
v0.1 — newer YTU releases may have superseded this


Reviewed quality benchmarks

First-party rows were run by RunLocalAI; reviewed community rows are labeled in the data. Every row links to the raw test-run log.

Benchmark	Tested	Quant	Runtime / Hardware	Score	Raw log
TurkishMMLU (Generative)	2026-05-26	Q4_K_M	ollama-0.24 rtx-3080	11.0/100	Gist
TurkishMMLU (Generative)	2026-05-28	Q4_K_M	ollama-0.24 rtx-3080-16gb-mobile	11.0/100	Gist

Q4_K_M note (2026-05-26 run): Baseline run on Ollama 0.24 with the default 2048-token context window. The score is below the 20% random-guess baseline, a strong indicator that the 5-shot Turkish prompts (which average ~2,000 tokens due to Turkish morphology) were silently truncated by Ollama. A re-run with `num_ctx=8192` is expected to land at 30-45%. Published as-is so the methodology improvement is measurable; this row is intentionally NOT promoted to 'verified'.
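The context window mentioned in this note can be raised per request through Ollama's REST API by passing `num_ctx` in the `options` object. The following is a minimal sketch, not the runner used for the rows above: it assumes a local Ollama server on its default port, and `MODEL_TAG` is a hypothetical name standing in for whatever tag the GGUF was imported under.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "turkish-llama-8b-instruct"  # hypothetical tag; replace with your own


def build_generate_request(prompt: str, num_ctx: int = 8192) -> dict:
    """Build a non-streaming /api/generate payload with an explicit context window."""
    return {
        "model": MODEL_TAG,
        "prompt": prompt,
        "stream": False,
        # Without this option the server falls back to its default context size.
        "options": {"num_ctx": num_ctx},
    }


def generate(prompt: str, num_ctx: int = 8192) -> str:
    """POST the payload to the local server and return the generated text."""
    payload = json.dumps(build_generate_request(prompt, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `generate(prompt)` requires a running server; `build_generate_request` can be inspected on its own to confirm that the context setting is actually being sent.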

Q4_K_M note (2026-05-28 run): Re-run on an RTX 3080 Laptop (16 GB) with `num_ctx=8192` to test the earlier hypothesis that the prior 11% score was caused by Ollama's default 2048-token context window truncating the 5-shot Turkish prompts. The re-run **landed at the same 11.00%**, ruling out the truncation hypothesis. The honest reading: Turkish-Llama-8B-Instruct-v0.1 was trained as a **Turkish conversational** model, not a multiple-choice reasoning model. It speaks Turkish fluently but scores below even the 20% random-guess baseline on TurkishMMLU's scientific, historical, and literary subjects. Per-subject results: Biology 13%, Chemistry 6%, Geography 15% est., History 11%, Mathematics 12-15% est., Philosophy 15%, Physics 11%, Religious Culture & Ethics 10%, Turkish Language & Literature 12%. Use this model for Turkish chat and customer-service work, not for structured Q&A. Higher-knowledge Turkish models (Trendyol Asure 12B at 58.89%) are the right anchor for general-knowledge use cases.
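As a quick sanity check on the per-subject figures, the sketch below averages them and compares each against the 20% chance baseline. Two assumptions are made here that the page does not state: Mathematics is taken at the midpoint of its 12-15% range, and the mean is unweighted because per-subject question counts are not listed.

```python
CHANCE_BASELINE = 20.0  # percent, the random-guess baseline quoted in the note

# Per-subject scores in percent, copied from the note above.
# Mathematics uses the midpoint of the quoted 12-15% range (an assumption).
subject_scores = {
    "Biology": 13.0,
    "Chemistry": 6.0,
    "Geography": 15.0,
    "History": 11.0,
    "Mathematics": 13.5,
    "Philosophy": 15.0,
    "Physics": 11.0,
    "Religious Culture & Ethics": 10.0,
    "Turkish Language & Literature": 12.0,
}

# Unweighted mean across subjects.
unweighted_mean = sum(subject_scores.values()) / len(subject_scores)

# Subjects that fall below the chance baseline.
below_chance = [name for name, score in subject_scores.items() if score < CHANCE_BASELINE]

print(f"Unweighted mean: {unweighted_mean:.1f}%")
print(f"Subjects below chance: {len(below_chance)} of {len(subject_scores)}")
```

The unweighted mean comes out near 11.8%, slightly above the reported 11.0%; the gap is presumably down to subjects contributing different numbers of questions. Every listed subject sits below the chance baseline.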

Want to verify? Every row links to its Gist with the full stdout and stderr of the run. The runner script is in the public repo (scripts/run-humaneval-plus.ts), so runs are reproducible end-to-end. Browse all coding scores at /benchmarks/coding.

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

Quantization	File size	VRAM required
Q4_K_M	4.4 GB	6 GB
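The gap between file size and required VRAM is mostly the KV cache plus runtime overhead. The sketch below is a rough back-of-the-envelope estimate, not the method used to produce the table: it assumes the stock Llama 3 8B attention layout (32 layers, 8 KV heads, head dimension 128) carries over to this fine-tune, an fp16 KV cache, and a hypothetical 0.5 GB allowance for runtime buffers.

```python
# Llama 3 8B attention layout, assumed unchanged by the Turkish fine-tune.
N_LAYERS = 32
N_KV_HEADS = 8       # grouped-query attention
HEAD_DIM = 128
BYTES_PER_VALUE = 2  # fp16 KV cache (assumption; a quantized cache is smaller)


def kv_cache_bytes(context_tokens: int) -> int:
    """Bytes needed to hold keys and values for every layer at a given context length."""
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE * context_tokens


def estimate_vram_gb(file_size_gb: float, context_tokens: int, overhead_gb: float = 0.5) -> float:
    """Weights + KV cache + a flat overhead allowance, in decimal gigabytes."""
    return file_size_gb + kv_cache_bytes(context_tokens) / 1e9 + overhead_gb


if __name__ == "__main__":
    # Q4_K_M file size from the table above, at the full 8,192-token context.
    print(f"{estimate_vram_gb(4.4, 8192):.2f} GB")
```

Under these assumptions the full-context estimate is about 5.97 GB, which is consistent with the 6 GB figure in the table. Shorter contexts shrink the KV cache term proportionally.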

Get the model

HuggingFace

Original weights

huggingface.co/ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1

Source repository; direct quantization required.

Hardware that runs this

Cards with enough VRAM for at least one quantization of Turkish Llama 8B Instruct v0.1.

NVIDIA B300 (Blackwell Ultra)

Compare alternatives

Models worth comparing

Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.

Same tier

Models in the same parameter band as this one

Step up

More capable — bigger memory footprint

Step down

Smaller — faster, runs on weaker hardware

Frequently asked

What's the minimum VRAM to run Turkish Llama 8B Instruct v0.1?

6 GB of VRAM is enough to run Turkish Llama 8B Instruct v0.1 at the Q4_K_M quantization (file size 4.4 GB). Higher-quality quantizations need more.

Can I use Turkish Llama 8B Instruct v0.1 commercially?

Yes. Turkish Llama 8B Instruct v0.1 ships under the Llama-3-Community license, which permits commercial use subject to its terms. Always read the license text before deployment.

What's the context length of Turkish Llama 8B Instruct v0.1?

Turkish Llama 8B Instruct v0.1 supports a context window of 8,192 tokens (about 8K).

Source: huggingface.co/ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.

Before you buy

Verify Turkish Llama 8B Instruct v0.1 runs on your specific hardware before committing money.
