Trendyol LLM Asure 12B
Overview
Trendyol LLM Asure 12B is a Gemma 3 based multimodal instruct model for Turkish and English business workflows. The public Ollama build used in local testing is the alibayram GGUF distribution.
How to run it
The locally tested route is Ollama with the alibayram/Trendyol-LLM-Asure-12B:latest tag, which points at the Q4_K_M GGUF mirror. On a 16GB RTX 5080 it loads comfortably for text-only chat and TurkishMMLU-style evaluation; keep num_ctx explicit because Ollama defaults can silently truncate 5-shot benchmark prompts.
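One way to keep num_ctx explicit is to set it on every request through Ollama's native /api/chat endpoint instead of relying on the server default. The sketch below is a minimal illustration, not the runner used for the published rows. The helper names build_chat_request and send_chat are hypothetical, the URL assumes Ollama's default local port, and 8192 simply mirrors the num_ctx reported in the benchmark notes.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumption: default local Ollama address
MODEL_TAG = "alibayram/Trendyol-LLM-Asure-12B:latest"


def build_chat_request(prompt: str, num_ctx: int = 8192) -> dict:
    """Build an /api/chat payload with the context window set explicitly."""
    return {
        "model": MODEL_TAG,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a token stream
        "options": {"num_ctx": num_ctx, "temperature": 0},
    }


def send_chat(payload: dict) -> dict:
    """POST the payload to a running Ollama server and return the parsed reply."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling send_chat(build_chat_request("Merhaba")) needs a running Ollama server with the tag already pulled. build_chat_request on its own has no side effects, so the payload can be logged alongside the results.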
Hardware guidance
The Q4_K_M GGUF is 7.3GB on disk. Plan for roughly 10GB+ of VRAM for normal chat and more headroom as context grows. The 131K advertised context is useful for long inputs, but high-context serving should be profiled because KV cache, batch size, and image inputs can dominate memory.
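To see why context length can dominate memory, a back-of-the-envelope KV cache estimate helps. The sketch below uses the standard full-attention formula (keys plus values, per layer, per KV head, per token). The architecture numbers in the example call are placeholders for illustration only, not values taken from this page. Gemma 3 family models interleave sliding-window attention layers, and runtimes may quantize the cache, so treat the result as an upper bound and profile the real deployment.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """Upper-bound KV cache size, assuming full attention in every layer.

    The leading factor of 2 covers the separate key and value tensors.
    bytes_per_value=2 corresponds to a 16-bit cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / (1024 ** 3)


# Placeholder architecture values (assumptions); read the real ones from the
# model's configuration before trusting the estimate.
example = kv_cache_bytes(n_layers=48, n_kv_heads=8, head_dim=256, context_tokens=8192)
print(f"{to_gib(example):.2f} GiB of KV cache at 8K context")
```

Because the estimate is linear in context_tokens, multiplying the context by 16 multiplies the cache by 16, which is why long-context serving needs its own profiling pass.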
What breaks first
The first failure mode is context truncation: use a fixed num_ctx for benchmark runs. The second is over-reading the model as a general world-knowledge system; its own card says world knowledge is intentionally limited. Vision capability is part of the base model, but this TurkishMMLU run is text-only.
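A cheap guard against silent truncation is to compare the prompt token count the runtime reports with the configured window. The heuristic below is a sketch under assumptions: it expects the prompt_eval_count field that Ollama returns in non-streaming replies, and the function name looks_truncated is hypothetical. Prompt caching can lower the reported count, so a False result is not proof that the full prompt was seen.

```python
def looks_truncated(prompt_eval_count: int, max_new_tokens: int, num_ctx: int) -> bool:
    """Flag a reply whose evaluated prompt leaves no room for the generation budget.

    A prompt that was cut to fit tends to report a count at or near num_ctx,
    so a count this close to the limit is worth inspecting by hand.
    """
    return prompt_eval_count + max_new_tokens >= num_ctx
```

For example, a 5-shot prompt that reports 2040 prompt tokens with an 8-token answer budget is flagged at num_ctx 2048 but not at 8192.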
Runtime recommendation
Use Ollama for quick local text runs and llama.cpp or vLLM when you need tighter control over context, batching, or production serving. For reproducible quality runs, pin runtime version, quant, hardware, num_ctx, and publish the raw log.
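Pinning the run can be as simple as writing a small manifest next to the raw log. The sketch below shows one possible shape. The RunManifest class, its field names, and the log path are hypothetical; the example values are copied from the benchmark notes on this page.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunManifest:
    """Everything needed to label a benchmark row unambiguously."""
    model_tag: str
    quant: str
    runtime: str
    hardware: str
    num_ctx: int
    temperature: float
    raw_log_path: str  # hypothetical location of the saved stdout/stderr log


def manifest_json(manifest: RunManifest) -> str:
    """Serialize the manifest with stable key order so diffs stay readable."""
    return json.dumps(asdict(manifest), indent=2, sort_keys=True)


manifest = RunManifest(
    model_tag="alibayram/Trendyol-LLM-Asure-12B:latest",
    quant="Q4_K_M",
    runtime="ollama-0.24.0",
    hardware="RTX 5080 16GB",
    num_ctx=8192,
    temperature=0.0,
    raw_log_path="logs/run.log",
)
print(manifest_json(manifest))
```

Publishing this file with the log makes it obvious when two scores differ only because the quant or context window changed.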
Common beginner mistakes
Do not benchmark the model with Ollama's default 2048 context. Do not compare this Q4_K_M local run to BF16 vendor claims without labeling the quant. Do not treat the multimodal claim as measured by TurkishMMLU; this benchmark covers text-only Turkish multiple-choice reasoning.
Strengths
- Strong domestic business-workflow positioning
- Gemma 3 multimodal lineage with Turkish and English coverage
- Multiple local RTX 5080 measurements are now available
Weaknesses
- The benchmarked Ollama summary reports quantization as unknown
- 12B class is slower than compact 2B-9B Turkish models
- Vision quality needs a separate multimodal benchmark, not a text TPS row
Prompting kit
Tested patterns for getting the most out of Trendyol LLM Asure 12B locally. Local models are pickier about prompt structure than cloud models — what works on Claude or GPT-5 often fails here.
Quirks to know
- Gemma-style <start_of_turn>/<end_of_turn> chat template
- Pass num_ctx explicitly for benchmark prompts
- The model is tuned for concise business-task responses, not broad trivia
Chat template
Ollama injects the system prompt into the first user turn and uses Gemma turn markers.
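For runtimes that take a raw prompt string, the layout described above can be approximated by hand. This is a single-turn sketch under assumptions: the function name render_gemma_prompt is hypothetical, and the blank-line separator between system prompt and user message is a guess. The template bundled with the Ollama tag is authoritative for exact whitespace, and the tokenizer normally adds the beginning-of-sequence token itself.

```python
def render_gemma_prompt(user_message: str, system_prompt: str = "") -> str:
    """Render one user turn with Gemma turn markers and open the model turn."""
    # Assumption: the system prompt is joined to the user message by a blank line.
    first_turn = f"{system_prompt}\n\n{user_message}" if system_prompt else user_message
    return (
        f"<start_of_turn>user\n{first_turn}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )
```

The string ends with an open model turn so that generation continues as the assistant reply.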
Tool calling
No native tool-calling format was advertised or tested for this local benchmark.
Sampler settings
- temperature: 0
- top_p: 1
Quality benchmarks use deterministic generation with max_tokens=8 and letter parsing.
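Letter parsing for a short multiple-choice completion can be sketched as follows. This is not the scorer used for the published rows. It assumes options are labeled with uppercase A to E and takes the first standalone letter, so a completion that opens with an English article such as "A good answer..." would be misread. The name parse_letter is hypothetical.

```python
import re
from typing import Optional

# Assumption: answer options are labeled A-E; adjust the class for other formats.
_LETTER = re.compile(r"\b([A-E])\b")


def parse_letter(completion: str) -> Optional[str]:
    """Return the first standalone option letter in the completion, or None."""
    match = _LETTER.search(completion.strip())
    return match.group(1) if match else None
```

The word boundaries keep the capital C in "Cevap: C" from matching at the start of the word, so only the trailing standalone letter is returned. Completions with no standalone letter return None and can be counted as wrong or logged for review.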
Reviewed quality benchmarks
First-party rows were run by RunLocalAI; reviewed community rows are labeled in the data. Every row links to the raw test-run log.
| Benchmark | Quant | Runtime / Hardware | Score | Raw log |
|---|---|---|---|---|
| MBPP+ (tested 2026-05-27) | Q4_K_M | ollama-0.24.0 / rtx-5080 | 71.7/100 | Gist |
| HumanEval+ (tested 2026-05-27) | Q4_K_M | ollama-0.24.0 / rtx-5080 | 69.5/100 | Gist |
| TurkishMMLU (Generative) (tested 2026-05-27) | Q4_K_M | ollama-0.24.0 / rtx-5080 | 58.9/100 | Gist |
Q4_K_M note: First-party measured MBPP+ run. Generation used Ollama's OpenAI-compatible chat endpoint at temperature 0 and num_ctx 8192. Scoring used official EvalPlus 0.3.1 under WSL; public Gist includes metadata, generation log, official scorer log, sanitized samples, and raw model completions.
Q4_K_M note: First-party measured HumanEval+ run. Generation used Ollama's OpenAI-compatible chat endpoint at temperature 0 and num_ctx 8192. Scoring used official EvalPlus 0.3.1 under WSL; public Gist includes metadata, generation log, official scorer log, sanitized samples, and raw model completions.
Q4_K_M note: First-party text-only TurkishMMLU generative run on local Ollama tag alibayram/Trendyol-LLM-Asure-12B:latest. Source model card: alibayram/Trendyol-LLM-Asure-12B; local GGUF source: alibayram/Trendyol-LLM-Asure-12B-Q4_K_M-GGUF. Hardware: RTX 5080 16GB, NVIDIA driver 595.97.
Want to verify? Every row links to its Gist with full stdout and stderr of the run. The runner script is in the public repo (scripts/run-humaneval-plus.ts) — reproducible end-to-end. Browse all coding scores at /benchmarks/coding.
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| GGUF_UNKNOWN | 7.3 GB | 10 GB |
Get the model
Ollama
One-line install
ollama run alibayram/Trendyol-LLM-Asure-12B:latest
HuggingFace
Original weights
Source repository. The weights there are unquantized, so you need to quantize them yourself before running them in a GGUF runtime.
Benchmarks
Real measurements on real hardware. Numbers ship with the runner version, quant, and date.
| Hardware | Provenance | Quant | Ctx | Tokens / sec | VRAM | TTFT | Date |
|---|---|---|---|---|---|---|---|
| NVIDIA GeForce RTX 5080 | EditorialM | Q4_K_M | 4K | 82.0 tok/s | — | 136 ms | May 28, 2026 |
| NVIDIA GeForce RTX 5080 | EditorialM | unknown | 2K | 79.1 tok/s | — | — | May 28, 2026 |
| NVIDIA GeForce RTX 5080 | EditorialM | Q4_K_M | 8K | 61.5 tok/s | — | 323 ms | May 27, 2026 |
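If you want a comparable tokens-per-second figure from your own run, Ollama's non-streaming replies include eval_count (generated tokens) and eval_duration (nanoseconds). The sketch below converts them to a decode rate. How the rows above were computed is not stated here, so treat this as one reasonable method rather than the page's exact procedure. The function name is hypothetical.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Decode throughput from Ollama's eval_count and eval_duration fields."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    # eval_duration is reported in nanoseconds, so convert to seconds first.
    return eval_count / (eval_duration_ns / 1e9)
```

This measures generation only. Prompt processing is reported separately through prompt_eval_count and prompt_eval_duration, and it matters more as context grows.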
Source: huggingface.co/Trendyol/Trendyol-LLM-Asure-12B
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.