RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts: there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend.

© 2026 runlocalai.co · Independently operated
gemma · 11.8B parameters · Commercial OK · Multimodal · Reviewed May 2026

Trendyol LLM Asure 12B

License: Gemma·Context: 131,072 tokens

Overview

Trendyol LLM Asure 12B is a Gemma 3-based multimodal instruct model for Turkish and English business workflows. The public Ollama build used in local testing is the alibayram GGUF distribution.

How to run it

The locally tested route is Ollama with the alibayram/Trendyol-LLM-Asure-12B:latest tag, which points at the Q4_K_M GGUF mirror. On a 16GB RTX 5080 it loads comfortably for text-only chat and TurkishMMLU-style evaluation. Set num_ctx explicitly, because Ollama's default context window can silently truncate 5-shot benchmark prompts.
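A minimal sketch of what "set num_ctx explicitly" can look like when calling Ollama's native /api/chat endpoint from Python. The server address is an assumption (Ollama's usual local default), the function names are ours, and 8192 mirrors the num_ctx reported for the coding runs on this page; the quality runs here used the OpenAI-compatible endpoint instead, so treat this as an illustration, not the benchmark harness.

```python
import json
import urllib.request

MODEL_TAG = "alibayram/Trendyol-LLM-Asure-12B:latest"  # tag named on this page
OLLAMA_URL = "http://localhost:11434/api/chat"  # assumption: default local Ollama address


def build_chat_request(prompt: str, num_ctx: int = 8192) -> dict:
    """Build an /api/chat payload that pins the context window per request."""
    return {
        "model": MODEL_TAG,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        # num_ctx travels in "options"; leaving it out falls back to the runtime default.
        "options": {"num_ctx": num_ctx, "temperature": 0},
    }


def send(payload: dict) -> dict:
    """POST the payload to a running Ollama server and return the parsed JSON reply."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `send(build_chat_request("Merhaba"))` returns the reply as a dict; the model's text is under `message.content`.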

Hardware guidance

The Q4_K_M GGUF is 7.3GB on disk. Plan for roughly 10GB+ of VRAM for normal chat and more headroom as context grows. The 131K advertised context is useful for long inputs, but high-context serving should be profiled because KV cache, batch size, and image inputs can dominate memory.
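To see why context length can dominate memory, here is a rough KV cache estimate. The layer count, KV head count, and head dimension below are illustrative assumptions, not values taken from this page; read the real ones from the model's config before relying on the numbers. The formula also assumes full attention in every layer and a 16-bit cache, so it is an upper bound; runtimes that use sliding-window layers or a quantized KV cache will need less.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   n_ctx: int, bytes_per_elem: int = 2) -> int:
    """Upper-bound KV cache size for one sequence.

    The leading 2 counts one key tensor and one value tensor per layer.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


# ASSUMPTION: placeholder architecture values for illustration only.
ASSUMED_ARCH = dict(n_layers=48, n_kv_heads=8, head_dim=256)

for n_ctx in (4096, 8192, 131072):
    gib = kv_cache_bytes(n_ctx=n_ctx, **ASSUMED_ARCH) / 2**30
    print(f"{n_ctx:>7} tokens -> {gib:.1f} GiB KV cache (upper bound)")
```

Under these placeholder values the estimate scales linearly with context, which is the point of the guidance above: the 7.3GB weight file is a fixed cost, while the cache grows with every token you keep.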

What breaks first

The first failure mode is context truncation: use a fixed num_ctx for benchmark runs. The second is over-reading the model as a general world-knowledge system; its own card says world knowledge is intentionally limited. Vision capability is part of the base model, but this TurkishMMLU run is text-only.
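One way to catch the first failure mode is to inspect the token count the runtime reports for the prompt. The sketch below assumes the `prompt_eval_count` field that Ollama includes in non-streaming responses; the field may be missing in some cases (for example when the prompt is served from cache), so the helper returns "unknown" instead of guessing. The function name and the three labels are ours.

```python
def check_truncation(response: dict, num_ctx: int, max_new_tokens: int = 8) -> str:
    """Classify a response as 'ok', 'suspect', or 'unknown' for prompt truncation.

    A prompt that was cut to fit the window tends to report a token count
    that, together with the generation budget, fills num_ctx completely.
    """
    prompt_tokens = response.get("prompt_eval_count")
    if prompt_tokens is None:
        return "unknown"
    if prompt_tokens + max_new_tokens >= num_ctx:
        return "suspect"
    return "ok"
```

A "suspect" result is a hint, not proof: a prompt can legitimately sit close to the limit. For benchmark runs it is still worth rerunning any suspect item with a larger num_ctx and comparing the answers.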

Runtime recommendation

Use Ollama for quick local text runs and llama.cpp or vLLM when you need tighter control over context, batching, or production serving. For reproducible quality runs, pin runtime version, quant, hardware, num_ctx, and publish the raw log.
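The pinning advice can be captured as a small run manifest written next to the raw log. This is a sketch with a hypothetical `RunManifest` structure; the field values are the ones reported on this page for the Q4_K_M runs, and the `raw_log` entry is a placeholder you would replace with the location of your published log.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunManifest:
    """Everything a reader needs to reproduce one quality run."""
    model_tag: str
    quant: str
    runtime: str
    hardware: str
    num_ctx: int
    temperature: float
    raw_log: str  # path or URL of the published raw log


def to_json(manifest: RunManifest) -> str:
    """Serialize with sorted keys so manifests diff cleanly between runs."""
    return json.dumps(asdict(manifest), indent=2, sort_keys=True)


manifest = RunManifest(
    model_tag="alibayram/Trendyol-LLM-Asure-12B:latest",
    quant="Q4_K_M",
    runtime="ollama-0.24.0",
    hardware="rtx-5080",
    num_ctx=8192,
    temperature=0.0,
    raw_log="PLACEHOLDER: link to your published log",
)
print(to_json(manifest))
```

Freezing the dataclass keeps a manifest from being edited after the run starts, which is the property that makes two published runs comparable.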

Common beginner mistakes

Do not benchmark the model with Ollama's default 2048 context. Do not compare this Q4_K_M local run to BF16 vendor claims without labeling the quant. Do not treat the multimodal claim as measured by TurkishMMLU; this benchmark covers text-only Turkish multiple-choice reasoning.

Family & lineage

How this model relates to others in its lineage. Family members share architecture and training-data roots; parent / children edges record direct distillation or fine-tune relationships.

Family siblings (gemma-3)
  • Gemma 3 1B · 1B · Edge
  • Gemma 3 4B · 4B · Edge
  • Trendyol LLM Asure 12B · 11.8B · You are here
  • Gemma 3 12B · 12B · Consumer
  • Gemma 3 27B · 27B · Workstation

Strengths

  • Strong domestic business-workflow positioning
  • Gemma 3 multimodal lineage with Turkish and English coverage
  • Multiple local RTX 5080 measurements are now available

Weaknesses

  • The benchmarked Ollama summary reports quantization as unknown
  • 12B class is slower than compact 2B-9B Turkish models
  • Vision quality needs a separate multimodal benchmark, not a text TPS row

Prompting kit

✓ Tested by runlocalai on 2026-05-27 · rtx-5080

Tested patterns for getting the most out of Trendyol LLM Asure 12B locally. Local models are pickier about prompt structure than cloud models — what works on Claude or GPT-5 often fails here.

Quirks to know

  • Gemma-style <start_of_turn>/<end_of_turn> chat template
  • Pass num_ctx explicitly for benchmark prompts
  • The model is tuned for concise business-task responses, not broad trivia

Chat template

Gemma 3

Ollama injects the system prompt into the first user turn and uses Gemma turn markers.
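For readers calling a raw completion endpoint, the behavior described above can be sketched as follows. This follows the published Gemma turn format as we understand it (a `user` or `model` role after each `<start_of_turn>`); the exact whitespace and any beginning-of-sequence token are assumptions, so compare against the template bundled with the Ollama tag before relying on it. The function name is ours.

```python
from typing import Optional


def render_gemma_prompt(messages: list, system: Optional[str] = None) -> str:
    """Render chat messages with Gemma turn markers.

    Gemma has no separate system role, so the system text is folded into
    the first user turn. The trailing open 'model' turn asks for a reply.
    """
    pending_system = system
    parts = []
    for message in messages:
        role = "model" if message["role"] == "assistant" else "user"
        content = message["content"]
        if role == "user" and pending_system:
            content = f"{pending_system}\n\n{content}"
            pending_system = None  # inject only once
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)
```

When you use Ollama's chat endpoints this rendering happens for you; the sketch only matters if you bypass the chat template, for example with llama.cpp's raw completion mode.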

Tool calling

✗ Not supported

No native tool-calling format was advertised or tested for this local benchmark.

Sampler settings

  • temperature: 0
  • top_p: 1

Quality benchmarks use deterministic generation with max_tokens=8 and letter parsing.
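Letter parsing on an 8-token completion can be as simple as taking the first standalone option letter. The sketch below is not the site's scorer; it assumes options labeled A through E and uppercase answers, and it returns None when no letter is found so unparseable completions can be counted separately from wrong ones.

```python
import re
from typing import Optional

# ASSUMPTION: options are labeled A-E; adjust the class for other formats.
_STANDALONE_LETTER = re.compile(r"\b([A-E])\b")


def parse_choice(completion: str) -> Optional[str]:
    """Return the first standalone option letter in a short completion."""
    match = _STANDALONE_LETTER.search(completion.strip())
    return match.group(1) if match else None
```

For example, `parse_choice("Cevap: C")` yields `"C"`, while a completion such as `"Bilmiyorum"` yields `None` because its capital B is part of a longer word.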


Reviewed quality benchmarks

First-party rows were run by RunLocalAI; reviewed community rows are labeled in the data. Every row links to the raw test-run log.

Benchmark                  Tested       Quant    Runtime / Hardware         Score      Raw log
MBPP+                      2026-05-27   Q4_K_M   ollama-0.24.0 / rtx-5080   71.7/100   Gist
HumanEval+                 2026-05-27   Q4_K_M   ollama-0.24.0 / rtx-5080   69.5/100   Gist
TurkishMMLU (Generative)   2026-05-27   Q4_K_M   ollama-0.24.0 / rtx-5080   58.9/100   Gist

Q4_K_M note (MBPP+): First-party measured MBPP+ run. Generation used Ollama's OpenAI-compatible chat endpoint at temperature 0 and num_ctx 8192. Scoring used official EvalPlus 0.3.1 under WSL; public Gist includes metadata, generation log, official scorer log, sanitized samples, and raw model completions.

Q4_K_M note (HumanEval+): First-party measured HumanEval+ run. Generation used Ollama's OpenAI-compatible chat endpoint at temperature 0 and num_ctx 8192. Scoring used official EvalPlus 0.3.1 under WSL; public Gist includes metadata, generation log, official scorer log, sanitized samples, and raw model completions.

Q4_K_M note (TurkishMMLU): First-party text-only TurkishMMLU generative run on local Ollama tag alibayram/Trendyol-LLM-Asure-12B:latest. Source model card: alibayram/Trendyol-LLM-Asure-12B; local GGUF source: alibayram/Trendyol-LLM-Asure-12B-Q4_K_M-GGUF. Hardware: RTX 5080 16GB, NVIDIA driver 595.97.

Want to verify? Every row links to its Gist with full stdout and stderr of the run. The runner script is in the public repo (scripts/run-humaneval-plus.ts) — reproducible end-to-end. Browse all coding scores at /benchmarks/coding.

Quantization variants

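Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point. The catalog lists this build's quantization as GGUF_UNKNOWN; its 7.3 GB file size matches the Q4_K_M GGUF described under Hardware guidance.

Quantization    File size   VRAM required
GGUF_UNKNOWN    7.3 GB      10 GB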
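A quick sanity check on an unlabeled GGUF is to divide file size by parameter count. The sketch below uses the two figures from this page (7.3 GB and 11.8B parameters) and assumes GB means 10^9 bytes. The result is only a rough average: the file also carries metadata and some tensors stored at higher precision, so it cannot identify a quantization on its own.

```python
def bits_per_weight(file_size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter, assuming 1 GB = 10**9 bytes."""
    total_bits = file_size_gb * 1e9 * 8
    return total_bits / (params_billion * 1e9)


bpw = bits_per_weight(7.3, 11.8)
print(f"{bpw:.2f} bits per weight")
```

An average a little under 5 bits per weight is in the range you would expect from a 4-bit K-quant and far from an 8-bit or 16-bit file, which is consistent with the Q4_K_M label used elsewhere on this page.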

Get the model

Ollama

One-line install

ollama run alibayram/Trendyol-LLM-Asure-12B:latest

HuggingFace

Original weights

huggingface.co/Trendyol/Trendyol-LLM-Asure-12B

Source repository with the original weights. You need to quantize them yourself to get a GGUF build.

Benchmarks

Real measurements on real hardware. Numbers ship with the runner version, quant, and date.

3 runs on record

Hardware                  Provenance      Quant     Ctx   Tokens / sec   VRAM   TTFT     Date
NVIDIA GeForce RTX 5080   ✓ Editorial M   Q4_K_M    4K    82.0 tok/s     n/a    136 ms   May 28, 2026
NVIDIA GeForce RTX 5080   ✓ Editorial M   unknown   2K    79.1 tok/s     n/a    n/a      May 28, 2026
NVIDIA GeForce RTX 5080   ✓ Editorial M   Q4_K_M    8K    61.5 tok/s     n/a    323 ms   May 27, 2026

§How we measure
Every benchmark on this site ships with the runner version, driver version, prompt, and date. Predictions are graded with confidence badges (M / C / ~ / E) so you know which numbers to trust for purchasing decisions.
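If you want to produce a comparable tokens-per-second figure from your own run, Ollama's non-streaming responses report `eval_count` (generated tokens) and `eval_duration` (nanoseconds). The sketch below derives decode throughput from those two fields. It is an assumption that this matches how the rows above were computed; the page does not state the exact formula, and the sample values in the usage note are made up.

```python
def decode_tokens_per_second(response: dict) -> float:
    """Decode throughput from Ollama timing fields.

    eval_count is the number of generated tokens; eval_duration is the
    time spent generating them, in nanoseconds.
    """
    seconds = response["eval_duration"] / 1e9
    if seconds <= 0:
        raise ValueError("eval_duration must be positive")
    return response["eval_count"] / seconds
```

For a hypothetical response with `eval_count` 256 and `eval_duration` 4,000,000,000, the helper returns 64.0 tokens per second. Prompt processing is timed separately, so this figure does not include time to first token.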
Hardware that runs this

Cards with enough VRAM for at least one quantization of Trendyol LLM Asure 12B.

  • NVIDIA GB200 NVL72 · 13824GB · nvidia
  • AMD Instinct MI355X · 288GB · amd
  • AMD Instinct MI325X · 256GB · amd
  • AMD Instinct MI300X · 192GB · amd
  • NVIDIA B200 · 192GB · nvidia
  • NVIDIA H100 NVL · 188GB · nvidia
  • NVIDIA H200 · 141GB · nvidia
  • NVIDIA H200 NVL (PCIe) · 141GB · nvidia

Frequently asked

What's the minimum VRAM to run Trendyol LLM Asure 12B?

10GB of VRAM is enough to run Trendyol LLM Asure 12B with the 7.3 GB GGUF build (cataloged as GGUF_UNKNOWN; the tested file is Q4_K_M). Higher-quality quantizations and longer contexts need more.

Can I use Trendyol LLM Asure 12B commercially?

Yes. Trendyol LLM Asure 12B ships under the Gemma license, which permits commercial use. Always read the license text before deployment.

What's the context length of Trendyol LLM Asure 12B?

Trendyol LLM Asure 12B supports a context window of 131,072 tokens (about 131K).

How do I install Trendyol LLM Asure 12B with Ollama?

Run `ollama pull alibayram/Trendyol-LLM-Asure-12B:latest` to download, then `ollama run alibayram/Trendyol-LLM-Asure-12B:latest` to start a chat session. The default quantization is Q4_K_M.

Does Trendyol LLM Asure 12B support images?

Yes — Trendyol LLM Asure 12B is multimodal and accepts text + vision inputs. Vision support requires a runner that handles its image-conditioning architecture.

Source: huggingface.co/Trendyol/Trendyol-LLM-Asure-12B

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.

Compare alternatives

Models worth comparing

Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.

Same tier
Models in the same parameter band as this one
  • FLUX.1 [dev]
    other · 12B
    unrated
  • FLUX.1 [schnell]
    other · 12B
    unrated
  • DeepSeek R1 Distill Qwen 14B
    deepseek · 14B
    unrated
  • Llama 3.2 11B Vision Instruct
    llama · 11B
    unrated
Step up
More capable — bigger memory footprint
  • DeepSeek V2 Lite Chat
    deepseek · 15.7B
    unrated
  • Granite 3 MoE (3B active)
    granite · 16B
    unrated
Step down
Smaller — faster, runs on weaker hardware
  • DeepSeek R1 Distill Qwen 7B
    deepseek · 7B
    unrated
  • Llama 3.1 Nemotron Nano 8B
    llama · 8B
    unrated