Fits comfortably

Running Trendyol LLM Asure 12B on NVIDIA GeForce RTX 3080 16GB (Mobile)

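The NVIDIA GeForce RTX 3080 16GB (Mobile) runs Trendyol LLM Asure 12B comfortably with the listed GGUF build (quantization label GGUF_UNKNOWN): it needs about 10 GB of the card's 16 GB of VRAM, leaving roughly 6 GB of headroom for context.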

By Eruo Fredoline · Latest benchmark evidence: Jun 2, 2026

Model size

11.8B params

Trendyol LLM Asure 12B

Memory available

16 GB

NVIDIA GeForce RTX 3080 16GB (Mobile)

Recommended quant

GGUF_UNKNOWN

Highest quality that fits

Quick start with Ollama

1. Download the model

ollama pull alibayram/Trendyol-LLM-Asure-12B:latest

2. Run

ollama run alibayram/Trendyol-LLM-Asure-12B:latest

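Ollama models typically default to Q4_K_M quantization. To use a different quant, pull a tag that names it, for example alibayram/Trendyol-LLM-Asure-12B:latest-q5_K_M, provided the publisher offers that tag. Check the model's tag list before pulling.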
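Once the model is pulled, it can also be queried from a script instead of the interactive prompt. The following is a minimal sketch, assuming the Ollama server is running locally on its default port (11434) and using its /api/generate endpoint. The helper names build_generate_request and generate are illustrative, not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag taken from the quick start commands.
MODEL = "alibayram/Trendyol-LLM-Asure-12B:latest"


def build_generate_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Calling generate("Merhaba, nasılsın?") returns the model's reply as a string. It raises a URLError if the server is not running.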

Variants and what fits

Quantization	File size	VRAM required	Fits on NVIDIA GeForce RTX 3080 16GB (Mobile)?
GGUF_UNKNOWN	7.3 GB	10 GB	Yes
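The fit verdict in the table comes down to a subtraction: VRAM available minus VRAM required. A minimal sketch of that check, using the figures from the table; the function names are illustrative.

```python
def vram_headroom_gb(available_gb: float, required_gb: float) -> float:
    """Return the VRAM left after loading the model (negative if it does not fit)."""
    return available_gb - required_gb


def fits(available_gb: float, required_gb: float) -> bool:
    """A variant fits when its VRAM requirement does not exceed what the GPU has."""
    return vram_headroom_gb(available_gb, required_gb) >= 0


# Figures from the table: 16 GB card, 10 GB required by the listed variant.
headroom = vram_headroom_gb(16, 10)
print(f"fits: {fits(16, 10)}, headroom: {headroom} GB")
```

The headroom is what remains for context. How many tokens it covers depends on the model architecture and runtime settings, which this page does not list.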

Real benchmarks

Tool	Quant	Context (tokens)	Speed (tok/s)	VRAM used	Date	Evidence	Export
—	Q4_K_M	4,096	43.4	—	Jun 2, 2026	Measured here (operator: fred-oline)	Detail / Source JSON
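To reproduce a tok/s figure like the one in the benchmark table with Ollama, one option is to read the eval_count (tokens generated) and eval_duration (nanoseconds) fields that a non-streaming /api/generate response reports. The sketch below assumes that approach, which is not necessarily how the figure above was measured. The sample values are hypothetical, chosen only to illustrate the arithmetic.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Convert Ollama's eval_count and eval_duration (nanoseconds) to tokens per second."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    return eval_count / (eval_duration_ns / 1e9)


# Hypothetical response fields, NOT a real measurement:
# 434 tokens generated in 10 seconds (10_000_000_000 ns).
sample = {"eval_count": 434, "eval_duration": 10_000_000_000}
print(round(tokens_per_second(sample["eval_count"], sample["eval_duration"]), 1))
```

This measures generation speed only. Prompt processing is reported separately in the same response (prompt_eval_count and prompt_eval_duration).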

Frequently asked

Can NVIDIA GeForce RTX 3080 16GB (Mobile) run Trendyol LLM Asure 12B?

Yes. The listed GGUF build (quantization label GGUF_UNKNOWN) needs about 10 GB of VRAM, so it runs comfortably on the card's 16 GB with roughly 6 GB of headroom for context.

What quantization should I use?

GGUF_UNKNOWN, the only variant listed here, is the recommended build of Trendyol LLM Asure 12B for 16 GB of VRAM. Lower-bit quants are smaller and leave more headroom, but lose some output quality.

How fast will it be?

We measured 43.4 tok/s on this combination at Q4_K_M with a 4,096-token context (Jun 2, 2026).
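To put the measured rate in practical terms, the time to generate a reply is roughly the token count divided by tok/s. A rough sketch, assuming the 43.4 tok/s figure holds for the whole response; it ignores prompt processing time, and real throughput may vary with context length.

```python
MEASURED_TOK_PER_S = 43.4  # benchmark figure reported on this page


def generation_seconds(num_tokens: int, tok_per_s: float = MEASURED_TOK_PER_S) -> float:
    """Estimate seconds needed to generate num_tokens at a steady rate."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return num_tokens / tok_per_s


# Print estimates for a few illustrative reply lengths.
for n in (100, 500, 1000):
    print(f"{n} tokens: ~{generation_seconds(n):.1f} s")
```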

Reviewed by RunLocalAI Editorial. See our editorial policy.