Fits comfortably
Running YTU Turkish Gemma 9B v0.1 on NVIDIA GeForce RTX 3080 16GB (Mobile)
NVIDIA GeForce RTX 3080 16GB (Mobile) runs YTU Turkish Gemma 9B v0.1 comfortably at Q4_K_M with 8 GB of headroom for context.
Model size: 9.2B params (YTU Turkish Gemma 9B v0.1)
Memory available: 16 GB VRAM
Recommended quant: Q4_K_M (highest quality that fits)
Quick start with Ollama
1. Install
ollama pull alibayram/turkish-gemma-9b-v0.1:latest
2. Run
ollama run alibayram/turkish-gemma-9b-v0.1:latest
The default quant in Ollama is Q4_K_M. To use a different quant, append it to the tag, for example alibayram/turkish-gemma-9b-v0.1:latest-q5_K_M, provided the repository publishes that tag.
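Once the model is pulled, it can also be called from code rather than the interactive prompt. The sketch below assumes an Ollama server running locally on its default port (11434) and uses the `/api/generate` endpoint with only the Python standard library. The helper names `build_payload` and `generate`, and the sample prompt, are hypothetical and chosen for illustration.

```python
import json
import urllib.request

MODEL = "alibayram/turkish-gemma-9b-v0.1:latest"
# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, model: str = MODEL) -> dict:
    """Build a non-streaming request body for Ollama's generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str, url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """POST the prompt to the local Ollama server and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Example call (requires a running Ollama server with the model pulled):
# print(generate("Merhaba! Kendini kısaca tanıt."))
```

Setting `"stream": False` returns a single JSON object instead of a stream of chunks, which keeps the example short at the cost of waiting for the full completion.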
Variants and what fits
| Quantization | File size | VRAM required | Fits on NVIDIA GeForce RTX 3080 16GB (Mobile)? |
|---|---|---|---|
| Q4_K_M | 5.8 GB | 8 GB | Yes |
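The fit verdict in the table comes down to simple arithmetic: the VRAM a variant requires subtracted from the VRAM the GPU offers. A minimal Python sketch of that check follows. The function names and the 2 GB "comfortable" margin are assumptions made for illustration, not RunLocalAI's actual method.

```python
def headroom_gb(vram_available_gb: float, vram_required_gb: float) -> float:
    """Return the VRAM left over for context after the model is loaded."""
    return vram_available_gb - vram_required_gb


def fit_verdict(
    vram_available_gb: float,
    vram_required_gb: float,
    comfortable_margin_gb: float = 2.0,  # assumed threshold, not from the source
) -> str:
    """Classify a model/GPU pairing by how much VRAM remains."""
    spare = headroom_gb(vram_available_gb, vram_required_gb)
    if spare < 0:
        return "Does not fit"
    if spare < comfortable_margin_gb:
        return "Fits tightly"
    return "Fits comfortably"


# Q4_K_M row from the table: 8 GB required on a 16 GB GPU.
print(headroom_gb(16.0, 8.0))   # 8.0
print(fit_verdict(16.0, 8.0))   # Fits comfortably
```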
Frequently asked
Can NVIDIA GeForce RTX 3080 16GB (Mobile) run YTU Turkish Gemma 9B v0.1?
Yes. It runs comfortably at Q4_K_M, which requires 8 GB of VRAM and leaves 8 GB of headroom for context.
What quantization should I use?
Q4_K_M is the highest-quality variant of YTU Turkish Gemma 9B v0.1 that fits in 16 GB VRAM. Lower-bit quants will be smaller but lose some quality.
How fast will it be?
We measured 66.0 tok/s on this combination in our testing.
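To check a throughput figure like this on your own machine, tokens per second can be derived from the `eval_count` and `eval_duration` fields that Ollama includes in a non-streaming response (the duration is reported in nanoseconds). The sketch below is a minimal illustration. The function names are hypothetical, and the sample numbers are chosen to reproduce 66.0 tok/s rather than taken from a real run.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Convert Ollama's generated-token count and duration (ns) to tok/s."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    return eval_count / (eval_duration_ns / 1e9)


def seconds_for(n_tokens: int, tok_per_s: float) -> float:
    """Estimate how long generating n_tokens takes at a given throughput."""
    return n_tokens / tok_per_s


# Illustrative values: 660 tokens generated in 10 seconds.
rate = tokens_per_second(660, 10_000_000_000)
print(rate)                              # 66.0
print(round(seconds_for(500, rate), 1))  # 7.6
```

At 66.0 tok/s, a 500-token reply takes roughly 7.6 seconds of generation time, excluding model load and prompt processing.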
See also: YTU Turkish Gemma 9B v0.1, NVIDIA GeForce RTX 3080 16GB (Mobile), all benchmarks.
Reviewed by RunLocalAI Editorial. See our editorial policy.