Every benchmark in the corpus, filterable by source, scenario, and reproducibility. Newer rows show cold-start vs steady-state, P5/P95 CI, tokens-per-watt, and accuracy when those fields were captured; legacy rows are labeled as rigor pending.
| Hardware | Model | Quant | Tok/s | Rigor | Source | Date |
|---|---|---|---|---|---|---|
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Turkcell LLM 7B v1 | Q4_K_M | 85.8 | cold 85.6 tok/s; steady 85.8 tok/s; CI 85.4–86.1; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | RefinedNeuro RN TR R2 | Q4_K_M | 79.3 | cold 78.7 tok/s; steady 79.3 tok/s; CI 78.9–79.5; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | RefinedNeuro RN TR R1 | Q4_K_M | 79.9 | cold 79.1 tok/s; steady 79.9 tok/s; CI 79.5–80.8; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Qwen 3 4B | Q4_K_M | 103.7 | cold 103.1 tok/s; steady 103.7 tok/s; CI 101.3–106.1; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Qwen 3 14B | Q4_K_M | 38.3 | cold 38.8 tok/s; steady 38.3 tok/s; CI 38.2–38.4; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Qwen 2.5 7B Instruct | Q4_K_M | 80.4 | cold 80.7 tok/s; steady 80.4 tok/s; CI 79.0–81.7; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Phi-4 Reasoning 14B | Q4_K_M | 40.4 | cold 41.0 tok/s; steady 40.4 tok/s; CI 39.7–40.5; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Phi-3.5 Mini Instruct | Q4_K_M | 155.4 | cold 154.7 tok/s; steady 155.4 tok/s; CI 152.4–157.3; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Mistral Nemo 12B Instruct | Q4_K_M | 65.7 | cold 66.1 tok/s; steady 65.7 tok/s; CI 65.3–66.0; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Mistral 7B Instruct v0.3 | Q4_K_M | 89.6 | cold 90.2 tok/s; steady 89.6 tok/s; CI 87.9–91.2; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Llama 3.2 11B Vision Instruct | Q4_K_M | 67.0 | cold 67.2 tok/s; steady 67.0 tok/s; CI 67.0–67.1; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Malhajar Mistral 7B Turkish | Q4_K_M | 87.3 | cold 83.0 tok/s; steady 87.3 tok/s; CI 82.7–89.6; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Hermes 3 Llama 3.1 8B | Q4_K_M | 81.5 | cold 82.2 tok/s; steady 81.5 tok/s; CI 81.3–81.8; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Gemma 4 E4B (Effective 4B) | Q4_K_M | 78.1 | cold 79.3 tok/s; steady 78.1 tok/s; CI 77.9–78.4; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Gemma 4 E2B (Effective 2B) | Q4_K_M | 99.1 | cold 98.5 tok/s; steady 99.1 tok/s; CI 98.1–101.0; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Gemma 3 4B | Q4_K_M | 97.7 | cold 96.4 tok/s; steady 97.7 tok/s; CI 97.2–98.2; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Gemma 3 1B | Q4_K_M | 160.4 | cold 156.4 tok/s; steady 160.4 tok/s; CI 159.7–162.0; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Gemma 3 12B | Q4_K_M | 43.3 | cold 43.6 tok/s; steady 43.3 tok/s; CI 43.2–43.4; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Gemma 2 9B Instruct | Q4_K_M | 68.2 | cold 69.4 tok/s; steady 68.2 tok/s; CI 67.9–69.1; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | DeepSeek R1 Distill Qwen 7B | Q4_K_M | 80.3 | cold 80.3 tok/s; steady 80.3 tok/s; CI 79.4–81.6; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | DeepSeek Coder V2 Lite (16B) | Q4_K_M | 152.0 | cold 151.2 tok/s; steady 152.0 tok/s; CI 149.9–152.7; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | CodeGemma 7B | Q4_K_M | 80.6 | cold 80.2 tok/s; steady 80.6 tok/s; CI 79.2–81.2; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Mistral Turkish v2 (brooqs) | Q4_K_M | 106.8 | cold 100.9 tok/s; steady 106.8 tok/s; CI 105.7–107.9; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | YTU Turkish Gemma 9B v0.1 | Q4_K_M | 66.0 | cold 66.6 tok/s; steady 66.0 tok/s; CI 65.7–66.8; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Trendyol LLM Asure 12B | Q4_K_M | 43.4 | cold 43.7 tok/s; steady 43.4 tok/s; CI 43.1–43.4; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Kumru 2B | Q4_K_M | 174.2 | cold 171.9 tok/s; steady 174.2 tok/s; CI 171.7–175.6; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 3080 16GB (Mobile) | Llama 3.2 1B Instruct | Q4_K_M | 189.5 | cold 189.9 tok/s; steady 189.5 tok/s; CI 186.8–190.9; scenario: Single-stream; n=5 | Measured here | 2026-06-02 |
| NVIDIA GeForce RTX 5080 | Malhajar Mistral 7B Turkish | Q5_K_M | 130.4 | steady 130.4 tok/s; CI 129.6–130.8; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | RefinedNeuro RN TR R2 | Q4_K_M | 133.4 | steady 133.4 tok/s; CI 132.8–133.6; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | RefinedNeuro RN TR R1 | Q4_K_M | 133.6 | steady 133.6 tok/s; CI 133.1–134.0; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | Trendyol LLM Asure 12B | unknown | 79.1 | steady 79.1 tok/s; CI 78.4–79.6; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | YTU Turkish Gemma 9B v0.1 | Q4_K_M | 101.1 | steady 101.1 tok/s; CI 100.6–101.6; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | Turkcell LLM 7B v1 | Q4_K_M | 145.1 | steady 145.1 tok/s; CI 144.2–146.2; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | Mistral Turkish v2 (brooqs) | Q4_0 | 161.1 | steady 161.1 tok/s; CI 159.9–161.7; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | Kumru 2B | Q4_K_M | 443.7 | steady 443.7 tok/s; CI 399.3–452.7; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | Qwen 2.5 Coder 14B Instruct | Q4_K_M | 79.0 | cold 77.4 tok/s; steady 79.0 tok/s; CI 78.5–79.1; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | Llama 3.1 8B Instruct | Q4_K_M | 135.6 | cold 136.5 tok/s; steady 135.6 tok/s; CI 134.5–137.1; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | Trendyol LLM Asure 12B | Q4_K_M | 82.0 | cold 81.7 tok/s; steady 82.0 tok/s; CI 81.7–82.3; scenario: Single-stream; n=5 | Measured here | 2026-05-28 |
| NVIDIA GeForce RTX 5080 | Trendyol LLM Asure 12B | Q4_K_M | 61.5 | cold 61.6 tok/s; steady 61.5 tok/s; CI 61.5–61.6; scenario: Single-stream; n=3 | Measured here | 2026-05-27 |
Showing up to 200 rows, newest first. See /resources/benchmark-protocol for what the rigor pills mean and how to reproduce any row.
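
For readers who want a feel for how the cold, steady, and CI fields in the Rigor column relate to raw runs, here is a minimal Python sketch. It is not the site's actual method, which is documented at /resources/benchmark-protocol and not reproduced here. The sketch assumes the cold value is a single first-run reading, the steady value is the median of the remaining runs, and the CI bounds are the 5th and 95th percentiles with linear interpolation. The function names and the sample numbers are hypothetical and do not come from any row above.

```python
from statistics import median


def percentile(sorted_vals, q):
    """Percentile with linear interpolation; q is in [0, 100]."""
    if not sorted_vals:
        raise ValueError("need at least one sample")
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def summarize(cold_tok_s, steady_runs):
    """Collapse per-run tok/s samples into the fields shown in a Rigor cell.

    Assumption: steady = median of warm runs, CI = P5..P95 of warm runs.
    """
    runs = sorted(steady_runs)
    return {
        "cold": cold_tok_s,
        "steady": median(runs),
        "ci_low": percentile(runs, 5),
        "ci_high": percentile(runs, 95),
        "n": len(runs),
    }


# Made-up samples for illustration only (not measurements from the table).
summary = summarize(98.0, [100.0, 101.0, 102.0, 103.0, 104.0])
print(summary)
```

With only five runs, P5 and P95 sit close to the minimum and maximum, so a single slow or fast run can widen the interval noticeably.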