RUNLOCALAI · v38
Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts: there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend.

© 2026 runlocalai.co · Independently operated
✓ Editorial (Live coverage report)

Benchmark cohort coverage

The intelligence graph compares your benchmark to its cohort: benchmarks with the same model, same hardware, same quant bucket, and same context bucket. Cohorts with fewer than 5 measurements can't produce confident outlier flags. This page shows which cohorts have enough signal and which are underpowered.
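As a rough illustration of the cohort definition above, the sketch below groups benchmark records by model, hardware, quant bucket, and context bucket. The record field names (`model`, `hardware`, `quant_bits`, `context_tokens`), the bucket boundaries, and the `>8K` label are assumptions made for the example; only the `≤4K`, `4-8K`, `4-bit`, `5-bit`, and `unknown` labels appear on this page, and the site's actual implementation is not published here.

```python
from collections import defaultdict
from typing import NamedTuple, Optional

OUTLIER_THRESHOLD = 5  # cohorts below this row count cannot flag outliers


class CohortKey(NamedTuple):
    model: str
    hardware: str
    quant_bucket: str
    context_bucket: str


def context_bucket(context_tokens: int) -> str:
    # Assumed boundaries: 4096 and 8192 tokens. The ">8K" label is hypothetical.
    if context_tokens <= 4096:
        return "≤4K"
    if context_tokens <= 8192:
        return "4-8K"
    return ">8K"


def quant_bucket(quant_bits: Optional[int]) -> str:
    # A missing bit width maps to the "unknown" bucket seen in the table.
    return f"{quant_bits}-bit" if quant_bits is not None else "unknown"


def group_into_cohorts(benchmarks):
    """Group benchmark dicts into cohorts keyed by the four shared attributes."""
    cohorts = defaultdict(list)
    for b in benchmarks:
        key = CohortKey(
            b["model"],
            b["hardware"],
            quant_bucket(b.get("quant_bits")),
            context_bucket(b["context_tokens"]),
        )
        cohorts[key].append(b)
    return dict(cohorts)


def can_flag_outliers(rows) -> bool:
    """True once a cohort has reached the 5-measurement threshold."""
    return len(rows) >= OUTLIER_THRESHOLD
```

Under these assumptions, two runs of the same model on the same card at 2K and 4K context land in one cohort, while a run with an unrecorded quantization lands in a separate `unknown` cohort.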

The cohorts ranked first below are the ones where one or two more measurements would do the most to raise confidence. If you have the hardware, the “Reproduce” link on each row prefills the submission form.

Total cohorts: 39
Very-high tier: 0
Underpowered: 39
Single-runtime only: 39

Cohorts where one more measurement matters

Ranked by confidence (low and moderate first), then by proximity to the 5-row outlier-detection threshold, then by recency. A new measurement on a cohort just below the threshold tips it across the line.
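The ranking described above can be sketched as a sort on a three-part key. This is a minimal sketch, not the site's code: the `CohortRow` type and its field names are hypothetical, "proximity" is assumed to mean the smallest gap to 5 rows comes first, and "recency" is assumed to mean newest first (which matches the date order in the table).

```python
from dataclasses import dataclass
from datetime import date

OUTLIER_THRESHOLD = 5
# Lower number sorts first: low and moderate cohorts lead the list.
CONFIDENCE_ORDER = {"low": 0, "moderate": 1, "high": 2, "very-high": 3}


@dataclass
class CohortRow:  # hypothetical record shape for illustration
    name: str
    confidence: str
    rows: int
    latest: date


def rank_cohorts(cohorts):
    """Sort by confidence tier, then gap to the threshold, then newest first."""
    def sort_key(c: CohortRow):
        gap = max(OUTLIER_THRESHOLD - c.rows, 0)  # rows still needed
        return (CONFIDENCE_ORDER[c.confidence], gap, -c.latest.toordinal())
    return sorted(cohorts, key=sort_key)
```

Using a tuple key keeps the three criteria in strict priority order: recency only breaks ties between cohorts that share both a tier and a row count.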

Cohort | Quant · Context | Confidence | Rows | Reproduced | Latest | Action
llama-3.2-1b-instruct on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
kumru-2b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
trendyol-llm-asure-12b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
ytu-turkish-gemma-9b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
brooqs-mistral-turkish-v2-latest on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
codegemma-7b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
deepseek-coder-v2-lite on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
deepseek-r1-distill-qwen-7b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
gemma-2-9b-it on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
gemma-3-12b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
gemma-3-1b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
gemma-3-4b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
gemma-4-e2b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
gemma-4-e4b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
hermes-3-llama-3.1-8b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
mistral-7b-turkish on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
llama-3.2-11b-vision-instruct on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
mistral-7b-instruct-v0.3 on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
mistral-nemo-12b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
phi-3.5-mini-instruct on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
phi-4-reasoning-14b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
qwen-2.5-7b-instruct on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
qwen-3-14b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
qwen-3-4b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
rn-tr-r1 on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
rn-tr-r2 on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
turkcell-llm-7b on rtx-3080-16gb-mobile | 4-bit · ≤4K | Low | 1 | 0 | 2026-06-02 | Reproduce →
trendyol-llm-asure-12b on rtx-5080 | 4-bit · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
llama-3.1-8b-instruct on rtx-5080 | 4-bit · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
qwen-2.5-coder-14b-instruct on rtx-5080 | 4-bit · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
kumru-2b on rtx-5080 | 4-bit · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
brooqs-mistral-turkish-v2-latest on rtx-5080 | 4-bit · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
turkcell-llm-7b on rtx-5080 | 4-bit · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
ytu-turkish-gemma-9b on rtx-5080 | 4-bit · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
trendyol-llm-asure-12b on rtx-5080 | unknown · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
rn-tr-r1 on rtx-5080 | 4-bit · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
rn-tr-r2 on rtx-5080 | 4-bit · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
mistral-7b-turkish on rtx-5080 | 5-bit · ≤4K | Low | 1 | 0 | 2026-05-28 | Reproduce →
trendyol-llm-asure-12b on rtx-5080 | 4-bit · 4-8K | Low | 1 | 0 | 2026-05-27 | Reproduce →

Note on every row: Single-source cohort; nothing to compare against.

How cohort confidence is derived

Cohort labels mirror the per-benchmark confidence engine: low / moderate / high / very-high. Never percentages.

  • Very-high: ≥5 measurements and ≥2 reproductions.
  • High: ≥5 measurements but fewer than 2 reproductions.
  • Moderate: 3-4 measurements, below the outlier-detection threshold.
  • Low: 1-2 measurements, single-source. The intelligence graph cannot draw conclusions.

A cohort last touched more than 18 months ago is demoted one tier, because runtimes and drivers have drifted since then. A cohort with only one runtime represented is flagged: there is no runtime-drift signal until a second runtime lands.
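The tier rules and the staleness demotion above can be sketched as a small function. This is an illustration under stated assumptions, not the site's confidence engine: the function names are hypothetical, "18 months" is approximated as 548 days, "reproduction count low" is read as fewer than 2, and a cohort already at the low tier is assumed to stay there when demoted.

```python
from datetime import date

TIERS = ["low", "moderate", "high", "very-high"]
OUTLIER_THRESHOLD = 5
STALE_AFTER_DAYS = 548  # assumption: roughly 18 months


def base_tier(measurements: int, reproductions: int) -> str:
    """Map measurement and reproduction counts to a tier label."""
    if measurements >= OUTLIER_THRESHOLD:
        return "very-high" if reproductions >= 2 else "high"
    if measurements >= 3:
        return "moderate"
    return "low"


def cohort_confidence(measurements: int, reproductions: int,
                      last_touched: date, today: date) -> str:
    """Apply the tier rules, then demote stale cohorts by one tier."""
    tier = base_tier(measurements, reproductions)
    if (today - last_touched).days > STALE_AFTER_DAYS:
        # Assumed floor: "low" cannot be demoted further.
        tier = TIERS[max(TIERS.index(tier) - 1, 0)]
    return tier


def is_single_runtime(runtimes: set) -> bool:
    """True when no second runtime is present, so drift cannot be observed."""
    return len(runtimes) <= 1
```

Every cohort in the table has 1 row and 0 reproductions, so under these rules each one evaluates to low regardless of its date.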

Next recommended step

Editorial-curated benchmark opportunities ranked by impact.

See the public benchmark roadmap
Or: Submit a benchmark · Browse benchmarks