RUNLOCALAI · v38
RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP · Fredoline Eruo
DIR
  • Models
  • Hardware
  • Tools
  • Benchmarks
TOOLS
  • Will it run?
  • Compare hardware
  • Cost vs cloud
  • Choose my GPU
  • Quick answers
REF
  • All buyer guides
  • Methodology
  • Glossary
  • Errors KB
  • Trust
EDITOR
  • About
  • Author
  • How we make money
  • Editorial policy
  • Contact
LEGAL
  • Privacy
  • Terms
  • Sitemap
MAIL · MONTHLY DIGEST
Get monthly local-AI changes.
Monthly recap. No spam.
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other retail programs). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.co · Independently operated

Choose my GPU

Answer nine questions. We rank the GPUs in our catalog by fit for local AI on your stack — top picks, alternates, and what to avoid. Hand-written rationale per card, honest caveats, and a one-click handoff into the custom build engine.

We don’t fake tok/s numbers. Every recommendation cites a model class and a workload-realistic throughput range. Cards over your budget appear last, with explicit framing. Recommendations come from rule-based scoring, not measured benchmarks.

Tell us about your build

URL updates as you change fields.
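That URL sync can be pictured as plain query-string serialization of the form state — a minimal sketch with made-up field names, not the site's actual parameter schema:

```python
from urllib.parse import urlencode, parse_qs

def form_to_query(state: dict) -> str:
    # Skip empty fields so shared URLs stay short.
    return urlencode({k: v for k, v in state.items() if v not in (None, "")})

# Hypothetical field names for illustration only.
q = form_to_query({"budget": 300, "os": "linux", "workload": "coding-agents"})
print(q)  # budget=300&os=linux&workload=coding-agents

# The page can restore its state from the query string on load.
restored = {k: v[0] for k, v in parse_qs(q).items()}
```

Round-tripping through the URL like this is what makes a filled-in questionnaire shareable as a plain link.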

Price vs performance (budget-neutral)
5 cards plotted · 5 skipped (no price)
[Chart: effective price (log scale) vs budget-neutral performance, with your budget marked. Labeled points:]
  • NVIDIA H100 PCIe — $25,000 · Avoid
  • NVIDIA RTX 5000 PRO Blackwell 48GB — $5,499 · Avoid
  • NVIDIA RTX A6000 (Ampere) — $3,500 · Avoid
  • NVIDIA RTX 2080 Ti 22GB (China-mod) — $350 · Worth saving up for
  • Intel Arc A770 16GB — $269 · Alternate
Top pick (1)
Alternate (5)
Worth saving up for (1)
Avoid (3)
Top picks
1 card matching your stack tightly · 5 alternates
Top pick
nvidia · 24 GB
Operator-grade

NVIDIA GeForce RTX 5090 Mobile

Top pick for your setup. With your $300 budget on Linux for coding agents, the NVIDIA GeForce RTX 5090 Mobile ranks here because 24 GB hits the workable band for coding agents — fits at sensible quants without becoming the bottleneck.

Realistic model class
Qwen 2.5 Coder 32B Q4 + 32K context
Expected throughput
30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA GeForce RTX 5090 Mobile →
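The "32B Q4 + 32K context" claim above can be sanity-checked with a back-of-envelope VRAM estimate. This is our own rough rule of thumb, not the site's scoring formula, and it assumes ~4.5 bits/weight for Q4 GGUF, an fp16 KV cache, and Qwen-32B-like geometry (64 layers, 8 KV heads, head dim 128):

```python
def q4_vram_gb(params_b: float, ctx: int, layers: int, kv_heads: int = 8,
               head_dim: int = 128, overhead_gb: float = 1.0) -> float:
    """Rough VRAM footprint for a Q4-quantized model (illustrative only)."""
    weights = params_b * 1e9 * 4.5 / 8 / 1e9           # ~4.5 bits/param at Q4
    # KV cache: K and V per layer, fp16 (2 bytes), grouped-query KV heads
    kv = 2 * layers * kv_heads * head_dim * ctx * 2 / 1e9
    return weights + kv + overhead_gb

# Qwen 2.5 Coder 32B-like geometry at 32K context:
print(round(q4_vram_gb(32, 32_768, 64), 1))  # ~27.6 GB — tight on a 24 GB card
```

At fp16 KV this lands slightly over 24 GB, which is why runtimes typically quantize the KV cache or trim context to make 32B Q4 + 32K workable on a 24 GB card — consistent with "fits at sensible quants" rather than "fits with headroom".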
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most influential first).

  • VRAM × workload — weight 22% — 70 · Good
  • Budget fit — weight 18% — 50 · Acceptable
  • OS compatibility — weight 16% — 100 · Excellent
  • Skill match — weight 10% — 95 · Excellent
  • Power headroom — weight 8% — 95 · Excellent
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 95 · Excellent
  • Gaming alignment — weight 6% — 95 · Excellent
  • Perf-per-watt — weight 6% — 95 · Excellent

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.
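A minimal sketch of that weighted-sum-then-tier pipeline, reconstructed from the weights and cutoffs published on this page (the site's actual engine may differ in details):

```python
# Dimension weights as listed above; they total 100.
WEIGHTS = {
    "vram_workload": 22, "budget_fit": 18, "os_compat": 16, "skill_match": 10,
    "power_headroom": 8, "multi_gpu": 8, "thermal": 6, "gaming": 6, "perf_watt": 6,
}

def composite(scores: dict) -> float:
    # Weighted sum of 0-100 dimension scores, normalized back to 0-100.
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS) / 100

def tier(c: float, over_budget: bool = False, incompatible: bool = False) -> str:
    # Over-budget or incompatible cards fall to "avoid" regardless of composite.
    if over_budget or incompatible or c < 40:
        return "avoid"
    if c >= 75:
        return "top"
    return "alternate" if c >= 60 else "acceptable"

# The top pick's bars from this page:
rtx_5090m = {"vram_workload": 70, "budget_fit": 50, "os_compat": 100,
             "skill_match": 95, "power_headroom": 95, "multi_gpu": 80,
             "thermal": 95, "gaming": 95, "perf_watt": 95}
print(composite(rtx_5090m), tier(composite(rtx_5090m)))  # 81.0 top
```

Plugging in the top pick's bars gives a composite of 81, which lands in the ≥ 75 "top" band — matching its placement above.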

Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for NVIDIA GeForce RTX 5090 Mobile
Alternate
nvidia · 13,824 GB
Operator-grade

NVIDIA GB200 NVL72

Strong alternate. With your $300 budget on Linux for coding agents, the NVIDIA GB200 NVL72 ranks here because with 13,824 GB and ~8,000 GB/s memory bandwidth, it clears the VRAM bar for coding agents comfortably.

Realistic model class
Frontier-scale — 100B+ MoE territory
Expected throughput
Datacenter throughput regime — measure with your real workload.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA GB200 NVL72 →
How we scored this card

  • VRAM × workload — weight 22% — 100 · Excellent
  • Budget fit — weight 18% — 50 · Acceptable
  • OS compatibility — weight 16% — 100 · Excellent
  • Skill match — weight 10% — 95 · Excellent
  • Power headroom — weight 8% — 5 · Poor
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 75 · Strong
  • Gaming alignment — weight 6% — 85 · Strong
  • Perf-per-watt — weight 6% — 25 · Weak

Caveats
  • Sustained draw ~120,000 W — rack-scale datacenter power; consumer PSU and case-airflow planning do not apply here.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for NVIDIA GB200 NVL72
Alternate
nvidia · 16 GB
Operator-grade

NVIDIA GeForce RTX 4090 Mobile

Strong alternate. With your $300 budget on Linux for coding agents, the NVIDIA GeForce RTX 4090 Mobile sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class
Qwen 2.5 Coder 14B FP16, agents OK
Expected throughput
40-70 tok/s on 7B Q4; 20-35 tok/s on 13B Q4.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA GeForce RTX 4090 Mobile →
How we scored this card

  • VRAM × workload — weight 22% — 33 · Weak
  • Budget fit — weight 18% — 50 · Acceptable
  • OS compatibility — weight 16% — 100 · Excellent
  • Skill match — weight 10% — 95 · Excellent
  • Power headroom — weight 8% — 95 · Excellent
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 95 · Excellent
  • Gaming alignment — weight 6% — 95 · Excellent
  • Perf-per-watt — weight 6% — 95 · Excellent

Caveats
  • 16 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for NVIDIA GeForce RTX 4090 Mobile
Alternate
nvidia · 16 GB
Operator-grade

NVIDIA GeForce RTX 3080 16GB (Mobile)

Strong alternate. With your $300 budget on Linux for coding agents, the NVIDIA GeForce RTX 3080 16GB (Mobile) sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class
Qwen 2.5 Coder 14B FP16, agents OK
Expected throughput
40-70 tok/s on 7B Q4; 20-35 tok/s on 13B Q4.
Evidence
live data · editorial + reproduced community
Editorial: 1 benchmark · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: Moderate (1 cohort)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • Only 1 benchmark — below the 5-row threshold for cohort signal.
Help us measure NVIDIA GeForce RTX 3080 16GB (Mobile) →
Measured throughput
top 1 of 1 on file · most recent first
  • [editorial] qwen 2.5 coder 7b instruct · Q4_K_M — 79.4 tok/s (2026-05)
Benchmark feeding this card
  • [editorial] #337 qwen-2.5-coder-7b-instruct · Q4_K_M — 79.4 tok/s (2026-05-10)
How we scored this card

  • VRAM × workload — weight 22% — 33 · Weak
  • Budget fit — weight 18% — 50 · Acceptable
  • OS compatibility — weight 16% — 100 · Excellent
  • Skill match — weight 10% — 95 · Excellent
  • Power headroom — weight 8% — 95 · Excellent
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 95 · Excellent
  • Gaming alignment — weight 6% — 90 · Excellent
  • Perf-per-watt — weight 6% — 95 · Excellent

Caveats
  • 16 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for NVIDIA GeForce RTX 3080 16GB (Mobile)
Alternate
intel · 16 GB · ~$269
Operator-grade

Intel Arc A770 16GB

Strong alternate. With your $300 budget on Linux for coding agents, the Intel Arc A770 16GB sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class
Qwen 2.5 Coder 14B FP16, agents OK
Expected throughput
Single-stream chat in the 20-50 tok/s range on 7B-class Q4. Higher tiers unrealistic.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure Intel Arc A770 16GB →
How we scored this card

  • VRAM × workload — weight 22% — 33 · Weak
  • Budget fit — weight 18% — 95 · Excellent
  • OS compatibility — weight 16% — 70 · Good
  • Skill match — weight 10% — 55 · Acceptable
  • Power headroom — weight 8% — 80 · Strong
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 95 · Excellent
  • Gaming alignment — weight 6% — 70 · Good
  • Perf-per-watt — weight 6% — 85 · Strong

Caveats
  • 16 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
  • Intel discrete GPU AI tooling lags NVIDIA — runtime support exists (IPEX-LLM) but documentation is thinner.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for Intel Arc A770 16GB
Alternate
amd · 128 GB
Operator-grade

AMD Instinct MI300A (APU)

Strong alternate. With your $300 budget on Linux for coding agents, the AMD Instinct MI300A (APU) ranks here because with 128 GB of unified memory (bandwidth not on file), it clears the VRAM bar for coding agents comfortably.

Realistic model class
Frontier-scale — 100B+ MoE territory
Expected throughput
Datacenter throughput regime — measure with your real workload.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure AMD Instinct MI300A (APU) →
How we scored this card

  • VRAM × workload — weight 22% — 100 · Excellent
  • Budget fit — weight 18% — 50 · Acceptable
  • OS compatibility — weight 16% — 70 · Good
  • Skill match — weight 10% — 60 · Good
  • Power headroom — weight 8% — 5 · Poor
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 75 · Strong
  • Gaming alignment — weight 6% — 85 · Strong
  • Perf-per-watt — weight 6% — 25 · Weak

Caveats
  • Sustained ~760 W — plan for a 1000 W+ PSU and adequate case airflow.
Try in custom builder → · See model-fit table · Recommended runtime: llama-cpp
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for AMD Instinct MI300A (APU)
Worth saving up for
1 card over budget by ≤25% — top-tier on every dimension except price
Worth saving up for
nvidia · 22 GB · ~$350
Operator-grade

NVIDIA RTX 2080 Ti 22GB (China-mod)

Out of budget for this query. With your $300 budget on Linux for coding agents, the NVIDIA RTX 2080 Ti 22GB (China-mod) ranks here because street price around $350 sits above your $300 budget — listed for the upgrade-path conversation, not as a recommendation.

17% over budget ($50 extra). On every dimension other than price this card would have landed top-tier — surfaced here so the upgrade path is visible.
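That ≤25% band can be written out directly — a tiny sketch of the rule as described on this page, nothing more:

```python
def budget_verdict(price: float, budget: float) -> str:
    """Classify a card's price against the user's budget (thresholds from this page)."""
    over_pct = (price - budget) / budget * 100
    if over_pct <= 0:
        return "in budget"
    # Over budget by at most 25% -> surface as an upgrade-path candidate.
    return "worth saving up for" if over_pct <= 25 else "ruled out"

print(budget_verdict(350, 300))  # worth saving up for (~16.7% over)
```

The $350 card lands at roughly 16.7% over a $300 budget, inside the ≤25% band; the $3,500+ cards fall straight into "ruled out".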
Realistic model class
Qwen 2.5 Coder 32B Q4 + 32K context
Expected throughput
30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA RTX 2080 Ti 22GB (China-mod) →
How we scored this card

  • VRAM × workload — weight 22% — 46 · Acceptable
  • Budget fit — weight 18% — 35 · Weak
  • OS compatibility — weight 16% — 100 · Excellent
  • Skill match — weight 10% — 95 · Excellent
  • Power headroom — weight 8% — 80 · Strong
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 95 · Excellent
  • Gaming alignment — weight 6% — 70 · Good
  • Perf-per-watt — weight 6% — 85 · Strong

Caveats
  • Out of budget — street price around $350 vs your $300 budget.
  • 22 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for NVIDIA RTX 2080 Ti 22GB (China-mod)
Why we ruled these out
Over-budget or fundamentally incompatible — listed for the upgrade-path conversation
  • NVIDIA H100 PCIe (~$25,000) — out of budget for this query.
  • NVIDIA RTX 5000 PRO Blackwell 48GB (~$5,499) — out of budget for this query.
  • NVIDIA RTX A6000 (Ampere) (~$3,500) — out of budget for this query.

Where to go from here

Live GPU price tracker →

Multi-store, multi-region prices for every card here. US/EU/UK/CA/AU — see what these cards actually cost in your region before you buy.

Stack Builder →

One step further: this card + runtime + 1-3 models + cost rollup + ready-to-paste install script. Eight inputs → full rig.

Custom build engine →

Once you’ve picked a card, model the full build (CPU, RAM, runtime) for which models fit comfortably.

GPU buying guide 2026 →

The long-form essay version: VRAM tiers, MoE math, NVLink truth, used-market price discipline.

Hardware combinations →

Curated multi-GPU and Apple-cluster setups with effective-VRAM math you can trust.
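One plausible shape for that effective-VRAM math — our own assumption for illustration, not the site's published formula: under layer-split inference, weights shard across cards, but each card past the first duplicates some runtime buffers and activations, so effective capacity is less than the raw sum:

```python
def effective_vram_gb(cards_gb: list[float], per_card_overhead_gb: float = 1.5) -> float:
    """Effective pooled VRAM under layer-split inference (assumed 1.5 GB/extra card)."""
    # Weights shard across cards; every card after the first loses some
    # capacity to duplicated buffers/activations and KV slices.
    return sum(cards_gb) - per_card_overhead_gb * (len(cards_gb) - 1)

print(effective_vram_gb([24.0, 24.0]))  # 46.5 — two 24 GB cards != a free 48 GB
```

The overhead constant is a placeholder; the point is only that pooled VRAM never quite adds up linearly, which is the "math you can trust" caveat the combinations page exists to spell out.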

Scoring methodology →

How the trust layer behind these recommendations actually works — every dimension, every formula, the honest limits.

Cohort coverage report →

Where the intelligence graph has signal vs which model × hardware × quant cohorts are still underpowered.

Reproduce a benchmark →

Help tip a cohort across the 5-row threshold for outlier detection — the most operator-impactful contribution.
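The 5-row threshold maps onto a standard robust-statistics pattern: with fewer than five rows there is no stable center to flag outliers against. A sketch using the median absolute deviation and the conventional modified z-score cutoff — our illustration, since the site's actual detector is not published here:

```python
import statistics

def cohort_outliers(tok_s: list[float], min_rows: int = 5, z_cut: float = 3.5):
    """Flag throughput outliers in a model x hardware x quant cohort.

    Returns None while the cohort is below min_rows (underpowered),
    otherwise the rows whose modified z-score exceeds z_cut.
    """
    if len(tok_s) < min_rows:
        return None  # not enough rows for outlier detection yet
    med = statistics.median(tok_s)
    mad = statistics.median([abs(x - med) for x in tok_s]) or 1e-9
    # Modified z-score: 0.6745 * |x - median| / MAD, cutoff 3.5 by convention.
    return [x for x in tok_s if 0.6745 * abs(x - med) / mad > z_cut]

print(cohort_outliers([79.4]))                           # None — 1 row < 5
print(cohort_outliers([70.0, 72.0, 71.0, 73.0, 300.0]))  # [300.0]
```

A single submission (like the lone 79.4 tok/s row above) carries no outlier signal; the fifth row is what switches a cohort from "no signal" to "can reject bad data", which is why reproductions are the highest-leverage contribution.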