RUNLOCALAI · v38
RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP · Fredoline Eruo
DIR
  • Models
  • Hardware
  • Tools
  • Benchmarks
TOOLS
  • Will it run?
  • Compare hardware
  • Cost vs cloud
  • Choose my GPU
  • Quick answers
REF
  • All buyer guides
  • Methodology
  • Glossary
  • Errors KB
  • Trust
EDITOR
  • About
  • Author
  • How we make money
  • Editorial policy
  • Contact
LEGAL
  • Privacy
  • Terms
  • Sitemap
MAIL · MONTHLY DIGEST
Get monthly local-AI changes.
Monthly recap. No spam.
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other retail programs). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.co · Independently operated

Choose my GPU

Answer nine questions. We rank the GPUs in our catalog by fit for local AI on your stack — top picks, alternates, and what to avoid. Hand-written rationale per card, honest caveats, and a one-click handoff into the custom build engine.

We don’t fake tok/s numbers. Every recommendation cites a model class and a workload-realistic throughput range. Cards over your budget appear last, with explicit framing. Recommendations come from rule-based scoring, not measured benchmarks.

Tell us about your build

URL updates as you change fields.
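That URL sync can be pictured as plain query-string serialization of the form state — a minimal sketch with made-up field names, not the site's actual parameter schema:

```python
from urllib.parse import urlencode, parse_qs

def form_to_query(state: dict) -> str:
    # Skip empty fields so shared URLs stay short.
    return urlencode({k: v for k, v in state.items() if v not in (None, "")})

# Hypothetical field names for illustration only.
q = form_to_query({"budget": 300, "os": "linux", "workload": "coding-agents"})
print(q)  # budget=300&os=linux&workload=coding-agents

# The page can restore its state from the query string on load.
restored = {k: v[0] for k, v in parse_qs(q).items()}
```

Round-tripping through the URL like this is what makes a filled-in questionnaire shareable as a plain link.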

Price vs performance (budget-neutral)
5 cards plotted · 5 skipped (no price)
[Chart: effective price (log scale) vs budget-neutral performance, with your budget marked. Labeled points:]
  • NVIDIA H100 PCIe — $25,000 · Avoid
  • NVIDIA RTX 5000 PRO Blackwell 48GB — $5,499 · Avoid
  • NVIDIA RTX A6000 (Ampere) — $3,500 · Avoid
  • NVIDIA RTX 2080 Ti 22GB (China-mod) — $350 · Worth saving up for
  • Intel Arc A770 16GB — $269 · Alternate
Top pick (1)
Alternate (5)
Worth saving up for (1)
Avoid (3)
Top picks
1 card matching your stack tightly · 5 alternates
Top pick
nvidia · 24 GB
Operator-grade

NVIDIA GeForce RTX 5090 Mobile

Top pick for your setup. With your $300 budget on Linux for coding agents, the NVIDIA GeForce RTX 5090 Mobile ranks here because 24 GB hits the workable band for coding agents — fits at sensible quants without becoming the bottleneck.

Realistic model class
Qwen 2.5 Coder 32B Q4 + 32K context
Expected throughput
30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA GeForce RTX 5090 Mobile →
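The "32B Q4 + 32K context" claim above can be sanity-checked with a back-of-envelope VRAM estimate. This is our own rough rule of thumb, not the site's scoring formula, and it assumes ~4.5 bits/weight for Q4 GGUF, an fp16 KV cache, and Qwen-32B-like geometry (64 layers, 8 KV heads, head dim 128):

```python
def q4_vram_gb(params_b: float, ctx: int, layers: int, kv_heads: int = 8,
               head_dim: int = 128, overhead_gb: float = 1.0) -> float:
    """Rough VRAM footprint for a Q4-quantized model (illustrative only)."""
    weights = params_b * 1e9 * 4.5 / 8 / 1e9           # ~4.5 bits/param at Q4
    # KV cache: K and V per layer, fp16 (2 bytes), grouped-query KV heads
    kv = 2 * layers * kv_heads * head_dim * ctx * 2 / 1e9
    return weights + kv + overhead_gb

# Qwen 2.5 Coder 32B-like geometry at 32K context:
print(round(q4_vram_gb(32, 32_768, 64), 1))  # ~27.6 GB — tight on a 24 GB card
```

At fp16 KV this lands slightly over 24 GB, which is why runtimes typically quantize the KV cache or trim context to make 32B Q4 + 32K workable on a 24 GB card — consistent with "fits at sensible quants" rather than "fits with headroom".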
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most influential first).

  • VRAM × workload — weight 22% — 70 · Good
  • Budget fit — weight 18% — 50 · Acceptable
  • OS compatibility — weight 16% — 100 · Excellent
  • Skill match — weight 10% — 95 · Excellent
  • Power headroom — weight 8% — 95 · Excellent
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 95 · Excellent
  • Gaming alignment — weight 6% — 95 · Excellent
  • Perf-per-watt — weight 6% — 95 · Excellent

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.
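A minimal sketch of that weighted-sum-then-tier pipeline, reconstructed from the weights and cutoffs published on this page (the site's actual engine may differ in details):

```python
# Dimension weights as listed above; they total 100.
WEIGHTS = {
    "vram_workload": 22, "budget_fit": 18, "os_compat": 16, "skill_match": 10,
    "power_headroom": 8, "multi_gpu": 8, "thermal": 6, "gaming": 6, "perf_watt": 6,
}

def composite(scores: dict) -> float:
    # Weighted sum of 0-100 dimension scores, normalized back to 0-100.
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS) / 100

def tier(c: float, over_budget: bool = False, incompatible: bool = False) -> str:
    # Over-budget or incompatible cards fall to "avoid" regardless of composite.
    if over_budget or incompatible or c < 40:
        return "avoid"
    if c >= 75:
        return "top"
    return "alternate" if c >= 60 else "acceptable"

# The top pick's bars from this page:
rtx_5090m = {"vram_workload": 70, "budget_fit": 50, "os_compat": 100,
             "skill_match": 95, "power_headroom": 95, "multi_gpu": 80,
             "thermal": 95, "gaming": 95, "perf_watt": 95}
print(composite(rtx_5090m), tier(composite(rtx_5090m)))  # 81.0 top
```

Plugging in the top pick's bars gives a composite of 81, which lands in the ≥ 75 "top" band — matching its placement above.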

Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for NVIDIA GeForce RTX 5090 Mobile
Alternate
nvidia · 13,824 GB
Operator-grade

NVIDIA GB200 NVL72

Strong alternate. With your $300 budget on Linux for coding agents, the NVIDIA GB200 NVL72 ranks here because with 13,824 GB and ~8,000 GB/s memory bandwidth, it clears the VRAM bar for coding agents comfortably.

Realistic model class
Frontier-scale — 100B+ MoE territory
Expected throughput
Datacenter throughput regime — measure with your real workload.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA GB200 NVL72 →
How we scored this card

  • VRAM × workload — weight 22% — 100 · Excellent
  • Budget fit — weight 18% — 50 · Acceptable
  • OS compatibility — weight 16% — 100 · Excellent
  • Skill match — weight 10% — 95 · Excellent
  • Power headroom — weight 8% — 5 · Poor
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 75 · Strong
  • Gaming alignment — weight 6% — 85 · Strong
  • Perf-per-watt — weight 6% — 25 · Weak

Caveats
  • Sustained draw ~120,000 W — rack-scale datacenter power; consumer PSU and case-airflow planning do not apply here.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for NVIDIA GB200 NVL72
Alternate
nvidia · 16 GB
Operator-grade

NVIDIA GeForce RTX 4090 Mobile

Strong alternate. With your $300 budget on Linux for coding agents, the NVIDIA GeForce RTX 4090 Mobile sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class
Qwen 2.5 Coder 14B FP16, agents OK
Expected throughput
40-70 tok/s on 7B Q4; 20-35 tok/s on 13B Q4.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA GeForce RTX 4090 Mobile →
How we scored this card

  • VRAM × workload — weight 22% — 33 · Weak
  • Budget fit — weight 18% — 50 · Acceptable
  • OS compatibility — weight 16% — 100 · Excellent
  • Skill match — weight 10% — 95 · Excellent
  • Power headroom — weight 8% — 95 · Excellent
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 95 · Excellent
  • Gaming alignment — weight 6% — 95 · Excellent
  • Perf-per-watt — weight 6% — 95 · Excellent

Caveats
  • 16 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for NVIDIA GeForce RTX 4090 Mobile
Alternate
nvidia · 16 GB
Operator-grade

NVIDIA GeForce RTX 3080 16GB (Mobile)

Strong alternate. With your $300 budget on Linux for coding agents, the NVIDIA GeForce RTX 3080 16GB (Mobile) sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class
Qwen 2.5 Coder 14B FP16, agents OK
Expected throughput
40-70 tok/s on 7B Q4; 20-35 tok/s on 13B Q4.
Evidence
live data · editorial + reproduced community
Editorial: 1 benchmark · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: Moderate (1 cohort)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • Only 1 benchmark — below the 5-row threshold for cohort signal.
Help us measure NVIDIA GeForce RTX 3080 16GB (Mobile) →
Measured throughput
top 1 of 1 on file · most recent first
  • [editorial] qwen 2.5 coder 7b instruct · Q4_K_M — 79.4 tok/s (2026-05)
Benchmark feeding this card
  • [editorial] #337 qwen-2.5-coder-7b-instruct · Q4_K_M — 79.4 tok/s (2026-05-10)
How we scored this card

  • VRAM × workload — weight 22% — 33 · Weak
  • Budget fit — weight 18% — 50 · Acceptable
  • OS compatibility — weight 16% — 100 · Excellent
  • Skill match — weight 10% — 95 · Excellent
  • Power headroom — weight 8% — 95 · Excellent
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 95 · Excellent
  • Gaming alignment — weight 6% — 90 · Excellent
  • Perf-per-watt — weight 6% — 95 · Excellent

Caveats
  • 16 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for NVIDIA GeForce RTX 3080 16GB (Mobile)
Alternate
intel · 16 GB · ~$269
Operator-grade

Intel Arc A770 16GB

Strong alternate. With your $300 budget on Linux for coding agents, the Intel Arc A770 16GB sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class
Qwen 2.5 Coder 14B FP16, agents OK
Expected throughput
Single-stream chat in the 20-50 tok/s range on 7B-class Q4. Higher tiers unrealistic.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure Intel Arc A770 16GB →
How we scored this card

  • VRAM × workload — weight 22% — 33 · Weak
  • Budget fit — weight 18% — 95 · Excellent
  • OS compatibility — weight 16% — 70 · Good
  • Skill match — weight 10% — 55 · Acceptable
  • Power headroom — weight 8% — 80 · Strong
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 95 · Excellent
  • Gaming alignment — weight 6% — 70 · Good
  • Perf-per-watt — weight 6% — 85 · Strong

Caveats
  • 16 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
  • Intel discrete GPU AI tooling lags NVIDIA — runtime support exists (IPEX-LLM) but documentation is thinner.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for Intel Arc A770 16GB
Alternate
amd · 128 GB
Operator-grade

AMD Instinct MI300A (APU)

Strong alternate. With your $300 budget on Linux for coding agents, the AMD Instinct MI300A (APU) ranks here because with 128 GB of unified memory (bandwidth not on file), it clears the VRAM bar for coding agents comfortably.

Realistic model class
Frontier-scale — 100B+ MoE territory
Expected throughput
Datacenter throughput regime — measure with your real workload.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure AMD Instinct MI300A (APU) →
How we scored this card

  • VRAM × workload — weight 22% — 100 · Excellent
  • Budget fit — weight 18% — 50 · Acceptable
  • OS compatibility — weight 16% — 70 · Good
  • Skill match — weight 10% — 60 · Good
  • Power headroom — weight 8% — 5 · Poor
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 75 · Strong
  • Gaming alignment — weight 6% — 85 · Strong
  • Perf-per-watt — weight 6% — 25 · Weak

Caveats
  • Sustained ~760 W — plan for a 1000 W+ PSU and adequate case airflow.
Try in custom builder → · See model-fit table · Recommended runtime: llama-cpp
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for AMD Instinct MI300A (APU)
Worth saving up for
1 card over budget by ≤25% — top-tier on every dimension except price
Worth saving up for
nvidia · 22 GB · ~$350
Operator-grade

NVIDIA RTX 2080 Ti 22GB (China-mod)

Out of budget for this query. With your $300 budget on Linux for coding agents, the NVIDIA RTX 2080 Ti 22GB (China-mod) ranks here because street price around $350 sits above your $300 budget — listed for the upgrade-path conversation, not as a recommendation.

17% over budget ($50 extra). On every dimension other than price this card would have landed top-tier — surfaced here so the upgrade path is visible.
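That ≤25% band can be written out directly — a tiny sketch of the rule as described on this page, nothing more:

```python
def budget_verdict(price: float, budget: float) -> str:
    """Classify a card's price against the user's budget (thresholds from this page)."""
    over_pct = (price - budget) / budget * 100
    if over_pct <= 0:
        return "in budget"
    # Over budget by at most 25% -> surface as an upgrade-path candidate.
    return "worth saving up for" if over_pct <= 25 else "ruled out"

print(budget_verdict(350, 300))  # worth saving up for (~16.7% over)
```

The $350 card lands at roughly 16.7% over a $300 budget, inside the ≤25% band; the $3,500+ cards fall straight into "ruled out".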
Realistic model class
Qwen 2.5 Coder 32B Q4 + 32K context
Expected throughput
30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.
Evidence
live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA RTX 2080 Ti 22GB (China-mod) →
How we scored this card

  • VRAM × workload — weight 22% — 46 · Acceptable
  • Budget fit — weight 18% — 35 · Weak
  • OS compatibility — weight 16% — 100 · Excellent
  • Skill match — weight 10% — 95 · Excellent
  • Power headroom — weight 8% — 80 · Strong
  • Multi-GPU path — weight 8% — 80 · Strong
  • Thermal / noise — weight 6% — 95 · Excellent
  • Gaming alignment — weight 6% — 70 · Good
  • Perf-per-watt — weight 6% — 85 · Strong

Caveats
  • Out of budget — street price around $350 vs your $300 budget.
  • 22 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) — Help us measure this: submit a benchmark for NVIDIA RTX 2080 Ti 22GB (China-mod)
Why we ruled these out
Over-budget or fundamentally incompatible — listed for the upgrade-path conversation
  • NVIDIA H100 PCIe (~$25,000) — out of budget for this query.
  • NVIDIA RTX 5000 PRO Blackwell 48GB (~$5,499) — out of budget for this query.
  • NVIDIA RTX A6000 (Ampere) (~$3,500) — out of budget for this query.

Where to go from here

Live GPU price tracker →

Multi-store, multi-region prices for every card here. US/EU/UK/CA/AU — see what these cards actually cost in your region before you buy.

Stack Builder →

One step further: this card + runtime + 1-3 models + cost rollup + ready-to-paste install script. Eight inputs → full rig.

Custom build engine →

Once you’ve picked a card, model the full build (CPU, RAM, runtime) for which models fit comfortably.

GPU buying guide 2026 →

The long-form essay version: VRAM tiers, MoE math, NVLink truth, used-market price discipline.

Hardware combinations →

Curated multi-GPU and Apple-cluster setups with effective-VRAM math you can trust.
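One plausible shape for that effective-VRAM math — our own assumption for illustration, not the site's published formula: under layer-split inference, weights shard across cards, but each card past the first duplicates some runtime buffers and activations, so effective capacity is less than the raw sum:

```python
def effective_vram_gb(cards_gb: list[float], per_card_overhead_gb: float = 1.5) -> float:
    """Effective pooled VRAM under layer-split inference (assumed 1.5 GB/extra card)."""
    # Weights shard across cards; every card after the first loses some
    # capacity to duplicated buffers/activations and KV slices.
    return sum(cards_gb) - per_card_overhead_gb * (len(cards_gb) - 1)

print(effective_vram_gb([24.0, 24.0]))  # 46.5 — two 24 GB cards != a free 48 GB
```

The overhead constant is a placeholder; the point is only that pooled VRAM never quite adds up linearly, which is the "math you can trust" caveat the combinations page exists to spell out.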

Scoring methodology →

How the trust layer behind these recommendations actually works — every dimension, every formula, the honest limits.

Cohort coverage report →

Where the intelligence graph has signal vs which model × hardware × quant cohorts are still underpowered.

Reproduce a benchmark →

Help tip a cohort across the 5-row threshold for outlier detection — the most operator-impactful contribution.
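The 5-row threshold maps onto a standard robust-statistics pattern: with fewer than five rows there is no stable center to flag outliers against. A sketch using the median absolute deviation and the conventional modified z-score cutoff — our illustration, since the site's actual detector is not published here:

```python
import statistics

def cohort_outliers(tok_s: list[float], min_rows: int = 5, z_cut: float = 3.5):
    """Flag throughput outliers in a model x hardware x quant cohort.

    Returns None while the cohort is below min_rows (underpowered),
    otherwise the rows whose modified z-score exceeds z_cut.
    """
    if len(tok_s) < min_rows:
        return None  # not enough rows for outlier detection yet
    med = statistics.median(tok_s)
    mad = statistics.median([abs(x - med) for x in tok_s]) or 1e-9
    # Modified z-score: 0.6745 * |x - median| / MAD, cutoff 3.5 by convention.
    return [x for x in tok_s if 0.6745 * abs(x - med) / mad > z_cut]

print(cohort_outliers([79.4]))                           # None — 1 row < 5
print(cohort_outliers([70.0, 72.0, 71.0, 73.0, 300.0]))  # [300.0]
```

A single submission (like the lone 79.4 tok/s row above) carries no outlier signal; the fifth row is what switches a cohort from "no signal" to "can reject bad data", which is why reproductions are the highest-leverage contribution.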