Choose my GPU

Answer nine questions. We rank the GPUs in our catalog by fit for local AI on your stack — top picks, alternates, and what to avoid. Hand-written rationale per card, honest caveats, and a one-click handoff into the custom build engine.

We don’t fake tok/s numbers. Every recommendation cites a model class and a workload-realistic range. Cards over your budget appear last with explicit framing. Recommendations are rule-based scoring, not measured benchmarks.

Tell us about your build

URL updates as you change fields.

Budget?

Operating system?

Primary workload?

Your skill level?

System power limit?

Loudness tolerance?

Gaming priority?

Electricity sensitivity?

Plan to add a 2nd / 3rd GPU later?

Display currency?

Price vs performance (budget-neutral)

5 cards · 3 skipped (no price)

Legend: Top pick (1) · Alternate (3) · Worth saving up for (1) · Avoid (3)

Top picks

1 card matching your stack tightly · 3 alternates

Top pick

NVIDIA · 24 GB

Operator-grade

NVIDIA GeForce RTX 5090 Mobile

Top pick for your setup. With your $300 budget on Linux for coding agents, the NVIDIA GeForce RTX 5090 Mobile ranks here because 24 GB hits the workable band for coding agents — fits at sensible quants without becoming the bottleneck.

Realistic model class

Qwen 2.5 Coder 32B Q4 + 32K context

Expected throughput

30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.

Evidence

Live data · editorial + reproduced community

Editorial: 0 benchmarks
Reproduced: 0 community
Stale (>18mo): 0 rows
Cohort confidence: none

Needs measurement

This recommendation is rule-based, not evidence-backed yet.

No benchmarks on file for this hardware.

Help us measure NVIDIA GeForce RTX 5090 Mobile →

How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

Dimension · Weight · Score · Rating
VRAM × workload · 22% · 70 · Good
Budget fit · 18% · 40 · Acceptable
OS compatibility · 16% · 100 · Excellent
Skill match · 10% · 95 · Excellent
Power headroom · 8% · 95 · Excellent
Multi-GPU path · 8% · 80 · Strong
Thermal / noise · 6% · 95 · Excellent
Gaming alignment · 6% · 95 · Excellent
Perf-per-watt · 6% · 95 · Excellent

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.
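As a minimal sketch of the weighted sum described above, the snippet below recomputes the composite for this card from the nine listed weights and scores. The function names and the exact tier boundaries (treating 75, 60, and 40 as inclusive lower bounds) are assumptions for illustration; the site's actual scoring code may round or break ties differently.

```python
# Weights (percent) and 0-100 scores copied from the scoring rows listed above.
WEIGHTS = {
    "VRAM x workload": 22,
    "Budget fit": 18,
    "OS compatibility": 16,
    "Skill match": 10,
    "Power headroom": 8,
    "Multi-GPU path": 8,
    "Thermal / noise": 6,
    "Gaming alignment": 6,
    "Perf-per-watt": 6,
}
SCORES = {
    "VRAM x workload": 70,
    "Budget fit": 40,
    "OS compatibility": 100,
    "Skill match": 95,
    "Power headroom": 95,
    "Multi-GPU path": 80,
    "Thermal / noise": 95,
    "Gaming alignment": 95,
    "Perf-per-watt": 95,
}


def composite(scores, weights=WEIGHTS):
    """Weighted sum of the dimension scores; weights are percentages summing to 100."""
    return sum(weights[name] * scores[name] for name in weights) / 100


def tier(score, over_budget=False, incompatible=False):
    """Map a composite to a tier (boundaries assumed inclusive at 75 / 60 / 40)."""
    if over_budget or incompatible or score < 40:
        return "avoid"
    if score >= 75:
        return "top"
    if score >= 60:
        return "alternate"
    return "acceptable"


value = composite(SCORES)
print(value, tier(value))  # 79.2 top
```

Under these assumptions the composite is 79.2, which is consistent with the card landing in the top tier.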

Try in custom builder →

See model-fit table

Recommended runtime: ollama

Estimated (rule-based scoring). Help us measure this: submit a benchmark for NVIDIA GeForce RTX 5090 Mobile

Alternate

NVIDIA · 16 GB

Operator-grade

NVIDIA GeForce RTX 4090 Mobile

Strong alternate. With your $300 budget on Linux for coding agents, the NVIDIA GeForce RTX 4090 Mobile sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class

Qwen 2.5 Coder 14B, quantized (FP16 weights need roughly 28 GB and do not fit in 16 GB); agents OK

Expected throughput

40-70 tok/s on 7B Q4; 20-35 tok/s on 13B Q4.

Evidence

Live data · editorial + reproduced community

Editorial: 0 benchmarks
Reproduced: 0 community
Stale (>18mo): 0 rows
Cohort confidence: none

Needs measurement

This recommendation is rule-based, not evidence-backed yet.

No benchmarks on file for this hardware.

Help us measure NVIDIA GeForce RTX 4090 Mobile →

How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

Dimension · Weight · Score · Rating
VRAM × workload · 22% · 33 · Weak
Budget fit · 18% · 40 · Acceptable
OS compatibility · 16% · 100 · Excellent
Skill match · 10% · 95 · Excellent
Power headroom · 8% · 95 · Excellent
Multi-GPU path · 8% · 80 · Strong
Thermal / noise · 6% · 95 · Excellent
Gaming alignment · 6% · 95 · Excellent
Perf-per-watt · 6% · 95 · Excellent

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Caveats

•16 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
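To make this caveat concrete, here is a rough back-of-the-envelope estimate of weight memory alone (parameters × bits per weight). The nominal 14B parameter count, the bits-per-weight figures for the quantized formats, and the helper name are assumptions for illustration; real usage also includes KV cache and runtime overhead, which this sketch ignores.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (10^9 bytes), ignoring KV cache and overhead."""
    return params_billion * bits_per_weight / 8


VRAM_GB = 16  # the card discussed here

# Bits per weight for the quantized formats are rough assumptions;
# actual GGUF quants vary by type and by model.
FORMATS = [
    ("FP16", 16.0),
    ("8-bit quant (assumed ~8.5 bits)", 8.5),
    ("4-bit quant (assumed ~4.5 bits)", 4.5),
]

for label, bits in FORMATS:
    need = weight_gb(14, bits)
    fits = need < VRAM_GB
    print(f"14B {label}: ~{need:.1f} GB of weights, fits in {VRAM_GB} GB: {fits}")
```

Under these assumptions a 14B model at FP16 needs about 28 GB for weights alone, so a 16 GB card depends on quantization, and whatever headroom is left after the weights bounds the usable context window.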

Try in custom builder →

See model-fit table

Recommended runtime: ollama

Estimated (rule-based scoring). Help us measure this: submit a benchmark for NVIDIA GeForce RTX 4090 Mobile

Alternate

NVIDIA · 16 GB

Operator-grade

NVIDIA GeForce RTX 3080 16GB (Mobile)

Strong alternate. With your $300 budget on Linux for coding agents, the NVIDIA GeForce RTX 3080 16GB (Mobile) sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class

Qwen 2.5 Coder 14B, quantized (FP16 weights need roughly 28 GB and do not fit in 16 GB); agents OK

Expected throughput

40-70 tok/s on 7B Q4; 20-35 tok/s on 13B Q4.

Evidence

Live data · editorial + reproduced community

Editorial: 27 benchmarks
Reproduced: 0 community
Stale (>18mo): 0 rows
Cohort confidence: Low (27 cohorts)

Measured throughput

10 benchmarks on file, most recent first.

Source · ID · Model · Quant · Throughput · Date
ed · #364 · llama-3.2-1b-instruct · Q4_K_M · 189.5 tok/s · 2026-06-02
ed · #365 · kumru-2b · Q4_K_M · 174.2 tok/s · 2026-06-02
ed · #366 · trendyol-llm-asure-12b · Q4_K_M · 43.4 tok/s · 2026-06-02
ed · #367 · ytu-turkish-gemma-9b · Q4_K_M · 66.0 tok/s · 2026-06-02
ed · #368 · brooqs-mistral-turkish-v2-latest · Q4_K_M · 106.8 tok/s · 2026-06-02
ed · #369 · codegemma-7b · Q4_K_M · 80.6 tok/s · 2026-06-02
ed · #370 · deepseek-coder-v2-lite · Q4_K_M · 152.0 tok/s · 2026-06-02
ed · #371 · deepseek-r1-distill-qwen-7b · Q4_K_M · 80.3 tok/s · 2026-06-02
ed · #372 · gemma-2-9b-it · Q4_K_M · 68.2 tok/s · 2026-06-02
ed · #373 · gemma-3-12b · Q4_K_M · 43.3 tok/s · 2026-06-02
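As a small illustration of how the measured rows relate to the rule-based estimate, the sketch below checks four of the benchmarks above against the expected ranges quoted for this card (40-70 tok/s on 7B Q4, 20-35 tok/s on 13B Q4). Mapping the 12B models onto the "13B" class is an assumption made here, not something the page states, and the helper name is hypothetical.

```python
# Expected ranges (tok/s) quoted for this card, keyed by model class.
EXPECTED = {"7B": (40, 70), "13B": (20, 35)}

# (model, assumed class, measured tok/s) taken from the benchmark rows above.
MEASURED = [
    ("codegemma-7b", "7B", 80.6),
    ("deepseek-r1-distill-qwen-7b", "7B", 80.3),
    ("trendyol-llm-asure-12b", "13B", 43.4),
    ("gemma-3-12b", "13B", 43.3),
]


def classify(tok_s, low, high):
    """Return where a measurement sits relative to the expected range."""
    if tok_s < low:
        return "below"
    if tok_s > high:
        return "above"
    return "within"


for model, model_class, tok_s in MEASURED:
    low, high = EXPECTED[model_class]
    verdict = classify(tok_s, low, high)
    print(f"{model}: {tok_s} tok/s is {verdict} the {low}-{high} estimate")
```

With this mapping, all four rows land above the estimated range, which suggests the rule-based figures may be conservative for this card.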

How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

Dimension · Weight · Score · Rating
VRAM × workload · 22% · 33 · Weak
Budget fit · 18% · 40 · Acceptable
OS compatibility · 16% · 100 · Excellent
Skill match · 10% · 95 · Excellent
Power headroom · 8% · 95 · Excellent
Multi-GPU path · 8% · 80 · Strong
Thermal / noise · 6% · 95 · Excellent
Gaming alignment · 6% · 90 · Excellent
Perf-per-watt · 6% · 95 · Excellent

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Caveats

•16 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.

Try in custom builder →

See model-fit table

Recommended runtime: ollama

Estimated (rule-based scoring). Help us measure this: submit a benchmark for NVIDIA GeForce RTX 3080 16GB (Mobile)

Alternate

Intel · 16 GB · ~$269

Operator-grade

Intel Arc A770 16GB

Strong alternate. With your $300 budget on Linux for coding agents, the Intel Arc A770 16GB sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class

Qwen 2.5 Coder 14B, quantized (FP16 weights need roughly 28 GB and do not fit in 16 GB); agents OK

Expected throughput

Single-stream chat in the 20-50 tok/s range on 7B-class Q4. Higher tiers unrealistic.

Evidence

Live data · editorial + reproduced community

Editorial: 0 benchmarks
Reproduced: 0 community
Stale (>18mo): 0 rows
Cohort confidence: none

Needs measurement

This recommendation is rule-based, not evidence-backed yet.

No benchmarks on file for this hardware.

Help us measure Intel Arc A770 16GB →

How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

Dimension · Weight · Score · Rating
VRAM × workload · 22% · 33 · Weak
Budget fit · 18% · 95 · Excellent
OS compatibility · 16% · 70 · Good
Skill match · 10% · 55 · Acceptable
Power headroom · 8% · 80 · Strong
Multi-GPU path · 8% · 80 · Strong
Thermal / noise · 6% · 95 · Excellent
Gaming alignment · 6% · 70 · Good
Perf-per-watt · 6% · 85 · Strong

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Caveats

•16 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
•Intel discrete GPU AI tooling lags NVIDIA — runtime support exists (IPEX-LLM) but documentation is thinner.

Try in custom builder →

See model-fit table

Recommended runtime: ollama

Estimated (rule-based scoring). Help us measure this: submit a benchmark for Intel Arc A770 16GB

Worth saving up for

1 card over budget by ≤25%; it would land in the top tier if price were not a factor

Worth saving up for

NVIDIA · 22 GB · ~$350

Operator-grade

NVIDIA RTX 2080 Ti 22GB (China-mod)

Out of budget for this query. With your $300 budget on Linux for coding agents, the NVIDIA RTX 2080 Ti 22GB (China-mod) ranks here because street price around $350 sits above your $300 budget — listed for the upgrade-path conversation, not as a recommendation.

17% over budget ($50 extra). On every dimension other than price this card would have landed top-tier — surfaced here so the upgrade path is visible.
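The over-budget figure is simple arithmetic, sketched below. The function names are hypothetical, and treating "worth saving up for" as "over budget by more than 0% and at most 25%" is an assumption read off the section subtitle, not a confirmed rule; the real tool also requires the card to score well on the other dimensions.

```python
def over_budget_pct(price: float, budget: float) -> float:
    """Percentage by which a price exceeds the budget (negative if under budget)."""
    return (price - budget) / budget * 100


def is_stretch_pick(price: float, budget: float, max_over_pct: float = 25) -> bool:
    """True when the card is over budget, but by no more than max_over_pct percent."""
    pct = over_budget_pct(price, budget)
    return 0 < pct <= max_over_pct


price, budget = 350, 300
pct = over_budget_pct(price, budget)
print(f"{pct:.0f}% over budget (${price - budget} extra)")  # 17% over budget ($50 extra)
print(is_stretch_pick(price, budget))    # True
print(is_stretch_pick(2600, budget))     # False
```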

Realistic model class

Qwen 2.5 Coder 32B Q4 + 32K context

Expected throughput

30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.

Evidence

Live data · editorial + reproduced community

Editorial: 0 benchmarks
Reproduced: 0 community
Stale (>18mo): 0 rows
Cohort confidence: none

Needs measurement

This recommendation is rule-based, not evidence-backed yet.

No benchmarks on file for this hardware.

Help us measure NVIDIA RTX 2080 Ti 22GB (China-mod) →

How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

Dimension · Weight · Score · Rating
VRAM × workload · 22% · 46 · Acceptable
Budget fit · 18% · 35 · Weak
OS compatibility · 16% · 100 · Excellent
Skill match · 10% · 95 · Excellent
Power headroom · 8% · 80 · Strong
Multi-GPU path · 8% · 80 · Strong
Thermal / noise · 6% · 95 · Excellent
Gaming alignment · 6% · 70 · Good
Perf-per-watt · 6% · 85 · Strong

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Caveats

•Out of budget — street price around $350 vs your $300 budget.
•22 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.

Try in custom builder →

See model-fit table

Recommended runtime: ollama

Estimated (rule-based scoring). Help us measure this: submit a benchmark for NVIDIA RTX 2080 Ti 22GB (China-mod)

Why we ruled these out

Over-budget or fundamentally incompatible — listed for the upgrade-path conversation

NVIDIA H100 PCIe (~$25,000): out of budget for this query.
NVIDIA RTX PRO 4500 Blackwell (~$2,600): out of budget for this query.
NVIDIA RTX 5000 PRO Blackwell 48GB (~$5,499): out of budget for this query.

Where to go from here

Live GPU price tracker →

Multi-store, multi-region prices for every card here. US/EU/UK/CA/AU — see what these cards actually cost in your region before you buy.

Stack Builder →

One step further: this card + runtime + 1-3 models + cost rollup + ready-to-paste install script. Eight inputs → full rig.

Custom build engine →

Once you’ve picked a card, model the full build (CPU, RAM, runtime) for which models fit comfortably.

GPU buying guide 2026 →

The long-form essay version: VRAM tiers, MoE math, NVLink truth, used-market price discipline.

Hardware combinations →

Curated multi-GPU and Apple-cluster setups with effective-VRAM math you can trust.

Scoring methodology →

How the trust layer behind these recommendations actually works — every dimension, every formula, the honest limits.

Cohort coverage report →

Where the intelligence graph has signal vs which model × hardware × quant cohorts are still underpowered.

Reproduce a benchmark →

Help tip a cohort across the 5-row threshold for outlier detection — the most operator-impactful contribution.
