RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP · Fredoline Eruo
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.co · Independently operated

Choose my GPU

Answer nine questions. We rank the GPUs in our catalog by fit for local AI on your stack — top picks, alternates, and what to avoid. Hand-written rationale per card, honest caveats, and a one-click handoff into the custom build engine.

We don’t fake tok/s numbers. Every recommendation cites a model class and a workload-realistic range. Cards over your budget appear last with explicit framing. Recommendations are rule-based scoring, not measured benchmarks.

Tell us about your build

URL updates as you change fields.

Price vs performance (budget-neutral)
11 cards · 1 skipped (no price)
[Chart: performance (budget-neutral) vs effective price (log scale), your budget marked]
  • NVIDIA RTX A6000 (Ampere) — $3,500 · Avoid
  • NVIDIA H100 PCIe — $25,000 · Avoid
  • NVIDIA RTX 5000 Ada Generation — $4,000 · Avoid
  • NVIDIA L4 — $2,500 · Top pick
  • NVIDIA RTX 4090 48GB (China-mod) — $2,400 · Top pick
  • NVIDIA RTX A5000 — $2,500 · Top pick
  • NVIDIA GeForce RTX 3090 — $899 · Top pick
  • NVIDIA GeForce RTX 4090 — $1,899 · Top pick
  • NVIDIA GeForce RTX 5090 — $2,499 · Top pick
  • NVIDIA GeForce RTX 3090 Ti — $1,199 · Top pick
  • NVIDIA GeForce RTX 4080 — $1,099 · Top pick
Top pick (9)
Avoid (3)
Top picks
9 cards matching your stack tightly
Tick two cards to compare side-by-side
Top pick
NVIDIA · 24 GB · ~$2,500
Operator-grade

NVIDIA L4

Top pick for your setup. With your $2,500 budget on Linux for coding agents, the NVIDIA L4 sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class: Qwen 2.5 Coder 32B Q4 + 32K context
Expected throughput: 30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.
Evidence: live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA L4 →
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

  • VRAM × workload · weight 22% · 65 (Good)
  • Budget fit · weight 18% · 85 (Strong)
  • OS compatibility · weight 16% · 100 (Excellent)
  • Skill match · weight 10% · 95 (Excellent)
  • Power headroom · weight 8% · 95 (Excellent)
  • Multi-GPU path · weight 8% · 80 (Strong)
  • Thermal / noise · weight 6% · 95 (Excellent)
  • Gaming alignment · weight 6% · 85 (Strong)
  • Perf-per-watt · weight 6% · 95 (Excellent)

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.
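The weighted-sum-to-tier pipeline above is easy to make concrete. A minimal sketch, assuming the scoring works exactly as stated (the site's real scoring code is not published, so this is an illustration, not their implementation), using the published weights and the L4's dimension scores:

```python
# Illustrative only: weights, scores, and tier bands copied from the page;
# the site's actual scoring code is not published.
WEIGHTS = {
    "vram_x_workload": 0.22, "budget_fit": 0.18, "os_compatibility": 0.16,
    "skill_match": 0.10, "power_headroom": 0.08, "multi_gpu_path": 0.08,
    "thermal_noise": 0.06, "gaming_alignment": 0.06, "perf_per_watt": 0.06,
}

def composite(scores):
    """Weighted sum of the 0-100 dimension scores."""
    return sum(WEIGHTS[d] * scores[d] for d in WEIGHTS)

def tier(comp, over_budget=False, incompatible=False):
    """Map a composite score onto the published tier bands."""
    if over_budget or incompatible or comp < 40:
        return "avoid"
    if comp >= 75:
        return "top"
    return "alternate" if comp >= 60 else "acceptable"

# The L4's dimension scores as shown on this card:
l4 = {"vram_x_workload": 65, "budget_fit": 85, "os_compatibility": 100,
      "skill_match": 95, "power_headroom": 95, "multi_gpu_path": 80,
      "thermal_noise": 95, "gaming_alignment": 85, "perf_per_watt": 95}

print(round(composite(l4), 1), tier(composite(l4)))  # ~85.6 -> top
```

Note how the over-budget check short-circuits the composite entirely, which is why the ruled-out cards land in "avoid" regardless of their raw scores.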

Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) · Help us measure this — submit a benchmark for NVIDIA L4
Top pick
NVIDIA · 48 GB · ~$2,400
Operator-grade

NVIDIA RTX 4090 48GB (China-mod)

Top pick for your setup. With your $2,500 budget on Linux for coding agents, the NVIDIA RTX 4090 48GB (China-mod) ranks here because with 48 GB and ~1008 GB/s memory bandwidth, this clears the VRAM bar for coding agents comfortably.

Realistic model class: 70B Q4 territory
Expected throughput: 20-40 tok/s on 70B Q4 single-stream; 50-90 tok/s on 32B FP16.
Evidence: live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA RTX 4090 48GB (China-mod) →
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

  • VRAM × workload · weight 22% · 98 (Excellent)
  • Budget fit · weight 18% · 85 (Strong)
  • OS compatibility · weight 16% · 100 (Excellent)
  • Skill match · weight 10% · 95 (Excellent)
  • Power headroom · weight 8% · 25 (Weak)
  • Multi-GPU path · weight 8% · 80 (Strong)
  • Thermal / noise · weight 6% · 95 (Excellent)
  • Gaming alignment · weight 6% · 85 (Strong)
  • Perf-per-watt · weight 6% · 65 (Good)

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Caveats
  • Sustained ~450W — plan for a 1000W+ PSU and adequate case airflow.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) · Help us measure this — submit a benchmark for NVIDIA RTX 4090 48GB (China-mod)
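The "48 GB and ~1008 GB/s" rationale above rests on a standard rule of thumb: single-stream decode is roughly memory-bandwidth-bound, because every generated token streams the full weight set once. A rough sketch of that estimate; the efficiency factor and bytes-per-parameter figure are assumptions, not measured values:

```python
def est_tok_s(bandwidth_gb_s, params_b, bytes_per_param, efficiency=0.8):
    """Bandwidth-bound decode estimate for a dense model:
    tokens/s ~ usable bandwidth / bytes of weights read per token."""
    model_gb = params_b * bytes_per_param  # weight bytes streamed per token
    return efficiency * bandwidth_gb_s / model_gb

# ~1008 GB/s card, 70B dense model at ~0.6 bytes/param (Q4_K_M-class quant),
# assuming 80% of peak bandwidth is achievable in practice:
print(round(est_tok_s(1008, 70, 0.6), 1))  # ~19.2
```

That lands near the low end of the 20-40 tok/s range quoted on the card; prompt processing, batching, and runtime overhead all move the real number around.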
Top pick
NVIDIA · 24 GB · ~$2,500
Operator-grade

NVIDIA RTX A5000

Top pick for your setup. With your $2,500 budget on Linux for coding agents, the NVIDIA RTX A5000 ranks here because 24 GB hits the workable band for coding agents — fits at sensible quants without becoming the bottleneck.

Realistic model class: Qwen 2.5 Coder 32B Q4 + 32K context
Expected throughput: 30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.
Evidence: live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA RTX A5000 →
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

  • VRAM × workload · weight 22% · 70 (Good)
  • Budget fit · weight 18% · 85 (Strong)
  • OS compatibility · weight 16% · 100 (Excellent)
  • Skill match · weight 10% · 95 (Excellent)
  • Power headroom · weight 8% · 80 (Strong)
  • Multi-GPU path · weight 8% · 80 (Strong)
  • Thermal / noise · weight 6% · 95 (Excellent)
  • Gaming alignment · weight 6% · 85 (Strong)
  • Perf-per-watt · weight 6% · 85 (Strong)

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) · Help us measure this — submit a benchmark for NVIDIA RTX A5000
Top pick
NVIDIA · 24 GB · ~$899 · Estimated (used-market price)
Operator-grade

NVIDIA GeForce RTX 3090

Top pick for your setup. With your $2,500 budget on Linux for coding agents, the NVIDIA GeForce RTX 3090 ranks here because 24 GB hits the workable band for coding agents — fits at sensible quants without becoming the bottleneck.

Sustained 450W+ — minimum 1000W Gold PSU + good airflow. Your power tolerance is moderate (350W ceiling), which this card will exceed under load.
Realistic model class: Qwen 2.5 Coder 32B Q4 + 32K context
Expected throughput: 30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.
Evidence: live data · editorial + reproduced community
Editorial: 1 benchmark · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: Low (1 cohort)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • Only 1 benchmark — below the 5-row threshold for cohort signal.
Help us measure NVIDIA GeForce RTX 3090 →
Measured throughput
top 1 of 1 on file · most recent first
  • ed · llama 3.1 8b instruct · Q4_K_M · 105.0 tok/s · 2026-05
Featured in stacks
  • Dual RTX 3090 workstation stack — 70B-class on $1,800 of used GPUs — Workstation · GPUs (2× 24GB used, the cheapest path to 48 GB total)
  • Quad RTX 3090 workstation stack — the prosumer 100B-class ceiling — Homelab · GPUs (4× 24GB used; the prosumer-ceiling stack)
Show 1 benchmark feeding this card
  • ed · #340 · llama-3.1-8b-instruct · Q4_K_M · 105.0 tok/s · 2026-05-13
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

  • VRAM × workload · weight 22% · 70 (Good)
  • Budget fit · weight 18% · 80 (Strong)
  • OS compatibility · weight 16% · 100 (Excellent)
  • Skill match · weight 10% · 95 (Excellent)
  • Power headroom · weight 8% · 80 (Strong)
  • Multi-GPU path · weight 8% · 80 (Strong)
  • Thermal / noise · weight 6% · 95 (Excellent)
  • Gaming alignment · weight 6% · 95 (Excellent)
  • Perf-per-watt · weight 6% · 85 (Strong)

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Caveats
  • Used-market only — fan/thermal-pad inspection required; new MSRP from launch is no longer the relevant price.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) · Help us measure this — submit a benchmark for NVIDIA GeForce RTX 3090
Top pick
NVIDIA · 24 GB · ~$1,899
Operator-grade

NVIDIA GeForce RTX 4090

Top pick for your setup. With your $2,500 budget on Linux for coding agents, the NVIDIA GeForce RTX 4090 ranks here because 24 GB hits the workable band for coding agents — fits at sensible quants without becoming the bottleneck.

Sustained 450W+ — minimum 1000W Gold PSU + good airflow. Your power tolerance is moderate (350W ceiling), which this card will exceed under load.
Realistic model class: Qwen 2.5 Coder 32B Q4 + 32K context
Expected throughput: 30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.
Evidence: live data · editorial + reproduced community
Editorial: 6 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: Low (6 cohorts)
Measured throughput
top 3 of 6 on file · most recent first
  • ed · llama 3.3 70b instruct · Q4_K_M · 8.0 tok/s · 2026-05
  • ed · llama 3.1 8b instruct · Q4_K_M · 150.0 tok/s · 2026-05
  • ed · deepseek r1 distill qwen 32b · AWQ-INT4 · 32.5 tok/s · 2026-05
3 additional measurements below in the full breakdown.
Featured in stacks
  • Build a local coding-agent stack (May 2026) — Workstation · GPU (where the model runs)
  • Build an RTX 4090 AI workstation stack (May 2026) — Workstation · GPU (the hardware that defines this stack)
Show 6 benchmarks feeding this card
  • ed · #344 · llama-3.3-70b-instruct · Q4_K_M · 8.0 tok/s · 2026-05-13
  • ed · #339 · llama-3.1-8b-instruct · Q4_K_M · 150.0 tok/s · 2026-05-13
  • ed · #336 · deepseek-r1-distill-qwen-32b · AWQ-INT4 · 32.5 tok/s · 2026-05-06
  • ed · #329 · qwen-3-32b · AWQ-INT4 · 36.5 tok/s · 2026-05-03
  • ed · #328 · qwen-2.5-coder-32b-instruct · AWQ-INT4 · 38.2 tok/s · 2026-05-03
  • ed · #327 · llama-3.3-70b-instruct · Q4_K_M · 14.8 tok/s · 2026-05-02
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

  • VRAM × workload · weight 22% · 73 (Good)
  • Budget fit · weight 18% · 95 (Excellent)
  • OS compatibility · weight 16% · 100 (Excellent)
  • Skill match · weight 10% · 95 (Excellent)
  • Power headroom · weight 8% · 25 (Weak)
  • Multi-GPU path · weight 8% · 80 (Strong)
  • Thermal / noise · weight 6% · 95 (Excellent)
  • Gaming alignment · weight 6% · 95 (Excellent)
  • Perf-per-watt · weight 6% · 65 (Good)

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Caveats
  • Sustained ~450W — plan for a 1000W+ PSU and adequate case airflow.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) · Help us measure this — submit a benchmark for NVIDIA GeForce RTX 4090
Top pick
NVIDIA · 24 GB
Operator-grade

NVIDIA GeForce RTX 5090 Mobile

Top pick for your setup. With your $2,500 budget on Linux for coding agents, the NVIDIA GeForce RTX 5090 Mobile ranks here because 24 GB hits the workable band for coding agents — fits at sensible quants without becoming the bottleneck.

Realistic model class: Qwen 2.5 Coder 32B Q4 + 32K context
Expected throughput: 30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.
Evidence: live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA GeForce RTX 5090 Mobile →
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

  • VRAM × workload · weight 22% · 70 (Good)
  • Budget fit · weight 18% · 50 (Acceptable)
  • OS compatibility · weight 16% · 100 (Excellent)
  • Skill match · weight 10% · 95 (Excellent)
  • Power headroom · weight 8% · 95 (Excellent)
  • Multi-GPU path · weight 8% · 80 (Strong)
  • Thermal / noise · weight 6% · 95 (Excellent)
  • Gaming alignment · weight 6% · 95 (Excellent)
  • Perf-per-watt · weight 6% · 95 (Excellent)

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) · Help us measure this — submit a benchmark for NVIDIA GeForce RTX 5090 Mobile
Top pick
NVIDIA · 32 GB · ~$2,499
Operator-grade

NVIDIA GeForce RTX 5090

Top pick for your setup. With your $2,500 budget on Linux for coding agents, the NVIDIA GeForce RTX 5090 ranks here because with 32 GB and ~1792 GB/s memory bandwidth, this clears the VRAM bar for coding agents comfortably.

Sustained 450W+ — minimum 1000W Gold PSU + good airflow. Your power tolerance is moderate (350W ceiling), which this card will exceed under load.
Realistic model class: Qwen 2.5 Coder 32B Q5, 70B Q3 reach
Expected throughput: 40-70 tok/s on 32B Q4; 15-25 tok/s on 70B Q3 with offload.
Evidence: live data · editorial + reproduced community
Editorial: 1 benchmark · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: Low (1 cohort)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • Only 1 benchmark — below the 5-row threshold for cohort signal.
Help us measure NVIDIA GeForce RTX 5090 →
Measured throughput
top 1 of 1 on file · most recent first
  • ed · llama 3.1 8b instruct · Q4_K_M · 195.0 tok/s · 2026-05
Pending benchmark opportunities (all →)
  • Single RTX 5090 + Qwen 3 Coder 32B (vLLM, AWQ-INT4) · critical
Show 1 benchmark feeding this card
  • ed · #341 · llama-3.1-8b-instruct · Q4_K_M · 195.0 tok/s · 2026-05-13
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

  • VRAM × workload · weight 22% · 93 (Excellent)
  • Budget fit · weight 18% · 85 (Strong)
  • OS compatibility · weight 16% · 100 (Excellent)
  • Skill match · weight 10% · 95 (Excellent)
  • Power headroom · weight 8% · 5 (Poor)
  • Multi-GPU path · weight 8% · 80 (Strong)
  • Thermal / noise · weight 6% · 75 (Strong)
  • Gaming alignment · weight 6% · 95 (Excellent)
  • Perf-per-watt · weight 6% · 40 (Acceptable)

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Caveats
  • Sustained ~575W — plan for a 1000W+ PSU and adequate case airflow.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) · Help us measure this — submit a benchmark for NVIDIA GeForce RTX 5090
Top pick
NVIDIA · 24 GB · ~$1,199 · Estimated (used-market price)
Operator-grade

NVIDIA GeForce RTX 3090 Ti

Top pick for your setup. With your $2,500 budget on Linux for coding agents, the NVIDIA GeForce RTX 3090 Ti ranks here because 24 GB hits the workable band for coding agents — fits at sensible quants without becoming the bottleneck.

Sustained 450W+ — minimum 1000W Gold PSU + good airflow. Your power tolerance is moderate (350W ceiling), which this card will exceed under load.
Realistic model class: Qwen 2.5 Coder 32B Q4 + 32K context
Expected throughput: 30-60 tok/s on 32B Q4 single-stream; 80-130 tok/s on 13B Q4.
Evidence: live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA GeForce RTX 3090 Ti →
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

  • VRAM × workload · weight 22% · 73 (Good)
  • Budget fit · weight 18% · 80 (Strong)
  • OS compatibility · weight 16% · 100 (Excellent)
  • Skill match · weight 10% · 95 (Excellent)
  • Power headroom · weight 8% · 25 (Weak)
  • Multi-GPU path · weight 8% · 80 (Strong)
  • Thermal / noise · weight 6% · 95 (Excellent)
  • Gaming alignment · weight 6% · 95 (Excellent)
  • Perf-per-watt · weight 6% · 65 (Good)

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Caveats
  • Sustained ~450W — plan for a 1000W+ PSU and adequate case airflow.
  • Used-market only — fan/thermal-pad inspection required; new MSRP from launch is no longer the relevant price.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) · Help us measure this — submit a benchmark for NVIDIA GeForce RTX 3090 Ti
Top pick
NVIDIA · 16 GB · ~$1,099
Operator-grade

NVIDIA GeForce RTX 4080

Top pick for your setup. With your $2,500 budget on Linux for coding agents, the NVIDIA GeForce RTX 4080 sits in this tier on a balance of capability, OS compat, power, and budget fit.

Realistic model class: Qwen 2.5 Coder 14B FP16, agents OK
Expected throughput: 40-70 tok/s on 7B Q4; 20-35 tok/s on 13B Q4.
Evidence: live data · editorial + reproduced community
Editorial: 0 benchmarks · Reproduced: 0 community · Stale (>18 mo): 0 rows
Cohort confidence: — (none)
Needs measurement
This recommendation is rule-based, not evidence-backed yet.
  • No benchmarks on file for this hardware.
Help us measure NVIDIA GeForce RTX 4080 →
How we scored this card

Each dimension is a 0-100 score. The card's position in the ranking is the weighted sum — but we surface tiers, not raw numbers. Bars are sorted by weight (most-influential first).

  • VRAM × workload · weight 22% · 33 (Weak)
  • Budget fit · weight 18% · 80 (Strong)
  • OS compatibility · weight 16% · 100 (Excellent)
  • Skill match · weight 10% · 95 (Excellent)
  • Power headroom · weight 8% · 80 (Strong)
  • Multi-GPU path · weight 8% · 80 (Strong)
  • Thermal / noise · weight 6% · 95 (Excellent)
  • Gaming alignment · weight 6% · 90 (Excellent)
  • Perf-per-watt · weight 6% · 85 (Strong)

Tier mapping: top ≥ 75 composite · alternate 60-74 · acceptable 40-59 · avoid < 40 or over-budget / incompatible.

Caveats
  • 16 GB is below the comfortable VRAM minimum for coding agents — expect quant downgrades or very tight context windows.
Try in custom builder → · See model-fit table · Recommended runtime: ollama
Estimated (rule-based scoring) · Help us measure this — submit a benchmark for NVIDIA GeForce RTX 4080
Why we ruled these out
Over-budget or fundamentally incompatible — listed for the upgrade-path conversation
  • NVIDIA RTX A6000 (Ampere) — Out of budget for this query.
    ~$3,500
  • NVIDIA H100 PCIe — Out of budget for this query.
    ~$25,000
  • NVIDIA RTX 5000 Ada Generation — Out of budget for this query.
    ~$4,000

Where to go from here

Live GPU price tracker →

Multi-store, multi-region prices for every card here. US/EU/UK/CA/AU — see what these cards actually cost in your region before you buy.

Stack Builder →

One step further: this card + runtime + 1-3 models + cost rollup + ready-to-paste install script. Eight inputs → full rig.

Custom build engine →

Once you’ve picked a card, model the full build (CPU, RAM, runtime) for which models fit comfortably.
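"Which models fit comfortably" is, at its core, a VRAM budget. A back-of-envelope sketch, assuming generic dense-transformer memory math (quantized weights, a GQA KV cache, and a runtime overhead pad); every model figure below is a hypothetical example, not this site's formula:

```python
def vram_needed_gb(params_b, bytes_per_param, layers, kv_heads, head_dim,
                   kv_bytes, context_tokens, overhead_gb=1.5):
    """Weights + KV cache + overhead, in GB (dense model, GQA attention)."""
    weights = params_b * bytes_per_param
    # K and V, per layer, per KV head, per head dim, per cached token:
    kv_cache = 2 * layers * kv_heads * head_dim * kv_bytes * context_tokens / 1e9
    return weights + kv_cache + overhead_gb

# Hypothetical 32B-class model (64 layers, 8 KV heads, head_dim 128),
# Q4-class weights (~0.55 bytes/param), 8-bit KV cache, 32K context:
need = vram_needed_gb(32, 0.55, 64, 8, 128, 1, 32_768)
print(round(need, 1))  # ~23.4
```

At roughly 23.4 GB this would be tight but workable on a 24 GB card, which is why the 32B Q4 + 32K pairing shows up as the "workable band" in the cards above; FP16 KV cache or a heavier quant pushes it over.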

GPU buying guide 2026 →

The long-form essay version: VRAM tiers, MoE math, NVLink truth, used-market price discipline.

Hardware combinations →

Curated multi-GPU and Apple-cluster setups with effective-VRAM math you can trust.

Scoring methodology →

How the trust layer behind these recommendations actually works — every dimension, every formula, the honest limits.

Cohort coverage report →

Where the intelligence graph has signal vs which model × hardware × quant cohorts are still underpowered.

Reproduce a benchmark →

Help tip a cohort across the 5-row threshold for outlier detection — the most operator-impactful contribution.