RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts: there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend.

© 2026 runlocalai.co · Independently operated
GPU hierarchy for local AI

Every GPU ranked for local AI inference

One screen. Every catalog GPU sorted by tier, with estimated tok/s for the four canonical model sizes (7B, 14B, 32B, 70B at Q4_K_M). Measurements where we have them, bandwidth-derived estimates where we don't — every cell labeled so you know what you're reading. Methodology at /methodology.

  • A-tier: 5 GPUs (Consumer flagship)
  • B-tier: 13 GPUs (Enthusiast)
  • C-tier: 22 GPUs (Mid-range)
  • D-tier: 42 GPUs (Budget)
  • M-tier: 8 GPUs (Mobile)
  • S-tier: 38 GPUs (Workstation / Datacenter)
  • E-tier: 21 GPUs (Apple Silicon / Edge)
Tier | GPU | Vendor · Year | VRAM | BW (GB/s) | Price | 7B Q4 | 14B Q4 | 32B Q4 | 70B Q4 | Rating
A | Apple Mac Studio (M3 Ultra) | apple · 2025 | 0 GB | 800 | $4,999 | ? | ? | ? | ? | 10.0
A | MacBook Pro 16" M4 Max | apple · 2024 | 0 GB | 546 | $3,999 | ? | ? | ? | ? | 10.0
A | NVIDIA GeForce RTX 5090 | nvidia · 2025 | 32 GB | 1792 | $2,499 | 199–279 | 105–148 | 46–64 | — | 9.6
A | NVIDIA GeForce RTX 4090 | nvidia · 2022 | 24 GB | 1008 | $1,899 | 112–157 | 59–83 | 26–36 | — | 9.4
A | Apple Mac Studio (M4 Max) | apple · 2025 | ? GB | 546 | $1,999 | ? | ? | ? | ? | 8.7
B | Lenovo Legion 5 Pro Gen 7 (RTX 3080 16GB) | nvidia · 2022 | 16 GB | ? | $1,499 | ? | ? | ? | ? | 9.3
B | Apple Mac Mini (M4 Pro) | apple · 2024 | ? GB | 273 | $1,399 | ? | ? | ? | ? | 8.9
B | NVIDIA GeForce RTX 3090 Ti | nvidia · 2022 | 24 GB | 1008 | $1,199 | 112–157 | 59–83 | 26–36 | — | 8.8
B | NVIDIA GeForce RTX 3090 | nvidia · 2020 | 24 GB | 936 | $899 | 104–146 | 55–77 | 24–34 | — | 8.5
B | NVIDIA GeForce RTX 5080 | nvidia · 2025 | 16 GB | 960 | $1,199 | 136● | 82● | — | — | 8.1
B | NVIDIA GeForce RTX 5070 Ti | nvidia · 2025 | 16 GB | 896 | $849 | 100–139 | 53–74 | — | — | 8.1
B | NVIDIA GeForce RTX 4070 Ti Super | nvidia · 2024 | 16 GB | 672 | $829 | 75–105 | 40–55 | — | — | 8.1
B | Apple MacBook Air (M4) | apple · 2025 | ? GB | 120 | $999 | ? | ? | ? | ? | 8.0
B | AMD Radeon RX 7900 XTX | amd · 2022 | 24 GB | 960 | $899 | 107–149 | 56–79 | 25–34 | — | 7.8
B | NVIDIA GeForce RTX 4080 | nvidia · 2022 | 16 GB | 717 | $1,099 | 80–112 | 42–59 | — | — | 7.8
B | NVIDIA GeForce RTX 3080 12GB | nvidia · 2022 | 12 GB | 912 | $449 | 101–142 | 54–75 | — | — | 7.3
B | NVIDIA GeForce RTX 3080 Ti | nvidia · 2021 | 12 GB | 912 | $480 | 101–142 | 54–75 | — | — | 7.3
B | NVIDIA GeForce RTX 4080 Super | nvidia · 2024 | 16 GB | 736 | $1,099 | 82–114 | 43–61 | — | — | 7.2
C | Apple Mac Mini (M4) | apple · 2024 | ? GB | 120 | $599 | ? | ? | ? | ? | 8.4
C | AMD Radeon RX 7900 XT | amd · 2022 | 20 GB | 800 | $729 | 89–124 | 47–66 | — | — | 8.1
C | NVIDIA GeForce RTX 5060 Ti 16GB | nvidia · 2025 | 16 GB | 448 | $459 | 50–70 | 26–37 | — | — | 8.1
C | AMD Radeon RX 9070 XT | amd · 2025 | 16 GB | 644 | $649 | 72–100 | 38–53 | — | — | 7.9
C | AMD Radeon RX 9070 | amd · 2025 | 16 GB | 624 | $569 | 69–97 | 37–51 | — | — | 7.9
C | AMD Radeon RX 7900 GRE | amd · 2024 | 16 GB | 576 | $549 | 64–90 | 34–47 | — | — | 7.9
C | NVIDIA GeForce RTX 4060 Ti 16GB | nvidia · 2023 | 16 GB | 288 | $449 | 32–45 | 17–24 | — | — | 7.8
C | NVIDIA GeForce RTX 5070 | nvidia · 2025 | 12 GB | 672 | $599 | 75–105 | 40–55 | — | — | 7.6
C | AMD Radeon RX 7800 XT | amd · 2023 | 16 GB | 624 | $459 | 69–97 | 37–51 | — | — | 7.6
C | AMD Radeon RX 6950 XT | amd · 2022 | 16 GB | 576 | $580 | 64–90 | 34–47 | — | — | 7.6
C | NVIDIA GeForce RTX 4070 Super | nvidia · 2024 | 12 GB | 504 | $619 | 56–78 | 30–42 | — | — | 7.6
C | AMD Radeon RX 9060 XT | amd · 2026 | 16 GB | 640 | $449 | 71–100 | 38–53 | — | — | 7.5
C | AMD Radeon RX 6800 | amd · 2020 | 16 GB | 512 | $380 | 57–80 | 30–42 | — | — | 7.3
C | AMD Radeon RX 6800 XT | amd · 2020 | 16 GB | 512 | $450 | 57–80 | 30–42 | — | — | 7.3
C | AMD Radeon RX 6900 XT | amd · 2020 | 16 GB | 512 | $500 | 57–80 | 30–42 | — | — | 7.3
C | NVIDIA GeForce RTX 4070 | nvidia · 2023 | 12 GB | 504 | $549 | 56–78 | 30–42 | — | — | 7.3
C | NVIDIA GeForce RTX 4070 Ti | nvidia · 2023 | 12 GB | 504 | $749 | 56–78 | 30–42 | — | — | 7.3
C | AMD Radeon RX 9070 GRE | amd · 2026 | 12 GB | 432 | $549 | 48–67 | 25–36 | — | — | 7.0
C | NVIDIA GeForce RTX 2080 Ti | nvidia · 2018 | 11 GB | 616 | $380 | 68–96 | 36–51 | — | — | 6.6
C | NVIDIA GeForce RTX 3080 10GB | nvidia · 2020 | 10 GB | 760 | $379 | 84–118 | — | — | — | 6.5
C | Intel Arc A770 16GB | intel · 2022 | 16 GB | 559 | $269 | 62–87 | 33–46 | — | — | 6.5
C | NVIDIA GeForce RTX 3070 Ti | nvidia · 2021 | 8 GB | 608 | $350 | 68–95 | — | — | — | 5.0
D | AMD Radeon RX 7600 XT | amd · 2024 | 16 GB | 288 | $309 | 32–45 | 17–24 | — | — | 7.9
D | AMD Radeon RX 6750 XT | amd · 2022 | 12 GB | 432 | $320 | 48–67 | 25–36 | — | — | 7.1
D | AMD Radeon RX 7700 XT | amd · 2023 | 12 GB | 432 | $379 | 48–67 | 25–36 | — | — | 7.1
D | NVIDIA GeForce RTX 3060 12GB | nvidia · 2021 | 12 GB | 360 | $249 | 40–56 | 21–30 | — | — | 7.0
D | AMD Radeon RX 6700 XT | amd · 2021 | 12 GB | 384 | $280 | 43–60 | 23–32 | — | — | 6.8
D | NVIDIA GeForce GTX 1080 Ti | nvidia · 2017 | 11 GB | 484 | $250 | 54–75 | 28–40 | — | — | 6.6
D | NVIDIA GeForce RTX 5050 | nvidia · 2025 | 8 GB | 320 | $249 | 36–50 | — | — | — | 6.4
D | Intel Arc B580 | intel · 2024 | 12 GB | 456 | $269 | 51–71 | 27–38 | — | — | 6.3
D | Intel Arc B570 | intel · 2025 | 10 GB | 380 | $219 | 42–59 | — | — | — | 5.8
D | NVIDIA GeForce RTX 5060 | nvidia · 2025 | 8 GB | 448 | $299 | 50–70 | — | — | — | 5.6
D | NVIDIA GeForce RTX 5060 Ti 8GB | nvidia · 2025 | 8 GB | 448 | $379 | 50–70 | — | — | — | 5.6
D | NVIDIA GeForce RTX 4060 Ti 8GB | nvidia · 2023 | 8 GB | 288 | $369 | 32–45 | — | — | — | 5.3
D | NVIDIA GeForce RTX 4060 | nvidia · 2023 | 8 GB | 272 | $279 | 30–42 | — | — | — | 5.3
D | NVIDIA GeForce RTX 3050 | nvidia · 2022 | 8 GB | 224 | $200 | 25–35 | — | — | — | 5.3
D | NVIDIA GeForce RTX 2080 Super | nvidia · 2019 | 8 GB | 496 | $320 | 55–77 | — | — | — | 5.1
D | NVIDIA GeForce RTX 2070 | nvidia · 2018 | 8 GB | 448 | $240 | 50–70 | — | — | — | 5.1
D | AMD Radeon RX 6650 XT | amd · 2022 | 8 GB | 280 | $230 | 31–44 | — | — | — | 5.1
D | NVIDIA GeForce GTX 1070 Ti | nvidia · 2017 | 8 GB | 256 | $160 | 28–40 | — | — | — | 5.1
D | NVIDIA GeForce RTX 3060 Ti | nvidia · 2020 | 8 GB | 448 | $280 | 50–70 | — | — | — | 5.0
D | NVIDIA GeForce RTX 3070 | nvidia · 2020 | 8 GB | 448 | $269 | 50–70 | — | — | — | 5.0
D | NVIDIA GeForce RTX 2060 Super | nvidia · 2019 | 8 GB | 448 | $220 | 50–70 | — | — | — | 4.8
D | NVIDIA GeForce RTX 2070 Super | nvidia · 2019 | 8 GB | 448 | $280 | 50–70 | — | — | — | 4.8
D | AMD Radeon RX 6600 XT | amd · 2021 | 8 GB | 256 | $200 | 28–40 | — | — | — | 4.8
D | AMD Radeon RX 6600 | amd · 2021 | 8 GB | 224 | $180 | 25–35 | — | — | — | 4.8
D | NVIDIA GeForce GTX 1080 | nvidia · 2016 | 8 GB | 320 | $180 | 36–50 | — | — | — | 4.6
D | NVIDIA GeForce GTX 1070 | nvidia · 2016 | 8 GB | 256 | $140 | 28–40 | — | — | — | 4.6
D | AMD Radeon RX 580 8GB | amd · 2017 | 8 GB | 256 | $80 | 28–40 | — | — | — | 3.8
D | AMD Radeon RX 5700 XT | amd · 2019 | 8 GB | 448 | $200 | 50–70 | — | — | — | 3.5
D | AMD Radeon RX 5500 XT 8GB | amd · 2019 | 8 GB | 224 | $110 | 25–35 | — | — | — | 3.5
D | NVIDIA GeForce GTX 1660 Super | nvidia · 2019 | 6 GB | 336 | $150 | 37–52 | — | — | — | 2.8
D | NVIDIA GeForce RTX 2060 | nvidia · 2019 | 6 GB | 336 | $180 | 37–52 | — | — | — | 2.8
D | NVIDIA GeForce GTX 1660 Ti | nvidia · 2019 | 6 GB | 288 | $160 | 32–45 | — | — | — | 2.8
D | NVIDIA GeForce GTX 1660 | nvidia · 2019 | 6 GB | 192 | $130 | 21–30 | — | — | — | 2.8
D | NVIDIA GeForce GTX 1060 6GB | nvidia · 2016 | 6 GB | 192 | $110 | 21–30 | — | — | — | 2.6
D | AMD Radeon 880M (Strix Point iGPU) | amd · 2024 | 0 GB | 102 | ? | ? | ? | ? | ? | 2.4
D | AMD Radeon 780M (Phoenix iGPU) | amd · 2023 | 0 GB | 89 | ? | ? | ? | ? | ? | 2.1
D | NVIDIA GeForce GTX 1650 Super | nvidia · 2019 | 4 GB | 192 | $140 | — | — | — | — | 1.8
D | NVIDIA GeForce GTX 1650 | nvidia · 2019 | 4 GB | 128 | $130 | — | — | — | — | 1.8
D | AMD Radeon RX 5600 XT | amd · 2020 | 6 GB | 336 | $140 | 37–52 | — | — | — | 1.7
D | NVIDIA GeForce GTX 1050 Ti | nvidia · 2016 | 4 GB | 112 | $90 | — | — | — | — | 1.3
D | NVIDIA GeForce GTX 1060 3GB | nvidia · 2016 | 3 GB | 192 | $70 | — | — | — | — | 1.1
D | AMD Radeon RX 570 | amd · 2017 | 4 GB | 224 | $60 | — | — | — | — | 1.0
M | ASUS ROG Strix Scar 18 (RTX 5090 Mobile) | nvidia · 2025 | 24 GB | ? | $3,999 | ? | ? | ? | ? | 9.6
M | Razer Blade 16 (2025, RTX 5090 Mobile) | nvidia · 2025 | 24 GB | ? | $4,499 | ? | ? | ? | ? | 9.6
M | Framework Laptop 16 (RX 7700S) | amd · 2024 | 8 GB | ? | $1,699 | ? | ? | ? | ? | 8.9
M | NVIDIA GeForce RTX 3080 16GB (Mobile) | nvidia · 2022 | 16 GB | 512 | ? | 190● | 43● | — | — | 8.8
M | NVIDIA GeForce RTX 5090 Mobile | nvidia · 2025 | 24 GB | 896 | ? | 100–139 | 53–74 | 23–32 | — | 8.6
M | NVIDIA GeForce RTX 4090 Mobile | nvidia · 2023 | 16 GB | 576 | ? | 64–90 | 34–47 | — | — | 7.3
M | NVIDIA GeForce RTX 5070 Laptop GPU | nvidia · 2025 | 12 GB | 384 | ? | 43–60 | 23–32 | — | — | 7.1
M | NVIDIA GeForce RTX 3050 Ti (Mobile) | nvidia · 2021 | 4 GB | 192 | ? | — | — | — | — | 1.5
S | AMD Instinct MI355X | amd · 2025 | 288 GB | 8000 | $25,000 | 889–1244 | 471–659 | 205–287 | 95–133 | 10.0
S | NVIDIA B200 | nvidia · 2024 | 192 GB | 8000 | $40,000 | 889–1244 | 471–659 | 205–287 | 95–133 | 10.0
S | NVIDIA GB200 NVL72 | nvidia · 2024 | 13824 GB | 8000 | ? | 889–1244 | 471–659 | 205–287 | 95–133 | 10.0
S | AMD Instinct MI325X | amd · 2024 | 256 GB | 6000 | $20,000 | 667–933 | 353–494 | 154–215 | 71–100 | 10.0
S | AMD Instinct MI300X | amd · 2023 | 192 GB | 5325 | $15,000 | 592–828 | 313–439 | 137–191 | 63–89 | 10.0
S | AMD Instinct MI300A (APU) | amd · 2023 | 128 GB | 5300 | ? | 589–824 | 312–436 | 136–190 | 63–88 | 10.0
S | NVIDIA H200 | nvidia · 2024 | 141 GB | 4800 | $31,000 | 533–747 | 282–395 | 123–172 | 57–80 | 10.0
S | NVIDIA H100 NVL | nvidia · 2023 | 188 GB | 3938 | $60,000 | 438–613 | 232–324 | 101–141 | 47–66 | 10.0
S | NVIDIA H100 SXM | nvidia · 2022 | 80 GB | 3350 | $30,000 | 372–521 | 197–276 | 86–120 | 40–56 | 10.0
S | NVIDIA H100 PCIe | nvidia · 2022 | 80 GB | 2039 | $25,000 | 227–317 | 120–168 | 52–73 | 24–34 | 10.0
S | NVIDIA RTX PRO 6000 Blackwell | nvidia · 2025 | 96 GB | 1792 | $8,999 | 199–279 | 105–148 | 46–64 | 21–30 | 10.0
S | NVIDIA RTX 6000 Ada Generation | nvidia · 2022 | 48 GB | 960 | $6,499 | 107–149 | 56–79 | 25–34 | 11–16 | 10.0
S | NVIDIA L40 | nvidia · 2022 | 48 GB | 864 | $8,000 | 96–134 | 51–71 | 22–31 | 10–14 | 10.0
S | NVIDIA L40S | nvidia · 2023 | 48 GB | 864 | $8,500 | 96–134 | 51–71 | 22–31 | 10–14 | 10.0
S | NVIDIA DGX Spark (Project Digits) | nvidia · 2025 | 0 GB | ? | $3,000 | ? | ? | ? | ? | 10.0
S | AMD Instinct MI210 | amd · 2022 | 64 GB | 1638 | $8,500 | 182–255 | 96–135 | 42–59 | 20–27 | 9.8
S | AMD Instinct MI250X | amd · 2021 | 128 GB | 3277 | $13,000 | 364–510 | 193–270 | 84–118 | 39–55 | 9.7
S | NVIDIA A100 80GB SXM | nvidia · 2020 | 80 GB | 2039 | $17,000 | 227–317 | 120–168 | 52–73 | 24–34 | 9.7
S | NVIDIA RTX A6000 (Ampere) | nvidia · 2020 | 48 GB | 768 | $3,500 | 85–119 | 45–63 | 20–28 | 9–13 | 9.7
S | NVIDIA A40 | nvidia · 2020 | 48 GB | 696 | $5,500 | 77–108 | 41–57 | 18–25 | 8–12 | 9.7
S | NVIDIA RTX 5000 Ada Generation | nvidia · 2023 | 32 GB | 576 | $4,000 | 64–90 | 34–47 | 15–21 | — | 9.5
S | NVIDIA B300 (Blackwell Ultra) | nvidia · 2025 | 288 GB | 8000 | ? | 889–1244 | 471–659 | 205–287 | 95–133 | 9.2
S | NVIDIA A100 40GB | nvidia · 2020 | 40 GB | 1555 | $11,000 | 173–242 | 91–128 | 40–56 | — | 9.2
S | NVIDIA L4 | nvidia · 2023 | 24 GB | 300 | $2,500 | 33–47 | 18–25 | 8–11 | — | 9.0
S | NVIDIA RTX A5000 | nvidia · 2021 | 24 GB | 768 | $2,500 | 85–119 | 45–63 | 20–28 | — | 8.7
S | NVIDIA RTX 5000 PRO Blackwell 48GB | nvidia · 2026 | 48 GB | 960 | $5,499 | 107–149 | 56–79 | 25–34 | 11–16 | 8.5
S | AMD Instinct MI350X | amd · 2025 | 288 GB | 8000 | ? | 889–1244 | 471–659 | 205–287 | 95–133 | 8.3
S | Intel Gaudi 3 | intel · 2024 | 128 GB | 3700 | $18,000 | 411–576 | 218–305 | 95–133 | 44–62 | 8.2
S | Framework Desktop (Ryzen AI Max+ 395) | amd · 2025 | ? GB | 256 | $1,999 | ? | ? | ? | ? | 8.2
S | ASUS Ascent GX10 (NVIDIA GB10) | nvidia · 2025 | ? GB | 273 | $2,999 | ? | ? | ? | ? | 8.1
S | GMKtec EVO-X2 (Ryzen AI Max+ 395) | amd · 2025 | ? GB | 256 | $1,499 | ? | ? | ? | ? | 8.0
S | Intel Gaudi 2 | intel · 2022 | 96 GB | 2450 | $8,000 | 272–381 | 144–202 | 63–88 | 29–41 | 7.9
S | HP ZBook Ultra G1a (Ryzen AI Max+ PRO 395) | amd · 2025 | ? GB | 256 | $3,999 | ? | ? | ? | ? | 7.8
S | AMD EPYC 9005 (Zen 5, Turin) | amd · 2024 | ? GB | 614 | ? | ? | ? | ? | ? | 7.7
S | Intel Arc Pro B60 24GB | intel · 2025 | 24 GB | 456 | $599 | 51–71 | 27–38 | 12–16 | — | 7.6
S | NVIDIA RTX PRO 4500 Blackwell | nvidia · 2025 | 32 GB | 896 | $2,600 | 100–139 | 53–74 | 23–32 | — | 7.5
S | NVIDIA H20 (96GB) | nvidia · 2024 | 96 GB | 4000 | ? | 444–622 | 235–329 | 103–144 | 48–67 | 7.4
S | NVIDIA RTX PRO 4000 Blackwell | nvidia · 2025 | 24 GB | 672 | $1,500 | 75–105 | 40–55 | 17–24 | — | 7.3
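E | Apple M4 Ultra | apple · 2025 | 0 GB | 1100 | ? | ? | ? | ? | ? | 10.0
E | Apple M3 Ultra | apple · 2025 | 0 GB | 800 | ? | ? | ? | ? | ? | 10.0
E | Apple M4 Max | apple · 2024 | 0 GB | 546 | ? | ? | ? | ? | ? | 10.0
E | Apple M4 Pro | apple · 2024 | 0 GB | 273 | ? | ? | ? | ? | ? | 10.0
E | Apple M1 Ultra | apple · 2022 | 0 GB | 800 | ? | ? | ? | ? | ? | 9.9
E | Apple M2 Ultra | apple · 2023 | 0 GB | 800 | ? | ? | ? | ? | ? | 9.9
E | Apple M2 Max | apple · 2023 | 0 GB | 400 | ? | ? | ? | ? | ? | 9.7
E | Apple M1 Max | apple · 2021 | 0 GB | 400 | ? | ? | ? | ? | ? | 8.9
E | Apple M3 Max | apple · 2023 | 0 GB | 400 | ? | ? | ? | ? | ? | 8.5
E | Qualcomm Snapdragon X Elite | qualcomm · 2024 | 0 GB | ? | ? | ? | ? | ? | ? | 7.3
E | Qualcomm Snapdragon X2 Elite | qualcomm · 2026 | ? GB | ? | ? | ? | ? | ? | ? | 6.9
E | Intel Core Ultra 300 (Panther Lake) | intel · 2026 | ? GB | ? | ? | ? | ? | ? | ? | 6.8
E | Qualcomm Snapdragon X Plus | qualcomm · 2024 | 0 GB | ? | ? | ? | ? | ? | ? | 5.8
E | Qualcomm Snapdragon 8 Elite | qualcomm · 2024 | 0 GB | 90 | ? | ? | ? | ? | ? | 5.3
E | Apple M4 (iPad Pro) | apple · 2024 | 0 GB | 120 | ? | ? | ? | ? | ? | 5.0
E | Apple A18 Pro | apple · 2024 | 0 GB | 60 | ? | ? | ? | ? | ? | 5.0
E | Google Tensor G4 | google · 2024 | 0 GB | 60 | ? | ? | ? | ? | ? | 4.8
E | Apple A17 Pro | apple · 2023 | 0 GB | 51 | ? | ? | ? | ? | ? | 4.7
E | Qualcomm Snapdragon 8 Gen 3 | qualcomm · 2023 | 0 GB | 77 | ? | ? | ? | ? | ? | 4.5
E | AMD Ryzen AI 9 HX 370 (Strix Point) | amd · 2024 | 0 GB | 90 | $1,599 | ? | ? | ? | ? | 3.9
E | Intel Core Ultra 7 258V (Lunar Lake) | intel · 2024 | 0 GB | 136 | $1,199 | ? | ? | ? | ? | 3.8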
How to read this table
  • 79● : Measured tok/s (operator or community)
  • 130–180 : Estimated from bandwidth (50-70% efficiency)
  • — : Model doesn't fit at this card's VRAM

Estimates use the formula tok/s ≈ memory_bandwidth_GBps ÷ model_weights_GB × efficiency — the dominant constraint for autoregressive decode. The 50-70% efficiency band reflects realistic Ollama / llama.cpp / vLLM runtime overhead. See /methodology for the full derivation.
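
As a rough illustration, the bandwidth formula can be sketched in Python. The weight sizes below are assumptions back-solved from the ranges in the table (they are not stated on this page), and `estimate_tok_s` is a hypothetical helper name, not part of any tool mentioned here.

```python
# Assumed Q4_K_M weight sizes in GB, back-solved from the table's ranges.
# These are illustrative assumptions, not figures published on this page.
ASSUMED_WEIGHTS_GB = {"7B": 4.5, "14B": 8.5, "32B": 19.5, "70B": 42.0}


def estimate_tok_s(bandwidth_gbps, weights_gb, efficiency=(0.5, 0.7)):
    """Return (low, high) decode tok/s for the given efficiency band.

    tok/s ~= memory_bandwidth_GBps / model_weights_GB * efficiency
    """
    low_eff, high_eff = efficiency
    per_pass = bandwidth_gbps / weights_gb  # full weight reads per second
    return per_pass * low_eff, per_pass * high_eff


# Example: RTX 5090 (1792 GB/s) on a 7B Q4 model.
low, high = estimate_tok_s(1792, ASSUMED_WEIGHTS_GB["7B"])
print(f"{low:.0f}-{high:.0f} tok/s")  # 199-279 tok/s
```

Under these assumed weight sizes the sketch matches the table's 7B cell for the RTX 5090; a measured benchmark should still take priority over any such estimate.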

Got a rig? Run a benchmark and turn an estimate into a measured cell. Every measurement improves the table for the next reader.

Best value per model tier

Lowest $/tok-s pick for each model size

For each model tier, this is the catalog card with the lowest cost per tok/s among the cards that fit the model. The figure is the current street price ÷ the midpoint of the estimated tok/s range. A change in a pick here is the live signal that prices or new hardware have shifted the value frontier.

  • 7B Q4: AMD Radeon RX 580 8GB · $80 · $2.34/tok·s · amd · 8GB · 256 GB/s
  • 14B Q4: Intel Arc A770 16GB · $269 · $6.82/tok·s · intel · 16GB · 559 GB/s
  • 32B Q4: AMD Radeon RX 7900 XTX · $899 · $30.43/tok·s · amd · 24GB · 960 GB/s
  • 70B Q4: AMD Instinct MI300X · $15,000 · $197.18/tok·s · amd · 192GB · 5325 GB/s

Caveat: $/tok-s is a derived estimate stacked on the bandwidth formula. For workloads where you have a measured benchmark in the table above, trust the measured number first; for unmeasured combinations, this is the ranked best-guess for buyer decisions.
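
The $/tok-s figures can be approximated with a short sketch. It assumes the midpoint of the 50-70% band (60% efficiency) and weight sizes back-solved from the table; both the weight sizes and the helper name `cost_per_tok_s` are illustrative assumptions rather than the site's actual implementation.

```python
# Assumed Q4_K_M weight sizes in GB (back-solved from the table, not published).
ASSUMED_WEIGHTS_GB = {"7B": 4.5, "14B": 8.5, "32B": 19.5, "70B": 42.0}
MIDPOINT_EFFICIENCY = 0.6  # midpoint of the 50-70% efficiency band


def cost_per_tok_s(price_usd, bandwidth_gbps, weights_gb):
    """Street price divided by the midpoint of the estimated tok/s range."""
    midpoint_tok_s = bandwidth_gbps / weights_gb * MIDPOINT_EFFICIENCY
    return price_usd / midpoint_tok_s


# (model size, card, price in USD, bandwidth in GB/s) from the picks above.
picks = [
    ("7B", "AMD Radeon RX 580 8GB", 80, 256),
    ("14B", "Intel Arc A770 16GB", 269, 559),
    ("32B", "AMD Radeon RX 7900 XTX", 899, 960),
    ("70B", "AMD Instinct MI300X", 15000, 5325),
]
for size, card, price, bandwidth in picks:
    value = cost_per_tok_s(price, bandwidth, ASSUMED_WEIGHTS_GB[size])
    print(f"{size} Q4: {card} ${value:.2f}/tok-s")
```

With these assumptions the sketch lands on the same four values listed above, which suggests the midpoint is taken before rounding the tok/s range.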

Choosing a GPU for your workload

The hierarchy answers "which is fastest" — but the right card for you depends on which model size you actually want to run. The four most common operator decisions:

  • 7B Q4 (autocomplete, single-model chat) — any card with ≥6 GB VRAM works. The decision shifts to price + power draw + cross-vendor preference. Top D-tier cards (Arc B580, RTX 3060) deliver useful tok/s at <$300.
  • 14B Q4 (coding assistant, mid-size chat) — ≥11 GB VRAM minimum. C-tier (RTX 4070 / RX 7800 XT) is the value sweet spot at $400-600.
  • 32B Q4 (full coding agent, multi-model) — ≥22 GB VRAM. 24 GB cards are the canonical buy: RTX 3090 used or RX 7900 XTX new (both B-tier), or the A-tier RTX 4090 if budget allows.
  • 70B Q4 (frontier-class local) — ≥48 GB VRAM. Single-card: RTX 6000 Ada / L40S / Mac Studio M3 Ultra. Multi-card: dual 3090 / dual 4090. Workstation tier or above.
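
The VRAM minimums in the list above can be expressed as a small lookup. This is a minimal sketch: `models_that_fit` is a hypothetical helper, and it checks only the stated VRAM thresholds, not context length or runtime overhead.

```python
# Minimum VRAM (GB) per model size, taken from the guidance above.
MIN_VRAM_GB = {"7B Q4": 6, "14B Q4": 11, "32B Q4": 22, "70B Q4": 48}


def models_that_fit(vram_gb):
    """Return the model sizes whose stated VRAM minimum the card meets."""
    return [model for model, need in MIN_VRAM_GB.items() if vram_gb >= need]


print(models_that_fit(24))  # ['7B Q4', '14B Q4', '32B Q4']
print(models_that_fit(10))  # ['7B Q4']
```

For a multi-card setup such as dual 3090, passing the combined VRAM (48) is only a first approximation, since splitting a model across cards adds its own overhead.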

Need it more personalized? Use /choose-my-gpu for a 9-input recommender, or /will-it-run to validate a specific model + GPU combination.
