Every catalog hardware unit ranked by composite score (0–1000): measured tok/s, VRAM fit, ecosystem support, perf-per-watt. 2 of 154 ranks anchored to a measured benchmark — the rest are honestly flagged as extrapolated or estimated.
Methodology: /methodology · Run your own: curl -fsSL runlocalai.co/bench.mjs -o bench.mjs && node bench.mjs
24 units shown · sorted by score
| # | Hardware | Tier | Score | Data |
|---|---|---|---|---|
| 1 | NVIDIA H20 (96GB) nvidia · workstation · 96GB | B | 697 | Estimated |
| 2 | NVIDIA H200 NVL (PCIe) nvidia · workstation · 141GB | B | 684 | Estimated |
| 3 | NVIDIA B200 nvidia · workstation · 192GB | B | 684 | Estimated |
| 4 | NVIDIA H200 nvidia · workstation · 141GB | B | 676 | Estimated |
| 5 | NVIDIA B300 (Blackwell Ultra) nvidia · workstation · 288GB | B | 669 | Estimated |
| 6 | NVIDIA H100 NVL nvidia · workstation · 188GB | B | 663 | Estimated |
| 7 | NVIDIA H100 PCIe nvidia · workstation · 80GB | B | 662 | Estimated |
| 8 | NVIDIA A100 80GB SXM nvidia · workstation · 80GB | B | 657 | Estimated |
| 9 | NVIDIA H100 SXM nvidia · workstation · 80GB | B | 655 | Estimated |
| 10 | NVIDIA RTX PRO 6000 Blackwell nvidia · workstation · 96GB | B | 650 | Estimated |
| 11 | NVIDIA A100 40GB nvidia · workstation · 40GB | B | 635 | Estimated |
| 12 | NVIDIA GB200 NVL72 nvidia · workstation · 13824GB | B | 631 | Estimated |
| 13 | NVIDIA GeForce RTX 5090 nvidia · enthusiast · 32GB | B | 630 | Estimated |
| 14 | NVIDIA GeForce RTX 5080 nvidia · enthusiast · 16GB · 12 bench | B | 596 | Measured |
| 15 | NVIDIA RTX 4090 48GB (China-mod) nvidia · workstation · 48GB | B | 534 | Estimated |
| 16 | NVIDIA RTX 5000 PRO Blackwell 48GB nvidia · workstation · 48GB | B | 529 | Estimated |
| 17 | NVIDIA RTX 6000 Ada Generation nvidia · workstation · 48GB | B | 529 | Estimated |
| 18 | NVIDIA GeForce RTX 3090 Ti nvidia · enthusiast · 24GB | B | 520 | Estimated |
| 19 | NVIDIA GeForce RTX 4090 nvidia · enthusiast · 24GB | B | 520 | Estimated |
| 20 | NVIDIA GeForce RTX 5090 Mobile nvidia · enthusiast · 24GB | B | 512 | Estimated |
| 21 | NVIDIA RTX PRO 4500 Blackwell nvidia · workstation · 32GB | B | 507 | Estimated |
| 22 | NVIDIA GeForce RTX 3090 nvidia · enthusiast · 24GB | B | 505 | Estimated |
| 23 | NVIDIA L40 nvidia · workstation · 48GB | B | 503 | Estimated |
| 24 | NVIDIA L40S nvidia · workstation · 48GB | B | 500 | Estimated |
Amazon search links — we may earn a small commission at no extra cost to you. How we make money.
Steady-state tok/s on a representative 7B/8B Q4 model. Measured from real benchmark rows, or extrapolated from VRAM bandwidth × runtime-stack efficiency.
How comfortably the rig holds 7B / 32B / 70B class models. Apple unified memory counts; NPU/SoC system RAM counts.
CUDA / MLX / ROCm / Vulkan reach. Real-world friction the operator hits when installing tools.
Tok/s per watt. Mobile / NPU class scores well; dense desktop GPUs trade efficiency for absolute throughput.
A confidence multiplier (1.0 measured · 0.85 extrapolated · 0.7 estimated) discounts the headline so we don't pretend to know more than we do. Score is recomputed on every page load against the latest catalog + benchmark data — submit your own run with runlocalai-bench --submit --hardware your-rig to firm up the numbers.