BLK · LEADERBOARD

RunLocalAI Score

Every catalog hardware unit ranked by composite score (0–1000): measured tok/s, VRAM fit, ecosystem support, perf-per-watt. Only 2 of 154 ranks are anchored to a measured benchmark; the rest are flagged as extrapolated or estimated.

Methodology: /methodology · Run your own: curl -fsSL runlocalai.co/bench.mjs -o bench.mjs && node bench.mjs

37 units shown · sorted by score

#	Hardware	Vendor	Class	VRAM	Tier	Score (0–1000)	Throughput (0–500)	VRAM-fit (0–200)	Ecosystem (0–200)	Efficiency (0–100)	Data
1	NVIDIA H20 (96GB)	nvidia	workstation	96GB	B	697	500	200	200	96	Estimated
2	NVIDIA H200 NVL (PCIe)	nvidia	workstation	141GB	B	684	500	200	200	77	Estimated
3	NVIDIA B200	nvidia	workstation	192GB	B	684	500	200	200	77	Estimated
4	NVIDIA H200	nvidia	workstation	141GB	B	676	500	200	200	66	Estimated
5	NVIDIA B300 (Blackwell Ultra)	nvidia	workstation	288GB	B	669	500	200	200	55	Estimated
6	NVIDIA H100 NVL	nvidia	workstation	188GB	B	663	500	200	200	47	Estimated
7	NVIDIA H100 PCIe	nvidia	workstation	80GB	B	662	500	190	200	56	Estimated
8	NVIDIA A100 80GB SXM	nvidia	workstation	80GB	B	657	500	190	200	49	Estimated
9	NVIDIA H100 SXM	nvidia	workstation	80GB	B	655	500	190	200	46	Estimated
10	NVIDIA RTX PRO 6000 Blackwell	nvidia	workstation	96GB	B	650	500	200	200	29	Estimated
11	NVIDIA A100 40GB	nvidia	workstation	40GB	B	635	500	170	200	37	Estimated
12	NVIDIA GB200 NVL72	nvidia	workstation	13824GB	B	631	500	200	200	1	Estimated
13	NVIDIA GeForce RTX 5090	nvidia	enthusiast	32GB	B	630	500	170	200	30	Estimated
14	AMD Instinct MI355X	amd	workstation	288GB	B	626	500	200	130	64	Estimated
15	AMD Instinct MI350X	amd	workstation	288GB	B	626	500	200	130	64	Estimated
16	AMD Instinct MI300X	amd	workstation	192GB	B	621	500	200	130	57	Estimated
17	AMD Instinct MI300A (APU)	amd	workstation	128GB	B	620	500	200	130	56	Estimated
18	Apple M4 Ultra	apple	enthusiast	-	B	615	447	200	170	62	Estimated
19	AMD Instinct MI325X	amd	workstation	256GB	B	615	500	200	130	48	Estimated
20	AMD Instinct MI250X	amd	workstation	128GB	B	614	500	200	130	47	Estimated
21	AMD Instinct MI210	amd	workstation	64GB	B	587	475	190	130	44	Estimated
22	Intel Gaudi 2	intel	workstation	96GB	B	536	500	200	40	26	Estimated
23	Intel Gaudi 3	intel	workstation	128GB	B	536	500	200	40	26	Estimated
24	NVIDIA RTX 4090 48GB (China-mod)	nvidia	workstation	48GB	B	534	351	190	200	22	Estimated
25	Apple M1 Ultra	apple	enthusiast	-	B	529	325	200	170	60	Estimated
26	NVIDIA RTX 5000 PRO Blackwell 48GB	nvidia	workstation	48GB	B	529	334	190	200	31	Estimated
27	NVIDIA RTX 6000 Ada Generation	nvidia	workstation	48GB	B	529	334	190	200	31	Estimated
28	Apple M3 Ultra	apple	enthusiast	-	B	522	325	200	170	50	Estimated
29	Apple M2 Ultra	apple	enthusiast	-	B	522	325	200	170	50	Estimated
30	NVIDIA GeForce RTX 3090 Ti	nvidia	enthusiast	24GB	B	520	351	170	200	22	Estimated
31	NVIDIA GeForce RTX 4090	nvidia	enthusiast	24GB	B	520	351	170	200	22	Estimated
32	NVIDIA GeForce RTX 5090 Mobile	nvidia	enthusiast	24GB	B	512	312	170	200	49	Estimated
33	Apple Mac Studio (M3 Ultra)	apple	enthusiast	-	B	512	325	200	170	36	Estimated
34	NVIDIA RTX PRO 4500 Blackwell	nvidia	workstation	32GB	B	507	312	170	200	43	Estimated
35	NVIDIA GeForce RTX 3090	nvidia	enthusiast	24GB	B	505	326	170	200	26	Estimated
36	NVIDIA L40	nvidia	workstation	48GB	B	503	301	190	200	28	Estimated
37	NVIDIA L40S	nvidia	workstation	48GB	B	500	301	190	200	24	Estimated

HOW THE SCORE IS DERIVED

Throughput · 0–500

Steady-state tok/s on a representative 7B/8B Q4 model. Measured from real benchmark rows, or extrapolated from VRAM bandwidth × runtime-stack efficiency.
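As a rough illustration of the extrapolation path, the sketch below uses a common memory-bound rule of thumb: each generated token reads the full set of weights once, so decode speed is roughly bandwidth divided by model size, scaled by a runtime-stack efficiency factor. The function name, its parameters, and the example inputs are all hypothetical; the page does not publish its exact formula or how tok/s maps onto the 0–500 points.

```python
def estimate_decode_tok_s(bandwidth_gb_s: float,
                          model_size_gb: float,
                          stack_efficiency: float) -> float:
    """Rule-of-thumb decode throughput for a memory-bound model.

    Assumption (not from the source): every token streams all weights
    from memory once, so tok/s ~= bandwidth / model size, discounted by
    how much of the peak bandwidth the runtime stack actually reaches.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    if not 0 < stack_efficiency <= 1:
        raise ValueError("stack_efficiency must be in (0, 1]")
    return bandwidth_gb_s / model_size_gb * stack_efficiency


# Hypothetical inputs: 1000 GB/s of memory bandwidth, a 5 GB quantized
# model, and a stack that reaches half of the theoretical peak.
print(estimate_decode_tok_s(1000, 5, 0.5))
```

An estimate like this only gives an upper-bound style figure for single-stream decoding; a measured benchmark row replaces it whenever one exists.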

VRAM-fit · 0–200

How comfortably the rig holds 7B / 32B / 70B class models. Apple unified memory and NPU/SoC system RAM both count toward capacity.

Ecosystem · 0–200

Reach of the CUDA / MLX / ROCm / Vulkan stacks, reflecting the real-world friction an operator hits when installing tools.

Efficiency · 0–100

Tok/s per watt. Mobile / NPU class scores well; dense desktop GPUs trade efficiency for absolute throughput.

A confidence multiplier (1.0 measured · 0.85 extrapolated · 0.7 estimated) discounts the headline score so we don't pretend to know more than we do. In the table above, the headline matches the sum of the four components times the multiplier: for rank 1, (500 + 200 + 200 + 96) × 0.7 ≈ 697. Score is recomputed on every page load against the latest catalog and benchmark data. Submit your own run with runlocalai-bench --submit --hardware your-rig to firm up the numbers.
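The arithmetic can be sketched in a few lines. This is a minimal reconstruction inferred from the leaderboard rows, not the site's actual code: the function name is hypothetical, and the round-half-up step is an assumption chosen because it reproduces rows such as rank 25 (755 × 0.7 = 528.5, shown as 529).

```python
import math

# Confidence multipliers as quoted in the text above.
CONFIDENCE = {"measured": 1.0, "extrapolated": 0.85, "estimated": 0.7}


def composite_score(throughput: int, vram_fit: int, ecosystem: int,
                    efficiency: int, data: str) -> int:
    """Sum the four sub-scores and discount by data confidence.

    Assumption: the result is rounded half up to an integer; this is
    inferred from the table, not documented by the site.
    """
    raw = throughput + vram_fit + ecosystem + efficiency
    return math.floor(raw * CONFIDENCE[data.lower()] + 0.5)


# Sub-scores copied from the leaderboard rows for ranks 1, 18, and 25.
print(composite_score(500, 200, 200, 96, "Estimated"))  # rank 1: NVIDIA H20
print(composite_score(447, 200, 170, 62, "Estimated"))  # rank 18: Apple M4 Ultra
print(composite_score(325, 200, 170, 60, "Estimated"))  # rank 25: Apple M1 Ultra
```

Under these assumptions the three calls return the Score values listed for those ranks. The same sub-scores with a measured benchmark (multiplier 1.0) would give the undiscounted sum, which is why nearly every estimated row sits well below the 0–1000 ceiling.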
