Every hardware unit in the catalog, ranked by a composite score (0–1000) built from four components: measured tok/s, VRAM fit, ecosystem support, and perf-per-watt. Only 2 of 128 ranks are anchored to a measured benchmark; the rest are flagged as extrapolated or estimated.
Methodology: /methodology · Run your own: curl -fsSL runlocalai.co/bench.mjs -o bench.mjs && node bench.mjs
7 units shown · sorted by score
| # | Hardware | Vendor · Class · Memory | Tier | Score | Data |
|---|---|---|---|---|---|
| 1 | Intel Gaudi 2 | intel · workstation · 96GB | D | 168 | Estimated |
| 2 | Intel Gaudi 3 | intel · workstation · 128GB | D | 168 | Estimated |
| 3 | Intel Arc A770 16GB | intel · mid · 16GB | D | 154 | Estimated |
| 4 | Intel Arc B580 | intel · mid · 12GB | D | 133 | Estimated |
| 5 | Intel Arc 140V (Lunar Lake iGPU) | intel · entry | D | 115 | Estimated |
| 6 | Intel Arc B570 | intel · mid · 10GB | D | 112 | Estimated |
| 7 | Intel Core Ultra 7 258V (Lunar Lake) | intel · pc-npu | D | 100 | Estimated |
Measured tok/s: steady-state tok/s on a representative 7B/8B Q4 model, taken from real benchmark rows or extrapolated from VRAM bandwidth × runtime-stack efficiency.
VRAM fit: how comfortably the rig holds 7B / 32B / 70B class models. Apple unified memory counts, as does system RAM on NPU/SoC parts.
Ecosystem support: CUDA / MLX / ROCm / Vulkan reach, i.e. the real-world friction the operator hits when installing tools.
Perf-per-watt: tok/s per watt. Mobile / NPU class hardware scores well; dense desktop GPUs trade efficiency for absolute throughput.
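As a rough illustration of how the throughput, fit, and efficiency components described above might be estimated, here is a minimal Python sketch. It is not the site's actual methodology. The function names, the 4.5 bits-per-weight figure for a Q4 model, and the 1.2 memory overhead factor are all hypothetical. The throughput estimate also assumes the common rule of thumb that single-stream decoding is memory-bandwidth bound, so tok/s is roughly effective bandwidth divided by the bytes of weights read per token.

```python
def model_size_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate weight size in GB for a quantized model.

    bits_per_weight=4.5 is an assumed average for Q4-style quantization.
    """
    return params_billion * bits_per_weight / 8.0


def extrapolated_tok_s(bandwidth_gb_s: float, stack_efficiency: float,
                       params_billion: float = 8.0) -> float:
    """Bandwidth-bound decode estimate (assumption, not the site's formula).

    effective bandwidth = bandwidth × runtime-stack efficiency;
    each generated token reads roughly the full set of weights once.
    """
    if not 0.0 < stack_efficiency <= 1.0:
        raise ValueError("stack_efficiency must be in (0, 1]")
    return bandwidth_gb_s * stack_efficiency / model_size_gb(params_billion)


def fits(memory_gb: float, params_billion: float, overhead: float = 1.2) -> bool:
    """True if quantized weights plus an assumed overhead factor fit in memory."""
    return model_size_gb(params_billion) * overhead <= memory_gb


def tok_s_per_watt(tok_s: float, watts: float) -> float:
    """Efficiency component: throughput divided by power draw."""
    return tok_s / watts


if __name__ == "__main__":
    # Hypothetical 16 GB card: 450 GB/s bandwidth, 50% stack efficiency, 200 W.
    tps = extrapolated_tok_s(bandwidth_gb_s=450, stack_efficiency=0.5)
    print(f"estimated tok/s on an 8B Q4 model: {tps:.1f}")
    for size in (7, 32, 70):
        print(f"{size}B fits in 16 GB: {fits(16, size)}")
    print(f"tok/s per watt: {tok_s_per_watt(tps, 200):.3f}")
```

The numbers in the usage block are placeholders rather than specifications of any unit in the table. Substituting a real card's bandwidth and power draw gives a first-order estimate that a measured benchmark row would then replace.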
A confidence multiplier (1.0 measured · 0.85 extrapolated · 0.7 estimated) discounts the headline score so the ranking does not claim more certainty than the data supports. The score is recomputed on every page load against the latest catalog and benchmark data. To firm up the numbers, submit your own run with runlocalai-bench --submit --hardware your-rig.
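The confidence discount can be sketched as follows. Only the three multiplier values come from the text above. The equal component weights, the 0-to-1 normalization of each component, and the names `WEIGHTS` and `composite_score` are assumptions made for illustration; the real weighting is documented at /methodology.

```python
# Multipliers as stated on the page.
CONFIDENCE = {"measured": 1.0, "extrapolated": 0.85, "estimated": 0.7}

# Hypothetical equal weights; the site's actual weights are not given here.
WEIGHTS = {
    "tok_s": 0.25,
    "vram_fit": 0.25,
    "ecosystem": 0.25,
    "perf_per_watt": 0.25,
}


def composite_score(components: dict, data_quality: str) -> int:
    """Combine normalized components (each in 0..1) into a 0-1000 score.

    The weighted sum is scaled to 1000 and then discounted by the
    confidence multiplier for the given data quality label.
    """
    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be normalized to the range 0..1")
    raw = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return round(1000 * raw * CONFIDENCE[data_quality])


if __name__ == "__main__":
    sample = {"tok_s": 0.5, "vram_fit": 0.5, "ecosystem": 0.5, "perf_per_watt": 0.5}
    for label in CONFIDENCE:
        print(label, composite_score(sample, label))
```

With identical component values, the same hardware scores lower when its data is estimated than when it is measured, which is why submitting a benchmark run can move a unit up the ranking.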