UNIT · NVIDIA · GPU

141 GB VRAMworkstationReviewed May 2026

NVIDIA H200 NVL (PCIe)

NVDA · HARDWARE

No editorial image yet — generic vendor mark shown. Credentials in spec table below.

The PCIe-form-factor variant of the H200. Same 141 GB HBM3e, same memory subsystem (~4.8 TB/s bandwidth), in a dual-slot workstation card rather than the SXM5 datacenter module. For operators who want H200-class capacity in a standard PCIe slot (workstation deployments, single-node serving), the NVL is the realistic path. Two cards in NVLink Bridge pair to act as an effective 282 GB pool.

Released 2024·~$32000 street·4800 GB/s memory bandwidth

RUNLOCALAI SCORE

See full leaderboard →

684/ 1000

BB-tier

Estimated

Throughput

500/ 500

VRAM-fit

200/ 200

Ecosystem

200/ 200

Efficiency

77/ 100

Sub-scores sum to 977 / 1000. Headline = 977 × 0.70 (Estimated-confidence discount) = 684. This is an algorithmic performance-tier score — distinct from, and often lower than, the editorial “Our verdict” below, which weighs value and real-world fit (especially for hardware we haven’t measured yet). How scoring works →

Extrapolated from 4800 GB/s bandwidth — 576.0 tok/s estimated. No measured benchmarks yet.

WORKLOAD FIT

Try other hardware →

Plain-English: Runs 70B comfortably — snappy enough for a coding agent; vision models supported.

7B chat✓

Comfortable

14B chat✓

Comfortable

32B chat✓

Comfortable

70B chat✓

Comfortable

Coding agent✓

Comfortable

Vision (≤8B VLM)✓

Comfortable

Long context (32K)✓

Comfortable

✓Comfortable — fits with headroom

~Tight — works, no slack

△Marginal — needs aggressive quant

✗Doesn't fit usefully

Verdicts extrapolated from catalog VRAM + bandwidth + ecosystem flags. Hover any chip for the rationale. Want measured numbers? Submit your own run with runlocalai-bench --submit.

BLK · OVERVIEW

Overview

Retailers we'd check:Amazon

Search-fallback link — editorial hasn't yet curated a retailer URL for this card. Approx. $32000.

Some links above are affiliate links. We may earn a commission at no extra cost to you. How we make money.

BLK · SPECS

Specs

VRAM	141 GB
Power draw (peak)	600 W
Released	2024
Backends	CUDA

Models that fit

Open-weight models small enough to run on NVIDIA H200 NVL (PCIe) with usable context.

Nomic Embed Text v1.5

0.137B · other

Kokoro 82M

0.082B · other

Llama 3.1 8B Instruct

8B · llama

Qwen 3 30B-A3B

30B · qwen

Frequently asked

What models can NVIDIA H200 NVL (PCIe) run?

With 141GB VRAM, the NVIDIA H200 NVL (PCIe) runs 70B models in 4-bit quantization, plus everything smaller. See the model list below for tested combinations.

Does NVIDIA H200 NVL (PCIe) support CUDA?

Yes — NVIDIA H200 NVL (PCIe) is an NVIDIA card with full CUDA support, the most mature local-AI backend. llama.cpp, Ollama, vLLM, and ExLlamaV2 all run natively.

How much does NVIDIA H200 NVL (PCIe) cost?

Current street price for NVIDIA H200 NVL (PCIe) is around $32000. Prices vary by region and supply.

Where next?

Buyer guides

Troubleshooting

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify hardware specifications.

NVIDIA H200 NVL (PCIe)

NVDA · HARDWARE

NVIDIA H200 NVL (PCIe)

No editorial image yet — generic vendor mark shown. Credentials in spec table below.

Released 2024·~$32000 street·4800 GB/s memory bandwidth

Frequently asked

What models can NVIDIA H200 NVL (PCIe) run?

With 141GB VRAM, the NVIDIA H200 NVL (PCIe) runs 70B models in 4-bit quantization, plus everything smaller. See the model list below for tested combinations.

Does NVIDIA H200 NVL (PCIe) support CUDA?

Yes — NVIDIA H200 NVL (PCIe) is an NVIDIA card with full CUDA support, the most mature local-AI backend. llama.cpp, Ollama, vLLM, and ExLlamaV2 all run natively.

How much does NVIDIA H200 NVL (PCIe) cost?

Current street price for NVIDIA H200 NVL (PCIe) is around $32000. Prices vary by region and supply.