NVIDIA H200

Released 2024

Overview

A refresh of the Hopper generation with 141 GB of HBM3e memory at roughly 4.8 TB/s of bandwidth. It is a datacenter-class GPU and can be rented from cloud providers such as RunPod and Lambda.
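
As a rough illustration of what ~4.8 TB/s means for inference: single-stream token generation is usually limited by how fast the weights can be read from memory, so bandwidth divided by the size of the weights gives an upper bound on tokens per second. The sketch below assumes that rule of thumb. The function name and the 35 GB example size are illustrative assumptions, and real throughput will be lower.

```python
def decode_ceiling_tokens_per_s(bandwidth_tb_s: float, weights_gb: float) -> float:
    """Upper bound on batch-1 decode speed, assuming every generated token
    requires reading all weights from memory once (a rule of thumb)."""
    bandwidth_gb_s = bandwidth_tb_s * 1000.0  # TB/s -> GB/s (decimal units)
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # Assumption: a 70B model at 4 bits per weight is about 35 GB of weights.
    print(round(decode_ceiling_tokens_per_s(4.8, 35.0)))
```

The result is a ceiling, not a benchmark: kernel efficiency, KV cache reads, and batching all move the real number.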

Specs

VRAM	141 GB
Power draw	700 W
Released	2024
MSRP	$31,000
Backends	CUDA

Models that fit

Open-weight models small enough to run on the NVIDIA H200 with usable context.

Llama 3.1 8B Instruct

Qwen 2.5 Coder 32B Instruct

Llama 3.3 70B Instruct

Gemma 4 31B Dense

DeepSeek R1 Distill Llama 70B

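
"Usable context" depends on how much memory is left for the KV cache once the weights are loaded. A minimal sketch of that arithmetic, assuming a grouped-query-attention model with an FP16 cache. The layer and head counts used in the example are the published Llama 3 70B configuration; the function names and the 40 GB weight figure are illustrative assumptions.

```python
def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """Bytes of KV cache per token: one key and one value vector
    per KV head per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def max_context_tokens(vram_gb: float, weights_gb: float,
                       per_token_bytes: int) -> int:
    """Tokens of KV cache that fit in the memory left after the weights.
    Ignores activations and runtime overhead, so treat it as optimistic."""
    free_bytes = (vram_gb - weights_gb) * 1e9
    return int(free_bytes // per_token_bytes)


if __name__ == "__main__":
    # Llama 3 70B: 80 layers, 8 KV heads, head dimension 128.
    per_token = kv_cache_bytes_per_token(layers=80, kv_heads=8, head_dim=128)
    print(per_token)  # 327680 bytes per token at FP16
    # Assumption: about 40 GB of 4-bit weights on a 141 GB card.
    print(max_context_tokens(141, 40, per_token))
```

The token budget is shared across all concurrent sequences, so serving many users at once divides it accordingly.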
Frequently asked

What models can the NVIDIA H200 run?

With 141 GB of VRAM, the NVIDIA H200 runs 70B models in 4-bit quantization (roughly 35 to 40 GB of weights) with ample room left for context, plus everything smaller. See the model list above for tested combinations.
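
The fit claim follows from simple arithmetic: weight memory is roughly the parameter count times bits per weight, divided by eight. A sketch, assuming idealized bit widths (real quantized files carry some extra overhead for scales and metadata) and a hypothetical helper name:

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (decimal), ignoring quantization overhead."""
    return params_billion * bits_per_weight / 8


VRAM_GB = 141  # H200 capacity from the specs table

if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_gb(70, bits)
        headroom = VRAM_GB - size
        print(f"70B @ {bits}-bit: {size:.0f} GB weights, {headroom:.0f} GB left")
```

At 16 bits the weights alone take about 140 GB, leaving almost nothing for the KV cache, which is why 70B models are listed at reduced precision.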

Does the NVIDIA H200 support CUDA?

Yes. The NVIDIA H200 has full CUDA support, and CUDA is the most mature backend for local AI. llama.cpp, Ollama, vLLM, and ExLlamaV2 all run natively.
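
One way to confirm the card and its memory are visible to the driver before installing any of these runtimes is to query nvidia-smi. A small sketch, assuming nvidia-smi is on PATH (it ships with the NVIDIA driver). The helper names are hypothetical, and list_gpus returns an empty list when the tool is missing.

```python
import shutil
import subprocess


def parse_gpu_lines(text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory_mib' CSV lines into (name, MiB) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        # Split on the last comma so a comma inside the name cannot break parsing.
        name, mem = (part.strip() for part in line.rsplit(",", 1))
        gpus.append((name, int(mem)))
    return gpus


def list_gpus() -> list[tuple[str, int]]:
    """Return visible NVIDIA GPUs, or [] if nvidia-smi is not installed.
    Raises CalledProcessError if nvidia-smi exists but exits with an error."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    for name, mib in list_gpus():
        print(f"{name}: {mib} MiB")
```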

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify hardware specifications.