NVIDIA A100 40GB
Original A100. 40GB HBM2 at 1.55 TB/s. Trained the early generation of frontier models.
Released 2020
Overview
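The original A100, released in 2020 on NVIDIA's Ampere architecture. It pairs 40 GB of HBM2 with 1.55 TB/s of memory bandwidth and was widely used to train an early generation of frontier models.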
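The 1.55 TB/s figure matters for local inference because single-stream token generation is usually limited by how fast the weights can be read from memory. A minimal sketch of that rule of thumb follows. It assumes every generated token reads the full set of weights once and ignores KV-cache traffic, compute, and batching. The function name and the 16 GB example model are illustrative, not measurements.

```python
def decode_tokens_per_s_upper_bound(bandwidth_tb_s: float, weights_gb: float) -> float:
    """Rough ceiling on single-stream decode speed for a memory-bound model.

    Assumption: each generated token requires reading all weights once,
    so tokens/s <= memory bandwidth / weight size.
    """
    bandwidth_gb_s = bandwidth_tb_s * 1000  # TB/s -> GB/s (decimal units)
    return bandwidth_gb_s / weights_gb


A100_40GB_BANDWIDTH_TB_S = 1.55  # from the spec above

# Hypothetical model whose weights occupy 16 GB (e.g. about 8B parameters at 16 bits).
bound = decode_tokens_per_s_upper_bound(A100_40GB_BANDWIDTH_TB_S, 16.0)
print(f"Upper bound: about {bound:.0f} tokens/s")
```

Real throughput will land below this ceiling. The estimate is mainly useful for comparing cards or quantization levels against each other.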
Specs
| Spec | Value |
| --- | --- |
| VRAM | 40 GB |
| Power draw | 400 W |
| Released | 2020 |
| MSRP | $11,000 |
| Backends | CUDA |
Models that fit
Open-weight models small enough to run on NVIDIA A100 40GB with usable context.
Frequently asked
What models can NVIDIA A100 40GB run?
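With 40 GB of VRAM, the NVIDIA A100 40GB can run 70B models in 4-bit quantization, though only just: at 4 bits per weight the weights alone take about 35 GB, which leaves little room for context. Everything smaller fits more comfortably. See the Models that fit section for tested combinations.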
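The arithmetic behind that answer can be sketched in a few lines. This is a back-of-the-envelope estimate, not a tested compatibility check. It counts only the weights plus a fixed allowance for runtime overhead and KV cache. The `overhead_gb` default of 2 GB is an assumption chosen for illustration, and real quantization formats often use slightly more than their nominal bits per weight.

```python
VRAM_GB = 40.0  # nominal capacity of the A100 40GB


def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB: parameters * bits / 8 bits per byte."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed overhead allowance fit in VRAM.

    overhead_gb is a hypothetical placeholder for KV cache, activations,
    and runtime buffers; tune it for your context length and backend.
    """
    return weight_gb(params_billion, bits_per_weight) + overhead_gb <= VRAM_GB


print(weight_gb(70, 4))   # 35.0 GB of weights for a 70B model at 4 bits
print(fits(70, 4))        # fits, with only a few GB to spare
print(fits(70, 16))       # 140 GB of weights does not fit
```

Raising `overhead_gb` to reflect a long context window quickly turns the 70B case from a fit into a miss, which is why the smaller models are the more practical choice on this card.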
Does NVIDIA A100 40GB support CUDA?
Yes. The NVIDIA A100 40GB is an NVIDIA card with full CUDA support, the most mature local-AI backend. llama.cpp, Ollama, vLLM, and ExLlamaV2 all run natively.
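Before installing any of these backends, it can help to confirm that the NVIDIA driver actually sees the card. The sketch below shells out to `nvidia-smi`, which ships with the NVIDIA driver, and parses its CSV output. The helper names `parse_gpu_lines` and `list_gpus` are illustrative. The script returns an empty list when `nvidia-smi` is missing or fails.

```python
import shutil
import subprocess


def parse_gpu_lines(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory.total' CSV rows (no header, no units) into (name, MiB)."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if "," not in line:
            continue
        name, mem = line.rsplit(",", 1)
        mem = mem.strip()
        if not mem.isdigit():  # skip rows where memory is reported as unavailable
            continue
        gpus.append((name.strip(), int(mem)))
    return gpus


def list_gpus() -> list[tuple[str, int]]:
    """Return the GPUs reported by nvidia-smi, or [] if the tool is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    found = list_gpus()
    if not found:
        print("No NVIDIA GPU detected (nvidia-smi missing or failed).")
    for name, mib in found:
        print(f"{name}: {mib} MiB")
```

An empty result usually points to a missing or mismatched driver, not a backend problem, so it is worth checking this first.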
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify hardware specifications.