nvidia
GPU
48GB VRAM
workstation
NVIDIA L40
Original Ada datacenter. Slower than L40S. 48GB GDDR6.
Released 2022
Overview
The NVIDIA L40 is the original Ada Lovelace datacenter GPU, released in 2022 with 48 GB of GDDR6 memory. It is slower than the later L40S.
Specs
| Spec | Value |
| --- | --- |
| VRAM | 48 GB |
| Power draw | 300 W |
| Released | 2022 |
| MSRP | $8,000 |
| Backends | CUDA |
Models that fit
Open-weight models small enough to run on NVIDIA L40 with usable context.
Frequently asked
What models can NVIDIA L40 run?
With 48 GB of VRAM, the NVIDIA L40 can run 70B-parameter models in 4-bit quantization (roughly 35 GB of weights), plus everything smaller. See the "Models that fit" list for tested combinations.
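As a rough sanity check on that claim, the weight footprint can be estimated from parameter count and bits per weight. The sketch below is a back-of-the-envelope calculation, not a measurement: the function names are hypothetical, `reserve_gb` is an assumed allowance for KV cache and runtime overhead, and real quantization formats often use slightly more than 4 bits per weight.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float = 48.0, reserve_gb: float = 4.0) -> bool:
    """True if the weights plus an assumed reserve fit in the given VRAM.

    reserve_gb is a hypothetical placeholder for KV cache and runtime
    overhead; the real figure depends on context length and backend.
    """
    return weight_gb(params_billion, bits_per_weight) + reserve_gb <= vram_gb


print(weight_gb(70, 4))   # 35.0
print(fits(70, 4))        # True
print(fits(70, 16))       # False
```

Under these assumptions a 70B model at 4 bits leaves about 13 GB of the 48 GB for context and overhead, while the same model at 16 bits (140 GB of weights) does not fit on a single card.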
Does NVIDIA L40 support CUDA?
Yes. The NVIDIA L40 is an NVIDIA card with full CUDA support, the most mature local-AI backend. llama.cpp, Ollama, vLLM, and ExLlamaV2 all run natively.
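Before installing any of these runtimes, it can help to confirm that the NVIDIA driver actually sees the card. The sketch below is one possible way to do that, assuming `nvidia-smi` is installed with the driver. The helper names `parse_gpu_line` and `list_gpus` are hypothetical, and the parser assumes each row reports memory as a plain integer in MiB.

```python
import shutil
import subprocess


def parse_gpu_line(line: str) -> tuple[str, int]:
    """Split one 'name, memory' CSV row into (name, memory in MiB)."""
    name, mem = line.rsplit(",", 1)
    return name.strip(), int(mem.strip())


def list_gpus() -> list[tuple[str, int]]:
    """Return (name, total memory in MiB) for each GPU the driver reports.

    Returns an empty list if nvidia-smi is missing or exits with an error.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return []
    return [parse_gpu_line(row) for row in result.stdout.splitlines() if row.strip()]
```

Calling `list_gpus()` on a machine with the card installed should return one entry per GPU. An empty list suggests a missing or broken driver rather than a problem with the inference runtime.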
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify hardware specifications.