NVIDIA L4
Released 2023
Overview
The NVIDIA L4 is an inference-focused datacenter card on the Ada Lovelace architecture. Its low 72 W power draw and 24 GB of VRAM make it well suited to serving 7B-14B models.
Specs
| Spec | Value |
| --- | --- |
| VRAM | 24 GB |
| Power draw | 72 W |
| Released | 2023 |
| MSRP | $2,500 |
| Backends | CUDA |
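The 72 W figure makes operating cost easy to estimate. A back-of-envelope sketch (the $0.15/kWh electricity price is an assumed, illustrative rate, not a figure from this page):

```python
def annual_energy_cost(power_w: float = 72.0,
                       price_per_kwh: float = 0.15,  # assumed rate; adjust to local prices
                       hours: float = 24 * 365) -> float:
    """Cost of running the card at its rated draw, 24/7 for a year."""
    kwh = power_w / 1000 * hours          # 72 W x 8760 h = 630.72 kWh
    return kwh * price_per_kwh

print(round(annual_energy_cost(), 2))     # roughly $95/year at full draw
```

Real serving workloads spend much of their time below the rated draw, so treat this as an upper bound.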
Models that fit
Open-weight models small enough to run on the NVIDIA L4 with usable context.
Compare alternatives
Hardware worth comparing
Cards in the same VRAM tier, plus one step above and one below, so you can frame the buying decision against real options.
Same VRAM tier
Cards in the same memory band
Step up
More VRAM — bigger models, more context
No reviewed hardware in the next tier up yet.
Step down
Less VRAM — cheaper, more constrained
Frequently asked
What models can NVIDIA L4 run?
With 24 GB of VRAM, the NVIDIA L4 comfortably runs 7B-14B models with generous context, and can stretch to ~32B at 4-bit quantization with tighter context headroom. See the model list on this page for tested combinations.
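The fit rule of thumb above can be sketched as a quick estimator (the flat 1.5 GB runtime overhead is an assumed placeholder; real KV-cache usage grows with context length and model architecture):

```python
def fits_in_vram(params_billions: float, bits: int = 4,
                 vram_gb: float = 24.0, overhead_gb: float = 1.5) -> bool:
    """Rough check: quantized weight size plus a flat runtime/KV-cache allowance."""
    weight_gb = params_billions * bits / 8   # 1B params at 8-bit ~ 1 GB of weights
    return weight_gb + overhead_gb <= vram_gb

print(fits_in_vram(14))   # 14B at 4-bit: 7 GB of weights, plenty of context headroom
print(fits_in_vram(32))   # 32B at 4-bit: 16 GB of weights, fits but context is tight
print(fits_in_vram(70))   # 70B at 4-bit: 35 GB of weights, does not fit in 24 GB
```

The same function generalizes to other cards by changing `vram_gb`, which is how the "step up / step down" tiers on this page map to model sizes.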
Does NVIDIA L4 support CUDA?
Yes. The NVIDIA L4 is an NVIDIA card with full CUDA support, the most mature backend for local AI. llama.cpp, Ollama, vLLM, and ExLlamaV2 all run natively.
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify hardware specifications.