NVIDIA L4

Inference-focused Ada datacenter card. Low-power, 24 GB, suited to 7B-14B serving.

Released 2023

Overview

The NVIDIA L4 is an inference-focused datacenter GPU built on NVIDIA's Ada Lovelace architecture. With 24 GB of VRAM in a 72 W power envelope, it is a low-power option for serving 7B-14B models, and it can fit quantized models up to roughly 32B (see the FAQ below).

Specs

VRAM: 24 GB
Power draw: 72 W
Released: 2023
MSRP: $2,500
Backends: CUDA

Models that fit

Open-weight models small enough to run on the NVIDIA L4 with usable context.

Compare alternatives

Hardware worth comparing

Cards in the same VRAM tier, plus one step above and one below, so you can frame the buying decision against real options.

Step up
More VRAM: bigger models and longer context.
No hardware with a verdict in the next tier up yet.
Step down
Less VRAM: cheaper, but more constrained.

Frequently asked

What models can NVIDIA L4 run?

With 24 GB of VRAM, the NVIDIA L4 runs models up to ~32B at 4-bit quantization, with room left for context. See the model list above for tested combinations.
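
To see why ~32B is the practical ceiling, a rough rule of thumb is about 0.5 bytes per parameter at 4-bit, plus a few gigabytes for the KV cache and runtime overhead. The sketch below is a hypothetical helper, not part of any library, that applies that estimate; the overhead allowance is an assumption and varies by backend, quantization format, and context length.

```python
def fits_in_vram(params_b: float, bits_per_weight: float = 4.0,
                 vram_gb: float = 24.0, overhead_gb: float = 2.0) -> bool:
    """Rough estimate: do the quantized weights plus overhead fit in VRAM?

    params_b        -- model size in billions of parameters
    bits_per_weight -- quantization width (4-bit -> ~0.5 bytes per parameter)
    overhead_gb     -- assumed allowance for KV cache, activations, and runtime buffers
    """
    weights_gb = params_b * bits_per_weight / 8  # ~1 GB per billion params at 8-bit
    return weights_gb + overhead_gb <= vram_gb

# Example: a 32B model at 4-bit needs ~16 GB for weights,
# leaving ~8 GB of the L4's 24 GB for context and overhead.
for size in (7, 14, 32, 70):
    print(f"{size}B @ 4-bit fits in 24 GB:", fits_in_vram(size))
```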

Does NVIDIA L4 support CUDA?

Yes. The NVIDIA L4 is an NVIDIA card with full CUDA support, the most mature local-AI backend. llama.cpp, Ollama, vLLM, and ExLlamaV2 all run natively.
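
Before pointing any of those backends at the card, it can help to confirm CUDA actually sees it. A minimal check with PyTorch, assuming a CUDA-enabled torch build is installed, looks like this:

```python
import torch

# Confirm a CUDA device is visible and report its name and total VRAM.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"Device: {props.name}")                            # e.g. "NVIDIA L4"
    print(f"VRAM:   {props.total_memory / 1024**3:.1f} GiB")  # ~24 GB on the L4
else:
    print("No CUDA device visible; check the driver and CUDA runtime.")
```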

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify hardware specifications.