Hardware & infrastructure
NVLink
NVLink is NVIDIA's proprietary GPU-to-GPU interconnect, used to bind multiple data-center GPUs into a coherent memory fabric. NVLink 4 (H100) delivers 900 GB/s of total bidirectional bandwidth per GPU, aggregated across 18 links of 50 GB/s each.
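The per-GPU totals across recent data-center generations come from the same per-link rate multiplied by a growing link count. A quick sketch of the arithmetic (link counts and per-link rates are NVIDIA's published figures):

```python
# Per-GPU NVLink bandwidth by generation: links x per-link rate.
# Per-link bidirectional bandwidth is 50 GB/s across these generations;
# only the link count per GPU changes.
PER_LINK_GBPS = 50
links_per_gpu = {
    "NVLink 2 (V100)": 6,    # 300 GB/s total
    "NVLink 3 (A100)": 12,   # 600 GB/s total
    "NVLink 4 (H100)": 18,   # 900 GB/s total
}

for gen, links in links_per_gpu.items():
    print(f"{gen}: {links} x {PER_LINK_GBPS} = {links * PER_LINK_GBPS} GB/s")
```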
For local AI, NVLink matters when running multi-GPU tensor parallelism: a quantized 70B model split across 2× RTX 3090s with NVLink (112.5 GB/s) hits significantly higher tok/s than the same setup over PCIe 4.0 x16 (~32 GB/s per direction), because tensor parallelism forces all-reduces between layers on every forward pass.
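The interconnect's impact can be estimated with a back-of-envelope model. A minimal sketch, assuming a Llama-70B-like shape (8192 hidden dim, 80 layers, fp16 activations) and the standard two all-reduces per transformer layer in tensor parallelism; these numbers are illustrative assumptions, and the calculation ignores per-call latency and protocol overhead:

```python
# Per-token all-reduce traffic for a 70B-class model split across
# 2 GPUs with tensor parallelism (assumed shape, not a measurement).
hidden = 8192              # hidden dimension
layers = 80                # transformer layers
bytes_per_elem = 2         # fp16 activations
allreduces_per_layer = 2   # one after attention, one after the MLP

per_token_bytes = layers * allreduces_per_layer * hidden * bytes_per_elem

def comm_time_us(link_gb_per_s):
    """Ideal transfer time in microseconds at the given link bandwidth,
    ignoring latency and collective overheads."""
    return per_token_bytes / (link_gb_per_s * 1e9) * 1e6

print(f"traffic per token: {per_token_bytes / 1e6:.2f} MB")
print(f"NVLink 3 (112.5 GB/s): {comm_time_us(112.5):.1f} us/token")
print(f"PCIe 4.0 x16 (32 GB/s): {comm_time_us(32):.1f} us/token")
```

Even this idealized model shows PCIe spending roughly 3.5× longer per token on communication; real gaps are larger once per-call latency is counted, since there are 160 all-reduces per token.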
Consumer NVLink ended with the RTX 30 series. RTX 40 and 50 series cards have no NVLink connector, so multi-GPU setups on consumer hardware rely on PCIe alone, making interconnect bandwidth the main bottleneck for tensor-parallel local inference.