ROCm (AMD)
ROCm (originally short for Radeon Open Compute) is AMD's open-source counterpart to NVIDIA's CUDA. It's required for any meaningful AMD GPU inference: vLLM ROCm builds, the llama.cpp HIP backend, ExLlamaV2 ROCm, and PyTorch ROCm all depend on it. Without it, AMD cards fall back to CPU or Vulkan, which is usually much slower for LLM inference.
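A quick way to tell whether a given PyTorch install is a ROCm build is to look at its version fields. The sketch below is a minimal example, not an official check: the label strings and the helper names classify_backend and detect_backend are made up for illustration. It relies on ROCm builds of PyTorch setting torch.version.hip and reusing the torch.cuda namespace.

```python
from typing import Optional


def classify_backend(hip_version: Optional[str],
                     cuda_version: Optional[str],
                     gpu_available: bool) -> str:
    """Classify a PyTorch install from its version fields.

    The returned labels are illustrative, not part of any library API.
    """
    if hip_version is not None:
        # ROCm builds of PyTorch set torch.version.hip to a version string.
        return "rocm" if gpu_available else "rocm-build-no-gpu"
    if cuda_version is not None:
        return "cuda" if gpu_available else "cuda-build-no-gpu"
    return "cpu-only"


def detect_backend() -> str:
    """Inspect the local PyTorch install, if there is one."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "torch-not-installed"
    return classify_backend(
        getattr(torch.version, "hip", None),
        getattr(torch.version, "cuda", None),
        torch.cuda.is_available(),  # ROCm builds reuse the torch.cuda namespace
    )
```

Calling detect_backend() on a machine with a working ROCm build and a visible GPU should return "rocm"; a "rocm-build-no-gpu" result usually points at a driver, permission, or unsupported-card problem rather than a wrong wheel.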
In 2026, ROCm is mature on Linux for current-generation consumer cards (RX 7900 XTX, RX 9070 XT) and datacenter accelerators (MI300, MI250). Older Polaris and Vega cards are unsupported in current ROCm releases, so confirm your card is in the support matrix before committing time. Windows ROCm is improving but trails Linux by 6 to 12 months for LLM workloads; production AMD deployments live on Linux.
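Checking a card against the support matrix starts with finding its architecture target (the gfx name), which the rocminfo tool reports. The sketch below extracts gfx names from rocminfo text and looks them up in a small table. The table is a hand-picked illustrative subset, not AMD's official matrix, and the function names are hypothetical; always confirm against the matrix published for your ROCm release.

```python
import re
import subprocess
from typing import Dict, List

# Illustrative subset only; NOT AMD's official support matrix.
KNOWN_TARGETS = {
    "gfx942": "MI300 series",
    "gfx90a": "MI200 series (e.g. MI250)",
    "gfx1100": "RX 7900 XTX class (RDNA 3)",
    "gfx1201": "RX 9070 XT class (RDNA 4)",
    "gfx803": "Polaris (older generation)",
    "gfx900": "Vega 10 (older generation)",
}

UNKNOWN = "not in this illustrative table; check the official matrix"


def gfx_targets(rocminfo_text: str) -> List[str]:
    """Return the unique gfx architecture names found in rocminfo output."""
    return sorted(set(re.findall(r"\bgfx[0-9a-f]+\b", rocminfo_text)))


def describe(targets: List[str]) -> Dict[str, str]:
    """Map each gfx target to a human-readable family name."""
    return {t: KNOWN_TARGETS.get(t, UNKNOWN) for t in targets}


def read_rocminfo() -> str:
    """Run rocminfo and return its stdout, or '' if it is missing or fails."""
    try:
        result = subprocess.run(
            ["rocminfo"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return ""
    return result.stdout
```

A typical use is describe(gfx_targets(read_rocminfo())). An empty result means rocminfo is not installed or could not see a GPU, which is itself a useful signal before investing in a ROCm setup.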
The honest framing for operators: AMD on Linux is a real production path; AMD on Windows is hobby-tier in 2026. ROCm 6.x supports the same patterns as CUDA (Docker containers via the AMD Container Toolkit, CUDA kernels ported with HIPIFY, Triton kernels through Triton's AMD backend, FlashAttention 2/3 ports), but community and tooling density still trail CUDA. Choose AMD when budget is the constraint and your team can run Linux; choose NVIDIA when ecosystem maturity matters more than card price.
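As a rough illustration of the container pattern, the sketch below assembles a docker run command that exposes an AMD GPU to a container. It uses plain device passthrough (/dev/kfd and /dev/dri plus the video group) rather than a container toolkit, which is a swapped-in approach chosen because it needs no extra host tooling. The helper name rocm_docker_command is hypothetical, the image tag is an assumption to replace with whatever image you actually use, and some hosts need additional groups or options.

```python
import shlex
from typing import List


def rocm_docker_command(image: str, inner_command: List[str]) -> List[str]:
    """Build a docker run argv that passes an AMD GPU into the container."""
    return [
        "docker", "run", "--rm",
        "--device=/dev/kfd",     # ROCm compute (kernel fusion driver) interface
        "--device=/dev/dri",     # GPU render nodes
        "--group-add", "video",  # group that typically owns the device nodes
        image,
        *inner_command,
    ]


# Assumed image tag, shown only as an example.
cmd = rocm_docker_command(
    "rocm/pytorch:latest",
    ["python3", "-c", "import torch; print(torch.version.hip)"],
)
print(shlex.join(cmd))  # copy-pasteable shell form of the same command
```

Building the command as a list keeps quoting correct when it is handed to subprocess.run, while shlex.join gives a shell form for logs or documentation.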