ROCm not detected — get an AMD GPU running for local AI
ROCm is finicky on consumer AMD GPUs in 2026. Here's the install order, the gfx-version override that fixes 80% of detection failures, and when to give up and use Vulkan.
Diagnostic order — most likely first
Card isn't officially supported by your ROCm version
Run `rocminfo`. If it prints nothing or reports 'No GPU agents found.', the ROCm runtime can't see your card. Most consumer Radeon cards (7900 XTX, 7900 XT, 6800 XT) work but require an explicit gfx version override.
Set `HSA_OVERRIDE_GFX_VERSION=11.0.0` for RDNA 3 cards (7900 XTX / XT / GRE) before running ROCm workloads. For RDNA 2 (6800 XT, 6900 XT): `HSA_OVERRIDE_GFX_VERSION=10.3.0`.
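A minimal shell sketch of the override flow, assuming an RDNA 3 card and a bash shell (swap in `10.3.0` for RDNA 2):

```bash
# Assumption: RDNA 3 card (7900 XTX / XT / GRE). Use 10.3.0 for RDNA 2.
export HSA_OVERRIDE_GFX_VERSION=11.0.0

# Verify: rocminfo should now list a gfx* GPU agent (e.g. gfx1100).
rocminfo | grep -i gfx

# Persist the override for future shells.
echo 'export HSA_OVERRIDE_GFX_VERSION=11.0.0' >> ~/.bashrc
```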
ROCm + driver version mismatch
`dkms status` shows the amdgpu kernel module installed, but `rocminfo` still returns nothing and `dmesg` shows amdgpu errors.
Match the ROCm version to your kernel driver: ROCm 6.x needs amdgpu-dkms 6.x from the same release. Reinstall in this order: uninstall the old stack, run `amdgpu-install --usecase=rocm,dkms`, then reboot.
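A sketch of that clean reinstall, assuming Ubuntu and AMD's `amdgpu-install` script (flags may differ between installer releases — check AMD's install docs for your version):

```bash
# 1. Remove any mismatched stack first.
sudo amdgpu-install --uninstall

# 2. Reinstall ROCm userspace and the matching DKMS kernel module together,
#    so both come from the same release.
sudo amdgpu-install --usecase=rocm,dkms

# 3. Reboot so the new kernel module loads, then re-check.
sudo reboot
# after reboot:
dkms status          # should show an amdgpu module built for your kernel
rocminfo | grep gfx  # should list your GPU agent again
```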
User not in the render / video groups
Running `groups` doesn't list `render` or `video`, and `dmesg` shows ROCm permission errors.
`sudo usermod -aG render,video $USER` then log out and back in.
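A quick check-and-fix sketch (`/dev/kfd` is ROCm's compute device node and should be group-owned by `render`):

```bash
# Check whether the current user is already in the right groups.
groups | grep -Ew 'render|video' || echo "missing render/video membership"

# Add the user to both groups (takes effect on next login).
sudo usermod -aG render,video "$USER"

# Verify the device nodes are owned by those groups.
ls -l /dev/kfd /dev/dri/renderD*
```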
Trying to run ROCm on Windows native (only WSL is supported)
You're on Windows, not WSL2. ROCm doesn't ship a Windows-native PyTorch path for consumer cards.
Either move to WSL2 + Ubuntu and install ROCm there, or use llama.cpp's Vulkan backend for Windows-native AMD inference. Vulkan runs at 70-90% of ROCm speed for inference and works on essentially every modern AMD card.
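A sketch of the WSL2 route, assuming Ubuntu inside WSL2 and AMD's `amdgpu-install` script. The `wsl,rocm` usecase and the `rocm6.2` wheel index below are examples — check AMD's compatibility matrix and PyTorch's install page for the versions that match your driver:

```bash
# Inside the WSL2 Ubuntu guest: install ROCm userspace only
# (the kernel driver lives on the Windows side, so no DKMS).
sudo amdgpu-install --usecase=wsl,rocm --no-dkms

# Install a ROCm build of PyTorch (index URL assumes ROCm 6.2 wheels).
pip install torch --index-url https://download.pytorch.org/whl/rocm6.2

# Confirm PyTorch sees the card (ROCm builds reuse the torch.cuda namespace).
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
```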
Card is just unsupported and nothing works
You've tried gfx overrides, driver matching, group permissions. Still nothing. This is a real outcome for older / lower-tier RDNA cards.
Use llama.cpp with the Vulkan backend (build with `-DGGML_VULKAN=ON`). It works on basically any modern AMD card. Or replace the card with an NVIDIA one and get the CUDA stack.
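A sketch of the Vulkan build, assuming a recent llama.cpp checkout and the Vulkan SDK / headers installed (binary names and flags change between releases — check the repo README):

```bash
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j

# Offload as many layers as fit in VRAM; the model path is a placeholder.
./build/bin/llama-cli -m /path/to/model.gguf -ngl 99 -p "Hello"
```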
ROCm detection issues sometimes resolve into the harder question: should you have bought NVIDIA instead? The guide below frames the CUDA-vs-ROCm decision honestly.
Frequently asked questions
Is ROCm worth it on consumer AMD GPUs in 2026?
On a 7900 XTX with the gfx-version override, yes — you get 70-85% of CUDA performance for the same workloads at often half the card cost. On older or lower-tier cards (6700 XT, 6600), ROCm is a fight; Vulkan via llama.cpp is the saner path.
ROCm vs CUDA — which is better for local AI?
CUDA wins on ecosystem breadth (vLLM, TensorRT, day-zero new model support, more documentation). ROCm wins on $/GB-VRAM (a 24 GB 7900 XTX is half the cost of a 24 GB 4090). For inference-only workflows on supported cards, ROCm is fine. For research / training / cutting-edge model loading, CUDA still leads.
Can I use ROCm on Windows?
Only via WSL2 + Ubuntu. There's no native Windows ROCm for consumer cards. If you must run native Windows, llama.cpp's Vulkan backend is the alternative — supports any modern AMD card and runs at 70-90% of ROCm performance for LLM inference.
Related troubleshooting
Why CUDA OOM happens during local LLM inference and image gen, how to confirm the real cause, and the four real fixes (smaller quant, shorter context, gradient checkpointing, or more VRAM).
Ollama silently falls back to CPU when it can't load a model into VRAM. Here's how to confirm the fallback, force GPU usage, and pick a model that actually fits.
When the fix is hardware
A surprising fraction of troubleshooting tickets resolve to: this card doesn't have enough VRAM for what you're asking it to do. If you're hitting OOM after every reasonable fix, or your GPU genuinely can't fit the model you need, it's upgrade time: