Windows can't find CUDA — fix the driver / toolkit / PATH chain
Windows CUDA loading errors trace to a driver-vs-toolkit version skew, a PATH that doesn't include CUDA bin, or a CPU-only PyTorch wheel. Check nvidia-smi first, then the wheel suffix, then PATH.
Diagnostic order — most likely first
NVIDIA driver too old for installed CUDA toolkit
Run `nvidia-smi` in PowerShell. The `CUDA Version` field in the upper right of the header is the highest CUDA version the driver supports, not the version of any installed toolkit. If it shows 12.0 but PyTorch wants 12.4, that's the gap.
Update the driver from nvidia.com (Game Ready or Studio, both fine). Reboot. On Windows, CUDA 12.4 needs driver 551.61 or newer. Verify with `nvidia-smi` showing CUDA Version: 12.4 or higher.
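The same check can be scripted. This is a minimal sketch, assuming `nvidia-smi` prints a header containing `CUDA Version: X.Y`; the helper names `parse_max_cuda` and `driver_supports` are hypothetical, and the `(12, 4)` target is just the example version used above.

```python
import re
import shutil
import subprocess


def parse_max_cuda(smi_output):
    """Return the 'CUDA Version' from nvidia-smi's header as (major, minor), or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_supports(smi_output, required):
    """True if the driver's max CUDA version is at least `required`, e.g. (12, 4)."""
    found = parse_max_cuda(smi_output)
    return found is not None and found >= required


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found: the NVIDIA driver is probably not installed")
    else:
        result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
        print("driver max CUDA:", parse_max_cuda(result.stdout))
        print("OK for CUDA 12.4:", driver_supports(result.stdout, (12, 4)))
```

Comparing versions as `(major, minor)` tuples avoids the string-comparison trap where "12.10" would sort before "12.4".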
CPU-only PyTorch wheel was installed
`python -c "import torch; print(torch.__version__)"` shows e.g. `2.5.1+cpu`. The `+cpu` suffix is the smoking gun.
Reinstall correctly: `pip install --upgrade --force-reinstall torch torchvision --index-url https://download.pytorch.org/whl/cu124`. Verify version ends in `+cu124`.
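To check the suffix from a script rather than by eye, something like the following could work. It is a sketch only: `wheel_flavor` is a hypothetical helper, and it assumes the build tag is whatever follows the `+` in `torch.__version__`.

```python
def wheel_flavor(version):
    """Return the local build tag of a torch version string, e.g. 'cpu' or 'cu124'.

    Returns None when the version has no '+tag' suffix, in which case the
    version string alone does not say which build is installed.
    """
    _, plus, tag = version.partition("+")
    return tag if plus else None


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("torch", torch.__version__, "-> build tag:", wheel_flavor(torch.__version__))
```

A tag of `cpu` means the reinstall above is still needed; `cu124` means the CUDA 12.4 wheel is in place.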
CUDA Toolkit not in PATH
`nvcc --version` fails with an error that `nvcc` is not recognized as a command. CUDA was installed but its `bin` folder was not added to System PATH.
Add to PATH via System Properties → Advanced → Environment Variables. Add `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin` to System PATH. Restart any open terminal.
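After restarting the terminal, a quick way to confirm the entry took effect is to inspect PATH directly. The sketch below assumes the default install location under `NVIDIA GPU Computing Toolkit\CUDA`; `cuda_entries` is a hypothetical helper, and the `;` separator is the Windows PATH separator.

```python
import os
import shutil

# Assumption: the toolkit lives under the default install folder name.
CUDA_MARKER = r"NVIDIA GPU Computing Toolkit\CUDA"


def cuda_entries(path_value, sep=";"):
    """Return PATH entries that point inside a CUDA Toolkit install."""
    entries = [e.strip() for e in path_value.split(sep) if e.strip()]
    return [e for e in entries if CUDA_MARKER.lower() in e.lower()]


if __name__ == "__main__":
    found = cuda_entries(os.environ.get("PATH", ""))
    print("CUDA entries on PATH:", found or "none")
    # shutil.which returns None when nvcc cannot be resolved from PATH.
    print("nvcc resolves to:", shutil.which("nvcc"))
```

If more than one CUDA version shows up in the output, the first entry in PATH order is the one whose `nvcc` runs.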
Multiple CUDA versions creating DLL conflicts
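Older CUDA installs left their runtime DLLs (for example `cudart64_*.dll`) in System32 or on PATH, so the app loads the wrong copy. `nvcuda.dll` itself belongs to the driver and should resolve only to System32.
Uninstall old CUDA Toolkits via Control Panel. Keep one version. Restart Windows after. Verify with `where.exe nvcuda.dll` showing one path (in PowerShell, plain `where` is an alias for `Where-Object`, so use `where.exe`).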
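A rough Python equivalent of the `where` check walks every PATH directory and lists each copy of a DLL it finds. This is a sketch under assumptions: `find_dll_copies` is a hypothetical helper, `cudart64_12.dll` is assumed as the runtime DLL name for a CUDA 12.x install, and unlike `where` it does not search the current directory.

```python
import os
from pathlib import Path


def find_dll_copies(dll_name, directories):
    """Return the full path of every copy of `dll_name` found in `directories`, in order."""
    hits = []
    for directory in directories:
        candidate = Path(directory) / dll_name
        # Skip directories listed twice so one file is not reported as a conflict.
        if candidate.is_file() and candidate not in hits:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    for name in ("nvcuda.dll", "cudart64_12.dll"):
        copies = find_dll_copies(name, path_dirs)
        print(name, "->", [str(c) for c in copies] or "not on PATH")
```

More than one path for the same DLL name is the conflict to clean up; Windows loads the first match in search order.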
Antivirus / Windows Defender blocking CUDA DLL
Rare but real. Defender or third-party AV quarantines a CUDA component. Check Windows Security → Protection History.
Restore the file. Add CUDA Toolkit folder to Windows Defender exclusions. Re-verify the install: `nvidia-smi` + `nvcc --version`.
Frequently asked questions
Should I install CUDA Toolkit on Windows for PyTorch?
Not strictly required — PyTorch's wheel ships its own CUDA runtime. You only need a separate CUDA Toolkit install if you're compiling something (Flash Attention, custom kernels). For pure inference + most fine-tuning, the wheel is enough.
Driver vs toolkit — what's the difference?
The driver (nvidia-smi) is what the OS uses to talk to the GPU. The toolkit (nvcc) is the compiler + libraries for building CUDA code. PyTorch wheels include the libraries you need; they only require a compatible driver.
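To see the split from inside Python, the wheel can report which CUDA runtime it bundles and whether the driver lets it reach a GPU. A minimal sketch, assuming PyTorch may or may not be installed; `cuda_report` is a hypothetical helper name.

```python
def cuda_report():
    """Collect what the installed PyTorch wheel knows about CUDA, without raising."""
    try:
        import torch
    except ImportError:
        return {"torch": None}
    report = {
        "torch": torch.__version__,
        # CUDA runtime version the wheel was built with; None on CPU-only builds.
        "bundled_cuda_runtime": torch.version.cuda,
        # True only if a compatible driver and GPU are actually reachable.
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

A non-empty `bundled_cuda_runtime` with `cuda_available: False` points at the driver side; a `None` runtime points at a CPU-only wheel.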
Should I use WSL2 instead of Windows-native CUDA?
For most workflows, WSL2 + Ubuntu is more reliable than Windows-native. It has better runtime support (vLLM, FlashAttention), simpler troubleshooting, and near-native performance. Windows-native works for ComfyUI / A1111 / Ollama; for serious development, WSL2 wins.
Related troubleshooting
When PyTorch / vLLM / a CUDA app errors on 'CUDA driver version is insufficient,' the host driver is too old for the CUDA runtime the app was built against. 'No kernel image' usually means the build does not include kernels for your GPU's compute capability. Read nvidia-smi's max CUDA version and match it.
PyTorch falsely reporting no CUDA is the most common Python ML setup failure. The cause is almost always: wrong PyTorch wheel for your CUDA version, or a CPU-only build accidentally installed.
WSL2 doesn't pass the GPU through unless the host driver is right and the kernel is current. Here's the install order that actually works in 2026, and how to confirm passthrough is live before you waste an afternoon.
When the fix is hardware
A surprising fraction of troubleshooting tickets resolve to: this card doesn't have enough VRAM for what you're asking it to do. If you're hitting OOM after every reasonable fix, or your GPU genuinely can't fit the model you need, it's upgrade time.