17. GPU Not Detected

Chapter 17 of 20 · 20 min

When Ollama fails to detect your GPU, it falls back to CPU inference, which is significantly slower. This chapter covers diagnosing and fixing GPU detection issues.

Symptoms

  • ollama ps reports 100% CPU in the PROCESSOR column
  • Load times are long (30+ seconds)
  • Tokens per second is very low (<10)
  • No ollama process appears in nvidia-smi output while a model is running
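The first symptom can be checked mechanically. The sketch below assumes the tabular layout that ollama ps prints (a header row, then one row per loaded model with a PROCESSOR value such as 100% GPU or 100% CPU). The helper name cpu_only_models is made up for illustration.

```shell
# Hypothetical helper: read "ollama ps" output on stdin and print the names
# of loaded models whose PROCESSOR column reads "100% CPU".
cpu_only_models() {
    # NR > 1 skips the header row; $1 is the NAME column.
    awk 'NR > 1 && /100% CPU/ { print $1 }'
}

# Query the live server only when the ollama CLI is installed.
if command -v ollama >/dev/null 2>&1; then
    ollama ps 2>/dev/null | cpu_only_models || true
fi
```

Any model name printed here is running entirely on the CPU. No output means either nothing is loaded or every loaded model is at least partly on a GPU.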

NVIDIA GPU Diagnostics

Check that the NVIDIA driver is installed and can see the GPU:

nvidia-smi

If this fails, either the NVIDIA driver is not installed or nvidia-smi is not on your PATH.

On Linux:

# Check driver version
cat /proc/driver/nvidia/version

# Check CUDA toolkit installation (optional: Ollama bundles the CUDA libraries it needs)
nvcc --version

# Verify device nodes exist
ls -la /dev/nvidia*

Common issues:

  1. Driver not loaded - Run sudo modprobe nvidia or reinstall the driver
  2. nvidia-container-toolkit not installed - Required for Docker GPU passthrough
  3. Multiple NVIDIA GPUs - May need CUDA_VISIBLE_DEVICES to select one
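The first Linux checks above can be combined into one status function. This is a minimal sketch: it only tests for /proc/driver/nvidia/version and /dev/nvidiactl. The function name and its optional root-prefix argument are hypothetical; the prefix exists only so the logic can be tried against a scratch directory.

```shell
# Hypothetical helper: summarize the NVIDIA driver state on Linux.
# The optional first argument is a root prefix used for dry runs.
nvidia_driver_status() {
    root="${1:-}"
    if [ ! -r "$root/proc/driver/nvidia/version" ]; then
        # Kernel module is not loaded: try "sudo modprobe nvidia".
        echo "driver-not-loaded"
    elif [ ! -e "$root/dev/nvidiactl" ]; then
        # Module is loaded but the control device node was not created.
        echo "device-nodes-missing"
    else
        echo "ok"
    fi
}

# Report the state of the current machine.
nvidia_driver_status
```

A result of ok only means the driver side looks healthy. Container setup and GPU selection still need the checks described elsewhere in this chapter.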

On Windows:

# List display adapters (the same information Device Manager shows)
Get-CimInstance Win32_VideoController | Where-Object { $_.Name -like "*NVIDIA*" }

# Reinstall driver if needed
# Download from nvidia.com drivers

Docker GPU Passthrough

Even with nvidia-smi working on the host, the container may not see the GPU:

# Verify nvidia-container-toolkit
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi

If this fails, install nvidia-container-toolkit:

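# Debian/Ubuntu: add NVIDIA's signed apt repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
    sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install -y nvidia-container-toolkit
# Register the NVIDIA runtime with Docker, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker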
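After installing the toolkit, one extra data point is whether Docker lists a runtime named nvidia. The sketch below inspects the JSON that docker info can print for its Runtimes field. The helper name has_nvidia_runtime is hypothetical, and a missing entry is a hint rather than proof: the --gpus test container above remains the authoritative check.

```shell
# Hypothetical helper: read Docker's runtime list (JSON) on stdin and report
# whether a runtime named "nvidia" is registered.
has_nvidia_runtime() {
    if grep -q '"nvidia"'; then
        echo "nvidia runtime registered"
    else
        echo "nvidia runtime missing"
    fi
}

# Query the local daemon only when the docker CLI is installed.
# If the daemon is not reachable, the input is empty and "missing" is printed.
if command -v docker >/dev/null 2>&1; then
    docker info --format '{{json .Runtimes}}' 2>/dev/null | has_nvidia_runtime
fi
```

If the runtime is reported missing, running sudo nvidia-ctk runtime configure --runtime=docker and restarting Docker is the usual way to register it.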

Environment Variable Checks

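# Check if Ollama sees GPUs. Detection is logged by the server, so stop any
# running Ollama instance first, then start the server with debug logging.
OLLAMA_DEBUG=1 ollama serve

Setting OLLAMA_DEBUG=1 makes the server print diagnostic information, including detected GPUs. Run ollama run llama3.2:1b in a second terminal to trigger a model load and watch the server output.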
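Debug output is verbose, so it helps to filter it down to the GPU-related lines. The sketch below is a loose keyword filter, not a parser: the exact log wording varies between Ollama versions, and the helper name gpu_log_lines is made up. The journalctl call assumes a Linux install where Ollama runs as a systemd service named ollama.

```shell
# Hypothetical helper: keep only log lines that mention GPU-related keywords.
gpu_log_lines() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -iE 'gpu|cuda|rocm' || true
}

# On systemd-based installs, filter the most recent service log lines.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -u ollama --no-pager 2>/dev/null | tail -n 200 | gpu_log_lines
fi
```

If the filter prints nothing at all, the server has probably not been started with debug logging, or the service runs under a different name on your system.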

Forcing CPU Mode

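Ollama falls back to the CPU on its own when it finds no GPU. If a GPU is detected but model loading fails or crashes, you can force CPU mode explicitly by hiding the GPU from the server with an invalid device ID:

# Linux (set the variable on the server process, not the client)
CUDA_VISIBLE_DEVICES=-1 ollama serve

# Windows PowerShell
$env:CUDA_VISIBLE_DEVICES = "-1"
ollama serve

Then run ollama run llama3.2:3b from a second terminal. Inference will be slow, but this gives you a working setup while you debug.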

AMD GPU Issues

AMD GPUs require ROCm on Linux:

# Check ROCm installation
rocm-smi

# Verify device
ls /dev/kfd

If ROCm is not installed, follow AMD's Linux installation guide at rocm.docs.amd.com. Ollama's ROCm-based support covers a subset of AMD GPUs on Linux and Windows (see Ollama's GPU documentation for the supported list); it is not available on macOS.
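The AMD device checks can be wrapped the same way. This sketch only tests for the two device paths that ROCm workloads use on Linux, /dev/kfd and /dev/dri. The function name and the optional root-prefix argument are hypothetical; the prefix is there so the logic can be tried against a scratch directory.

```shell
# Hypothetical helper: summarize AMD GPU device availability on Linux.
# The optional first argument is a root prefix used for dry runs.
amd_device_status() {
    root="${1:-}"
    if [ ! -e "$root/dev/kfd" ]; then
        # The amdgpu kernel driver is not loaded or lacks compute support.
        echo "kfd-missing"
    elif [ ! -d "$root/dev/dri" ]; then
        # No render nodes are exposed.
        echo "dri-missing"
    else
        echo "ok"
    fi
}

# Report the state of the current machine.
amd_device_status
```

When both paths exist but Ollama still uses the CPU, check that the user running the server has permission to open them.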

EXERCISE

Run nvidia-smi, then start the server with OLLAMA_DEBUG=1 ollama serve and load llama3.2:1b from a second terminal. Compare the two outputs to identify what Ollama sees (or does not see), then fix the missing component.