RUNLOCALAI · v38
Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

fatal · ✓ Editorial · Reviewed May 2026

Docker can't see GPU — wire up the NVIDIA Container Toolkit

Docker doesn't expose the host GPU by default. The NVIDIA Container Toolkit is the bridge. Here's the install, the runtime config, and the common symptoms that mean it's missing or misconfigured.

Docker · NVIDIA Container Toolkit · Linux · WSL2 + Docker Desktop · Kubernetes
By Fredoline Eruo · Last verified 2026-05-08

Diagnostic order — most likely first

#1

NVIDIA Container Toolkit not installed

Diagnose

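`docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi` returns 'could not select device driver "" with capabilities: [[gpu]]'. (NVIDIA's CUDA image tags carry an OS suffix such as `-ubuntu22.04`; a bare `12.4.0-base` tag will not pull.)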

Fix

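Install: the package is not in the stock Ubuntu/Debian repositories, so add NVIDIA's apt repository for the toolkit first if it isn't configured yet. Then run `sudo apt update && sudo apt install nvidia-container-toolkit`, then `sudo nvidia-ctk runtime configure --runtime=docker`, then `sudo systemctl restart docker`. Re-run the test command.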
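To confirm the install took effect before re-running the test container, a small check like the following can help. This is a minimal sketch, not NVIDIA tooling: `has_nvidia_runtime` and `report_toolkit_state` are hypothetical helper names, and the sketch assumes Docker's config lives at `/etc/docker/daemon.json` and that `nvidia-ctk runtime configure --runtime=docker` recorded an `nvidia-container-runtime` entry there.

```shell
# Sketch: check that the toolkit CLI exists and that Docker knows the
# nvidia runtime. Helper names are hypothetical, chosen for this example.

# Returns 0 if the given daemon.json mentions nvidia-container-runtime,
# the binary that `nvidia-ctk runtime configure` registers as a runtime.
has_nvidia_runtime() {
  local config="${1:-/etc/docker/daemon.json}"
  [ -r "$config" ] && grep -q 'nvidia-container-runtime' "$config"
}

# Prints one status line for the CLI and one for the Docker runtime entry.
report_toolkit_state() {
  local config="${1:-/etc/docker/daemon.json}"
  if command -v nvidia-ctk >/dev/null 2>&1; then
    echo "nvidia-ctk: found"
  else
    echo "nvidia-ctk: missing (toolkit not installed)"
  fi
  if has_nvidia_runtime "$config"; then
    echo "docker runtime: nvidia registered in $config"
  else
    echo "docker runtime: nvidia not registered in $config"
  fi
}
```

Calling `report_toolkit_state` with no argument inspects the default path. If the CLI is found but the runtime is not registered, the `nvidia-ctk runtime configure` step (or the Docker restart after it) is the likely gap.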

#2

Toolkit installed on the host, but the container was started without `--gpus all`

Diagnose

`docker exec` into the container and run `nvidia-smi`: it fails. On the host, the same command shows the GPU.

Fix

Pass `--gpus all` to `docker run`. The flag can't be added to a running container, so stop it and create it again. For docker-compose: add `deploy.resources.reservations.devices: [{ driver: nvidia, count: all, capabilities: [gpu] }]` under the service.
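Written out in block form, the compose reservation looks like the file generated below. This is a minimal sketch: `write_gpu_compose` is a hypothetical helper, the service name `gpu-test` is an assumption, and the image tag is only an example of an OS-suffixed CUDA base image.

```shell
# Sketch: write a docker-compose.yml that reserves all NVIDIA GPUs for
# one service. Helper name and service name are illustrative assumptions.
write_gpu_compose() {
  local dir="${1:-.}"
  cat > "$dir/docker-compose.yml" <<'YAML'
services:
  gpu-test:
    image: nvidia/cuda:12.4.0-base-ubuntu22.04
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
YAML
}
```

After `write_gpu_compose .`, running `docker compose up` in that directory should print the `nvidia-smi` table if the toolkit is wired up correctly.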

#3

Wrong base image (CUDA version mismatch)

Diagnose

Container starts but PyTorch / TensorFlow inside fails with 'CUDA driver version is insufficient' or kernel module errors.

Fix

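Match the base image's CUDA version to your host driver: the host driver must be at least as new as the CUDA release in the image expects. The CUDA 12.4 toolkit ships alongside the 550 driver series, so treat driver ≥ 550 as the safe target for a 12.4 image. Use NVIDIA's official `nvidia/cuda:<version>-runtime-ubuntu22.04` images. The `<version>` part of the tag maps to a known driver requirement; the suffix only selects the OS base.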
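The driver comparison can be scripted. The sketch below is illustrative only: `driver_major` and `driver_meets_minimum` are hypothetical helpers, and the 550 threshold is the figure quoted above for CUDA 12.4 images, not something the script looks up.

```shell
# Sketch: compare a host driver version string against a minimum major
# version. Helper names are hypothetical, chosen for this example.

# Prints the major component of a version such as 550.54.14.
driver_major() {
  printf '%s\n' "${1%%.*}"
}

# driver_meets_minimum VERSION MIN_MAJOR
# Returns 0 when VERSION's major component is >= MIN_MAJOR.
driver_meets_minimum() {
  [ "$(driver_major "$1")" -ge "$2" ]
}
```

On a host with a working driver, the version string can come from `nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1`; passing that value and `550` to `driver_meets_minimum` tells you whether a 12.4 image is a safe choice.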

#4

Docker Desktop on Windows without WSL Integration enabled

Diagnose

WSL terminal sees the GPU. Docker Desktop containers don't.

Fix

Docker Desktop → Settings → Resources → WSL Integration → enable for your distro. Restart Docker Desktop. Also ensure WSL2 GPU passthrough works first (see /troubleshooting/wsl-gpu-not-detected).

#5

Kubernetes / k3s missing the device plugin

Diagnose

Single-host Docker works fine. Kubernetes pods can't request GPU.

Fix

Install the `nvidia-device-plugin` DaemonSet: `kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/...`. Then request the GPU in pod specs by setting `nvidia.com/gpu: 1` under the container's `resources.limits`.
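A one-off test pod makes it easy to confirm the device plugin is advertising the GPU. The following is a minimal sketch under stated assumptions: `write_gpu_test_pod` is a hypothetical helper, the pod and container names are made up for the example, and the image tag is just one OS-suffixed CUDA base image.

```shell
# Sketch: write a pod manifest that requests one GPU and runs nvidia-smi.
# Helper name, pod name, and container name are illustrative assumptions.
write_gpu_test_pod() {
  local out="${1:-gpu-test-pod.yaml}"
  cat > "$out" <<'YAML'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.0-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
YAML
}
```

After `write_gpu_test_pod`, apply it with `kubectl apply -f gpu-test-pod.yaml` and read the result with `kubectl logs gpu-test`. If the pod stays Pending, `kubectl describe nodes | grep nvidia.com/gpu` shows whether any node advertises the resource at all. Some k3s setups may additionally need a `runtimeClassName` on the pod; that is not covered by this sketch.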

Frequently asked questions

Do containers add inference overhead vs running on the host?

Negligible. The NVIDIA Container Toolkit exposes the host's GPU device nodes and driver libraries directly inside the container, so there's no virtualization layer between the workload and the card. Inference performance is typically within about 1% of host-native.

Can I share a GPU across multiple containers?

Yes, by default: multiple containers started with `--gpus all` see the same card and share its VRAM. To partition the card, use MIG (Multi-Instance GPU, on A100/H100) for hardware-level isolation, or NVIDIA MPS (Multi-Process Service) for finer-grained concurrent sharing. Most local-AI workflows don't need either.

Why use Docker for local AI at all?

Reproducible runtime environments, isolation from host driver chaos, and easy switching between CUDA versions for different projects. Trade-off: complexity. For solo workflows, a host-native conda env is often simpler.

Related troubleshooting

WSL2 cannot see GPU / nvidia-smi fails inside WSL

WSL2 doesn't pass the GPU through unless the host driver is right and the kernel is current. Here's the install order that actually works in 2026, and how to confirm passthrough is live before you waste an afternoon.

CUDA out of memory

Why CUDA OOM happens during local LLM inference and image gen, how to confirm the real cause, and the four real fixes (smaller quant, shorter context, gradient checkpointing, or more VRAM).

torch.cuda.is_available() returns False

PyTorch falsely reporting no CUDA is the most common Python ML setup failure. The cause is almost always: wrong PyTorch wheel for your CUDA version, or a CPU-only build accidentally installed.

When the fix is hardware

A surprising fraction of troubleshooting tickets resolve to: this card doesn't have enough VRAM for what you're asking it to do. If you're hitting OOM after every reasonable fix, or your GPU genuinely can't fit the model you need, it's upgrade time:

  • Best GPU for local AI
  • Best laptop for local AI
  • Best Mac for local AI
