Docker on Linux for AI — Local AI on Linux (Chapter 9)

Docker on Linux is the only configuration where GPU containers run with native performance, no virtualization layer, and direct device access. The nvidia-container-toolkit replaces the older nvidia-docker2 package.

Install Docker:

curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
newgrp docker

Install the NVIDIA container toolkit:

distribution=$(. /etc/os-release && echo "$ID$VERSION_ID")
curl -fsSL https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt update
sudo apt install nvidia-container-toolkit
sudo systemctl restart docker

Test GPU passthrough:

docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 \
  nvidia-smi

Test a full inference container:

docker run --rm --gpus all \
  -v /path/to/models:/models \
  ghcr.io/ggerganov/llama.cpp:server \
  ./server -m /models/mistral-7b-q4_k_m.gguf -ngl 99 -host 0.0.0.0

Failure mode: docker run --gpus all returns docker: Error response from daemon: could not select runtime: nvidia-container-runtime not found. The container runtime hook was not registered. Run sudo nvidia-ctk runtime configure --runtime=docker and restart Docker.

Failure mode: GPU device not found inside container. Check docker run --rm --gpus all ubuntu nvidia-smi works but a specific image fails. The failing image was built without CUDA base layers. Rebuild it with FROM nvidia/cuda:12.4.0-runtime-ubuntu22.04.

Failure mode: Docker fails to start after installing the container toolkit. docker ps returns Cannot connect to the Docker daemon. The nvidia-container-runtime package may have installed conflicting dependencies. Check apt list --installed | grep nvidia and remove conflicting packages.

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.