08. Docker Issues
GPU Passthrough Failures
# Verify NVIDIA runtime is Docker's default
docker info | grep -i nvidia
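# The output should include: "Default Runtime: nvidia"
# If not, set it in /etc/docker/daemon.json.
# Note: tee overwrites the file. If daemon.json already exists, merge these keys into it by hand instead.
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
EOF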
sudo systemctl restart docker
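After the restart, it is worth confirming that the setting took effect. The following is a minimal sketch, not an official tool: default_runtime and nvidia_is_default are hypothetical helper names, and the parsing assumes that docker info prints a line of the form "Default Runtime: <name>".

```shell
# Print the default runtime from `docker info` text read on stdin.
default_runtime() {
  sed -n 's/^[[:space:]]*Default Runtime:[[:space:]]*//p'
}

# Hypothetical helper: succeed only if the default runtime is "nvidia".
nvidia_is_default() {
  [ "$(default_runtime)" = "nvidia" ]
}

# Usage against a live daemon (needs Docker installed; shown as a comment):
#   docker info 2>/dev/null | nvidia_is_default \
#     && echo "nvidia is the default runtime" \
#     || echo "nvidia is NOT the default runtime; check /etc/docker/daemon.json"
```

For an end-to-end check, running nvidia-smi inside a container exercises the whole path, for example docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi. The image tag here is an assumption; substitute any CUDA base image available to you.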
Volume Mount Failures
Models stored on the host must be mounted into the container:
# Correct: mount the directory containing the model
docker run -v /home/user/models:/models your-image python generate.py --model /models/llama-2-7b
# Incorrect: mounting a single file exposes only that file (no weights or tokenizer) and can cause permission issues
docker run -v /home/user/models/llama-2-7b/config.json:/config.json your-image
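Many mount failures come from a host path that is missing, unreadable, or not absolute. The sketch below checks the directory before handing it to docker run. The function name model_mount_flag is hypothetical, and the read-only (:ro) suffix is an assumption that fits inference workloads; drop it if the container needs to write into the directory.

```shell
# Hypothetical helper: print a value for `docker run -v` for a model directory,
# or fail with a message if the host path is not a usable directory.
model_mount_flag() {
  host_dir=$1
  container_dir=${2:-/models}
  if [ ! -d "$host_dir" ]; then
    echo "not a directory: $host_dir" >&2
    return 1
  fi
  if [ ! -r "$host_dir" ] || [ ! -x "$host_dir" ]; then
    echo "directory not readable: $host_dir" >&2
    return 1
  fi
  # Resolve to an absolute path: a bare name such as "models" would be
  # interpreted by docker as a named volume, not a host directory.
  abs_dir=$(cd "$host_dir" && pwd)
  printf '%s:%s:ro\n' "$abs_dir" "$container_dir"
}

# Usage (needs Docker installed; shown as a comment):
#   docker run -v "$(model_mount_flag /home/user/models)" your-image \
#     python generate.py --model /models/llama-2-7b
```

If the function fails, the command substitution is empty and docker run rejects the mount, so the error surfaces before the container starts rather than as a missing-file error inside it.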
Image Build Failures
# Build with a CUDA version that the host driver supports.
# Build arguments only take effect if the Dockerfile declares matching ARG lines.
docker build \
  --build-arg CUDA_VERSION=12.1 \
  --build-arg TORCH_CUDA_ARCH_LIST="8.0;8.6;8.9;9.0" \
  -t your-model-image .
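The CUDA_VERSION build argument is only useful if the host driver supports that CUDA release. The header printed by nvidia-smi includes a "CUDA Version" field giving the highest CUDA version the installed driver supports, so a quick comparison before building can catch a mismatch early. This is a rough sketch: host_cuda_version and version_le are hypothetical helpers, and the comparison handles only major.minor versions.

```shell
# Extract the "CUDA Version" field from nvidia-smi header text read on stdin.
host_cuda_version() {
  sed -n 's/.*CUDA Version:[[:space:]]*\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed when version $1 is less than or equal to version $2 (major.minor only).
version_le() {
  a_major=${1%%.*}; a_minor=${1#*.}
  b_major=${2%%.*}; b_minor=${2#*.}
  [ "$a_major" -lt "$b_major" ] || {
    [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -le "$b_minor" ]
  }
}

# Usage (needs an NVIDIA driver and Docker; shown as a comment):
#   host=$(nvidia-smi | host_cuda_version)
#   if version_le 12.1 "$host"; then
#     docker build --build-arg CUDA_VERSION=12.1 -t your-model-image .
#   else
#     echo "host driver supports CUDA $host, which is older than 12.1" >&2
#   fi
```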
Local verification checkpoint
Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
Build a Docker image containing a simple Python inference script. Verify GPU access inside the container by running nvidia-smi. Tag the image, push it to a registry, and pull it on a different machine to confirm reproducibility.