How to create a Docker Compose setup for an AI stack
Prerequisites: Docker and Docker Compose installed.
What this does
Defines a multi-service Docker Compose file that orchestrates an inference server alongside supporting services, enabling repeatable startup of a complete local AI deployment with a single command.
Steps
Create the project directory and docker-compose.yml file.

    mkdir ai-stack && cd ai-stack
    touch docker-compose.yml

Expected output: Directory created, empty compose file ready for editing.
Define the inference service with GPU access and volume mounts.

    services:
      inference:
        image: vllm/vllm-openai:latest
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: all
                  capabilities: [gpu]
        volumes:
          - ./models:/models:ro
        ports:
          - "8000:8000"
        command: --model /models/llama-model --gpu-memory-utilization 0.85

Expected output: Valid YAML parsed without error by docker compose config.
Add an Ollama service for model management.

      ollama:
        image: ollama/ollama:latest
        volumes:
          - ollama-data:/root/.ollama
        ports:
          - "11434:11434"

    volumes:
      ollama-data:

Expected output: Configuration included without syntax errors.
Start all services in detached mode.

    docker compose up -d

Expected output: Container list shows all services with status "running".
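
A container reported as running may still be loading a model, so it can help to wait until each service actually answers before using it. The sketch below is a generic retry helper, not part of Docker Compose; the function name retry is hypothetical. The example URLs assume the port mappings used in this guide (8000 and 11434), that the vLLM OpenAI-compatible server exposes /v1/models, and that Ollama exposes /api/tags.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
# The command is always run at least once.
retry() {
    retry_attempts=$1
    retry_delay=$2
    shift 2
    retry_i=1
    while :; do
        # Success: stop immediately.
        "$@" && return 0
        # Out of attempts: report failure without sleeping again.
        [ "$retry_i" -ge "$retry_attempts" ] && return 1
        retry_i=$((retry_i + 1))
        sleep "$retry_delay"
    done
}

# Example usage (left commented out because it needs the running stack;
# the endpoints are assumptions, see the note above):
#   retry 60 5 curl -fsS -o /dev/null http://localhost:8000/v1/models
#   retry 30 2 curl -fsS -o /dev/null http://localhost:11434/api/tags
```

The attempt counts and delays are placeholders; large models can take several minutes to load, so the inference check usually needs the more generous budget.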
Verification
docker compose ps
# Expected: table listing all services with "running" state and mapped ports
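
Reading the table by eye works for two services; for a scripted check, one option is to compare the services defined in the compose file with the services currently running. This is a minimal sketch: the helper name missing_services is hypothetical, and the commented usage assumes a Compose v2 release where docker compose ps accepts --status and --services together.

```shell
#!/bin/sh
# missing_services EXPECTED RUNNING
# EXPECTED and RUNNING are newline-separated lists of service names.
# Prints every name from EXPECTED that does not appear in RUNNING.
missing_services() {
    printf '%s\n' "$1" | while IFS= read -r svc; do
        # Skip blank lines in the expected list.
        [ -n "$svc" ] || continue
        # -x matches whole lines, -F treats the name as a literal string.
        printf '%s\n' "$2" | grep -qxF -- "$svc" || printf '%s\n' "$svc"
    done
}

# Example usage (left commented out because it needs Docker):
#   expected=$(docker compose config --services)
#   running=$(docker compose ps --status running --services)
#   missing_services "$expected" "$running"
```

Empty output means every defined service is running; any printed name is a service to inspect with docker compose logs.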
Common failures
- GPU not available to container: Ensure the deploy.resources.reservations.devices configuration is present.
- Volume mount path does not exist: Create the directory with mkdir -p ./models before running docker compose up.
- YAML syntax error: Use docker compose config to validate the file.
- Port conflict between services: Assign unique external ports to each service.
- Image pull failure: Verify the image exists with docker pull before referencing it.
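
The YAML and port-conflict checks above can be partly automated. The sketch below scans the normalized configuration for host ports that are published more than once. It assumes that docker compose config prints ports in long form with a published: key, as recent Compose v2 releases do; the helper name duplicate_host_ports is hypothetical.

```shell
#!/bin/sh
# duplicate_host_ports
# Reads normalized Compose configuration on stdin and prints each host
# port that appears in more than one "published:" entry.
duplicate_host_ports() {
    awk '/published:/ { v = $NF; gsub(/"/, "", v); print v }' | sort | uniq -d
}

# Example usage (left commented out because it needs Docker):
#   docker compose config --quiet && echo "compose file is valid"
#   docker compose config | duplicate_host_ports
```

This only catches conflicts inside the compose file. A port already held by another process on the host still surfaces as an error from docker compose up.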
Operator checkpoint
Before treating this as solved, write down the local runtime, model or package version, hardware/backend if relevant, and the verification output. This keeps the guide useful as a Will-It-Run style decision instead of a one-off command transcript.
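
One way to capture the checkpoint is to append the relevant details to a text file. This is a sketch rather than a required procedure: the function name record_checkpoint and the file name checkpoint.txt are hypothetical, and the GPU query assumes an NVIDIA host with nvidia-smi on the PATH. Tools that are missing are noted in the file instead of causing a failure.

```shell
#!/bin/sh
# record_checkpoint FILE
# Appends runtime, image, hardware, and verification details to FILE.
record_checkpoint() {
    {
        echo "== checkpoint $(date -u +%Y-%m-%dT%H:%M:%SZ) =="
        if command -v docker >/dev/null 2>&1; then
            docker --version || echo "docker --version failed"
            docker compose version || echo "docker compose version failed"
            # Images in use by the stack, then the verification table.
            docker compose images || echo "docker compose images failed"
            docker compose ps || echo "docker compose ps failed"
        else
            echo "docker: not found"
        fi
        if command -v nvidia-smi >/dev/null 2>&1; then
            nvidia-smi --query-gpu=name,driver_version,memory.total \
                --format=csv,noheader || echo "nvidia-smi query failed"
        else
            echo "nvidia-smi: not found"
        fi
    } >> "$1" 2>&1
}

# Example usage, run from the ai-stack directory:
#   record_checkpoint ./checkpoint.txt
```

Because both images in this guide use the latest tag, the image IDs listed by docker compose images are a more reliable record of what ran than the tag alone.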