RUNLOCALAIv38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
How to create a Docker Compose setup for an AI stack

Intermediate · 20 min · By Fredoline Eruo
Target environment
  • Ubuntu 24.04 · Ollama 0.4.x
  • Windows 11 · Ollama 0.4.x
  • macOS 15 · Ollama 0.4.x
PREREQUISITES

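Docker and Docker Compose installed. For the GPU-backed inference service in step 2, the host also needs an NVIDIA GPU, a working NVIDIA driver, and the NVIDIA Container Toolkit.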

What this does

Defines a multi-service Docker Compose file that orchestrates an inference server alongside supporting services, enabling repeatable startup of a complete local AI deployment with a single command.

Steps

  1. Create the project directory and docker-compose.yml file.

    mkdir ai-stack && cd ai-stack
    touch docker-compose.yml
    

    Expected output: None. Both commands succeed silently and leave an empty docker-compose.yml in the new ai-stack directory.

  2. Define the inference service with GPU access and volume mounts.

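    services:
      inference:
        image: vllm/vllm-openai:latest
        deploy:
          resources:
            reservations:
              devices:
                # Requires the NVIDIA Container Toolkit on the host.
                - driver: nvidia
                  count: all
                  capabilities: [gpu]
        volumes:
          # Host ./models is mounted read-only at /models inside the container.
          - ./models:/models:ro
        ports:
          - "8000:8000"
        # Arguments passed to the vLLM server; /models/llama-model must exist under ./models.
        command: --model /models/llama-model --gpu-memory-utilization 0.85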
    

    Expected output: Running docker compose config prints the resolved configuration without errors.

  3. Add an Ollama service for model management.

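    # Add under the existing services: key, at the same indentation as inference.
      ollama:
        image: ollama/ollama:latest
        # No GPU reservation is declared here, so this container runs on CPU.
        volumes:
          - ollama-data:/root/.ollama
        ports:
          - "11434:11434"
    # Top-level key, at the same indentation as services:. Declares the named volume.
    volumes:
      ollama-data: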
    

    Expected output: docker compose config still validates and now lists both inference and ollama under services.

  4. Start all services in detached mode.

    docker compose up -d
    

    Expected output: Compose reports each container as started; docker compose ps should then list both services as running.
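
The validation mentioned in steps 2 and 3 and the startup in step 4 can be chained so that a broken file never reaches docker compose up. The following is a minimal sketch, assuming it is run from the ai-stack directory; the function name validate_and_start is hypothetical and not part of Docker.

```shell
# Hypothetical helper: validate docker-compose.yml, then start the stack.
validate_and_start() {
  # --quiet validates the file without printing the resolved configuration.
  if ! docker compose config --quiet; then
    echo "docker-compose.yml failed validation; not starting" >&2
    return 1
  fi
  # The bind mount in step 2 expects ./models to exist on the host.
  mkdir -p ./models
  docker compose up -d
}
```

Calling validate_and_start returns a non-zero status and leaves the containers untouched when validation fails.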

Verification

docker compose ps
# Expected: table listing all services with "running" state and mapped ports
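
A running container is not necessarily ready to answer requests, since the inference server may still be loading the model. As a rough sketch, the helper below polls an HTTP endpoint until it responds. The function name wait_for_http and its default retry values are assumptions for illustration, and curl must be installed on the host.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
  url=$1
  attempts=${2:-30}
  delay=${3:-5}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors; -sS hides progress but keeps errors.
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      echo "ready: $url"
      return 0
    fi
    # Pause only if another attempt remains.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}
```

With the ports from the compose file above, plausible targets are wait_for_http http://localhost:8000/v1/models for the vLLM OpenAI-compatible server and wait_for_http http://localhost:11434/api/tags for Ollama.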

Common failures

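  • GPU not available to container: confirm that the host has a working NVIDIA driver and the NVIDIA Container Toolkit, and that the deploy.resources.reservations.devices block is present on the service.
  • Volume mount path is missing or empty: create the directory with mkdir -p ./models and place the model files in it before running docker compose up.
  • YAML syntax error: run docker compose config to validate the file, and check indentation first.
  • Port conflict: each service needs a unique host port, and that port must not already be in use on the host (for example, by a natively installed Ollama listening on 11434).
  • Image pull failure: verify the image name and tag with docker pull before referencing it.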
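
For the GPU failure in particular, it helps to find out whether the problem is on the host or inside the container. The sketch below assumes the service is named inference as in step 2, that the stack is already running, and that nvidia-smi is reachable inside the container, which depends on the image and the NVIDIA runtime. The function name check_gpu is hypothetical.

```shell
# Hypothetical helper: report where GPU visibility breaks.
# Returns 1 if the host has no usable GPU, 2 if only the container lacks it.
check_gpu() {
  # Host side: nvidia-smi ships with the NVIDIA driver; -L lists detected GPUs.
  if ! nvidia-smi -L; then
    echo "host: no NVIDIA GPU or driver detected" >&2
    return 1
  fi
  # Container side: -T disables pseudo-TTY allocation so this works in scripts.
  if ! docker compose exec -T inference nvidia-smi -L; then
    echo "container: GPU not visible; check the NVIDIA Container Toolkit and the deploy block" >&2
    return 2
  fi
  echo "GPU visible on host and in container"
}
```

A return code of 1 points at the driver installation, while 2 points at the container runtime or the compose file.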

Operator checkpoint

Before treating this as solved, write down the local runtime, model or package version, hardware/backend if relevant, and the verification output. This keeps the guide useful as a Will-It-Run style decision instead of a one-off command transcript.
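
One way to capture the checkpoint is to write the versions and the verification output to a file next to the compose file. This is a sketch only: the function name record_checkpoint and the default file name checkpoint.txt are assumptions, and the GPU query is skipped gracefully when nvidia-smi is absent.

```shell
# Hypothetical helper: save runtime versions and verification output to a notes file.
# Usage: record_checkpoint [OUTPUT_FILE]
record_checkpoint() {
  out=${1:-checkpoint.txt}
  {
    echo "date: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    echo "--- docker engine version"
    docker version --format '{{.Server.Version}}'
    echo "--- docker compose version"
    docker compose version
    echo "--- images used by the stack"
    docker compose images
    echo "--- service status"
    docker compose ps
    echo "--- gpu"
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv \
      || echo "(nvidia-smi not available)"
  } > "$out" 2>&1
  echo "wrote $out"
}
```

Running record_checkpoint after a successful verification leaves a dated record that can be compared against later runs.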

Related guides

  • How to run Ollama in Docker
  • How to run vLLM in Docker
  • Course: Local AI Fundamentals