RUNLOCALAI · v38
Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Ollama — Installation to Mastery

13. Docker Deployment

Chapter 13 of 20 · 20 min
KEY INSIGHT

Docker volumes persist models between container lifecycles. Without a volume, every `docker run` starts with no models installed.

Running Ollama in Docker isolates it from the host system and simplifies deployment. NVIDIA GPU passthrough requires the nvidia-container-toolkit on Linux, or GPU support enabled in Docker Desktop on Windows (which relies on the WSL 2 backend).

Basic Docker Setup

# Pull and run the official image
docker run -d --name ollama -p 11434:11434 ollama/ollama

This starts Ollama in detached mode. Access the API at http://localhost:11434.
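Because the container starts in the background, the API may not answer for a moment after docker run returns. A minimal sketch of a readiness check, assuming curl is installed on the host; the function name wait_for_ollama and its arguments are illustrative, not part of Ollama or Docker:

```shell
# Poll the Ollama HTTP endpoint until it answers or the attempts run out.
# Arguments (all optional): base URL, max attempts, seconds between attempts.
wait_for_ollama() {
    url="${1:-http://localhost:11434}"
    attempts="${2:-30}"
    interval="${3:-1}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: discard the body
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}
```

Call it as wait_for_ollama && echo "Ollama is up" before sending the first request from a script.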

Persisting Models

Models downloaded inside the container are lost when the container is removed. Mount a volume to persist them:

docker run -d --name ollama \
    -p 11434:11434 \
    -v ollama-data:/root/.ollama \
    ollama/ollama

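The -v ollama-data:/root/.ollama flag mounts a named Docker volume at /root/.ollama, the directory where the container stores its models. The data survives container removal, so models remain available after the container is recreated or the image is upgraded.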
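One way to see the volume at work is an image upgrade: the old container is thrown away, and the new one picks up the same models. A sketch under the chapter's naming (container ollama, volume ollama-data); the wrapper function upgrade_ollama is a hypothetical helper, not an Ollama command:

```shell
# Replace the running container with one created from the latest image,
# reusing the ollama-data volume so downloaded models carry over.
# Each step runs only if the previous one succeeded.
upgrade_ollama() {
    docker pull ollama/ollama &&
    docker stop ollama &&
    docker rm ollama &&
    docker run -d --name ollama \
        -p 11434:11434 \
        -v ollama-data:/root/.ollama \
        ollama/ollama
}
```

The chain stops at the first failing step, so a missing container (nothing to stop) leaves the host untouched rather than starting a duplicate.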

Running Specific Models

# Pull a model inside the container
docker exec ollama ollama pull llama3.2:1b

# Run the model
docker exec -it ollama ollama run llama3.2:1b

Or use the API directly:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:1b",
  "prompt": "Hello"
}'
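By default, /api/generate streams its reply as a sequence of JSON objects. For scripting, a single JSON object is often easier to handle, which the API returns when the request sets "stream": false. A small sketch; the helper names generate_payload and ask_ollama are illustrative only:

```shell
# Build a /api/generate request body with streaming turned off.
# Caveat: no JSON escaping is done, so keep quotes and backslashes out of the arguments.
generate_payload() {
    printf '{"model": "%s", "prompt": "%s", "stream": false}\n' "$1" "$2"
}

# Send the prompt and print the raw JSON reply.
# The generated text is in the reply's "response" field.
ask_ollama() {
    curl -s http://localhost:11434/api/generate -d "$(generate_payload "$1" "$2")"
}
```

For example, ask_ollama llama3.2:1b Hello prints one JSON document instead of a stream of partial ones.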

GPU Passthrough (NVIDIA)

Requires nvidia-container-toolkit installed on the host:

docker run -d --name ollama \
    --gpus all \
    -p 11434:11434 \
    -v ollama-data:/root/.ollama \
    ollama/ollama

The --gpus all flag passes through all NVIDIA GPUs. Verify with:

docker exec ollama nvidia-smi
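For scripts, the same check can be reduced to a count. The sketch below relies on nvidia-smi -L, which prints one line per GPU beginning with "GPU"; the function name container_gpu_count is an assumption made for illustration:

```shell
# Count the GPUs that nvidia-smi reports inside the container.
# Prints the count and returns non-zero when no GPU is visible.
container_gpu_count() {
    name="${1:-ollama}"
    # grep -c prints 0 and exits non-zero on no match, hence the "|| true"
    count=$(docker exec "$name" nvidia-smi -L 2>/dev/null | grep -c '^GPU ' || true)
    echo "$count"
    [ "$count" -gt 0 ]
}
```

A result of 0 usually points back to the host: either the toolkit is missing or the container was started without --gpus all.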

GPU Passthrough (AMD)

AMD GPUs require ROCm-enabled images. Use the ollama/ollama:rocm tag:

docker run -d --name ollama \
    --device /dev/kfd \
    --device /dev/dri \
    -p 11434:11434 \
    -v ollama-data:/root/.ollama \
    ollama/ollama:rocm
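The two --device flags only work if those device nodes exist on the host. A quick pre-flight sketch; missing_devices is a hypothetical helper, and a missing node most likely means the AMD driver stack is not loaded, though other causes are possible:

```shell
# Print each path from the argument list that does not exist on this host.
# Prints nothing when every path is present.
missing_devices() {
    for dev in "$@"; do
        [ -e "$dev" ] || echo "$dev"
    done
}
```

Run missing_devices /dev/kfd /dev/dri before starting the container; empty output means both nodes are there to pass through.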

Resource Limits

Constrain CPU and memory usage:

docker run -d --name ollama \
    --cpus="2" \
    --memory="4g" \
    -p 11434:11434 \
    ollama/ollama

This limits the container to 2 CPU cores and 4 GB of RAM, which is useful in shared environments.
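To confirm the limits took effect, docker inspect exposes them as HostConfig.NanoCpus (billionths of a CPU) and HostConfig.Memory (bytes). A sketch that converts both to friendlier units; the function name show_limits is illustrative, and fractional CPU values are truncated by the integer division:

```shell
# Print the CPU and memory limits Docker recorded for a container,
# converted from nano-CPUs and bytes to whole cores and MiB.
# A value of 0 means no limit was set.
show_limits() {
    docker inspect --format '{{.HostConfig.NanoCpus}} {{.HostConfig.Memory}}' "${1:-ollama}" |
    while read -r nano bytes; do
        echo "cpus=$((nano / 1000000000)) memory_mib=$((bytes / 1048576))"
    done
}
```

For a container started with --cpus="2" and --memory="4g", this reports cpus=2 memory_mib=4096.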

Failure Modes

  • GPU not detected in container - Verify nvidia-smi works on the host, then check that the container was started with --gpus all or, on hosts where the NVIDIA runtime is registered with Docker, with the explicit runtime (docker run --runtime=nvidia ...).
  • Port conflict - Another service on the host uses port 11434. Use -p 11435:11434 to map to a different port.
  • Volume permission issues - With a bind-mounted host directory, the user that Ollama runs as inside the container may lack permission to write to it. Use named volumes (like ollama-data) instead of host paths.
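The port conflict case can be checked before the container is started. A rough sketch, assuming a Linux host where ss (from iproute2) is available; both function names are hypothetical helpers:

```shell
# Return success when something on the host is already listening on the given TCP port.
# Assumes ss is installed; matches the port at the end of the local-address column.
port_in_use() {
    ss -ltn | grep -q ":$1 "
}

# Print the first port at or after the given one that nothing is listening on.
first_free_port() {
    port="$1"
    while port_in_use "$port"; do
        port=$((port + 1))
    done
    echo "$port"
}
```

The result can feed the mapping directly, for example -p "$(first_free_port 11434)":11434, as long as clients are then pointed at whichever host port was chosen.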
EXERCISE

Start Ollama in Docker, pull a model, stop the container, start a new container with the same volume, and verify the model is still available with ollama list.
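One possible walkthrough of the exercise, written as a single function so it can be rerun. It is a sketch, not the only solution: the name persistence_check and the two-second pauses (a crude wait for the server to start) are assumptions, and it expects no container named ollama to exist yet.

```shell
# Pull a model, discard the container, start a fresh one on the same volume,
# and succeed only if the model is still listed afterwards.
persistence_check() {
    model="${1:-llama3.2:1b}"
    docker run -d --name ollama -p 11434:11434 \
        -v ollama-data:/root/.ollama ollama/ollama &&
    sleep 2 &&                       # give the server a moment to start
    docker exec ollama ollama pull "$model" &&
    docker stop ollama &&
    docker rm ollama &&
    docker run -d --name ollama -p 11434:11434 \
        -v ollama-data:/root/.ollama ollama/ollama &&
    sleep 2 &&
    docker exec ollama ollama list | grep -q "$model"
}
```

Running persistence_check && echo "model persisted" covers every step of the exercise; repeating it without the -v flag shows the opposite outcome.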
