How to build a custom Docker image for AI models
Prerequisites: Docker installed, AI model files in GGUF or a compatible format, and a base image chosen.
What this does
Packages a customized AI model and its runtime into a self-contained Docker image that runs anywhere Docker is available. After this guide the image is built, tagged, and confirmed to load the model successfully.
Steps
1. Create the Dockerfile. The Dockerfile defines the base runtime, the model injection point, and the startup behavior.

       FROM ollama/ollama:latest
       COPY ./models/llama3-8b-instruct-q4_K_M.gguf /root/.ollama/models/
       CMD ["serve"]

   Place this file alongside a `models/` directory containing the GGUF file.

2. Build the image with an appropriate tag.

       docker build -t ollama-custom:llama3-8b-q4 .

   Expected output: with the legacy builder, the build finishes with `Successfully built <image-id>` and `Successfully tagged ollama-custom:llama3-8b-q4`. With BuildKit, the default builder in Docker 23.0 and later, the log instead ends with a line such as `naming to docker.io/library/ollama-custom:llama3-8b-q4`.

3. Run and validate the container. The two commands must be entered separately; anything after the image name on the `docker run` line is passed to the container as arguments.

       docker run -d --name ollama-custom -p 11434:11434 ollama-custom:llama3-8b-q4
       sleep 15 && curl -s http://localhost:11434/api/tags

   Expected output: JSON listing the model names available in the container. An empty `models` array means the GGUF file was copied but not imported; see Common failures.

4. Push to a registry if needed, for example for multi-host deployments.

       docker tag ollama-custom:llama3-8b-q4 myregistry.example.com/ollama-custom:llama3-8b-q4
       docker push myregistry.example.com/ollama-custom:llama3-8b-q4

   Expected output: layer upload progress followed by a `Pushed` confirmation for each layer.
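The fixed `sleep 15` in the run step is a guess at how long the server takes to start. A minimal sketch of a polling alternative follows; `wait_for_api` is a hypothetical helper name, and the default attempt count and delay are assumptions to adjust for the host.

```shell
# Sketch: poll the Ollama API until it answers instead of sleeping a fixed time.
# wait_for_api is a hypothetical helper, not part of Docker or Ollama.
wait_for_api() {
  local url="$1"                 # e.g. http://localhost:11434/api/tags
  local max_attempts="${2:-30}"  # assumed default: 30 attempts
  local delay="${3:-1}"          # assumed default: 1 second between attempts
  local attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    # -s: silent, -f: fail on HTTP errors, -o /dev/null: discard the body
    if curl -sf -o /dev/null "$url"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1                       # the API never answered within the budget
}
```

Usage, after `docker run -d ...` has started the container: `wait_for_api http://localhost:11434/api/tags && curl -s http://localhost:11434/api/tags`. If the function returns non-zero, `docker logs ollama-custom` is the next place to look.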
Verification
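The image's entrypoint is already the `ollama` binary, and `ollama list` needs a running server, so run the check inside the container started in the run step:

    docker exec ollama-custom ollama list
    # Expected: table row listing the imported model with its ID, size, and modification time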
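The same check can be scripted against the `/api/tags` response. The sketch below assumes the response carries a `name` field per model and uses a plain text match rather than a JSON parser; `model_listed` and the model name `llama3-custom` are hypothetical.

```shell
# Sketch: succeed if a model name appears in the /api/tags JSON read from stdin.
# model_listed is a hypothetical helper; it matches the "name" field textually,
# which is enough for a smoke test but is not a full JSON parse.
model_listed() {
  local model="$1"
  # Match "name":"<model>" or "name":"<model>:<tag>"
  grep -q "\"name\":[[:space:]]*\"${model}[\":]"
}
```

Usage: `curl -s http://localhost:11434/api/tags | model_listed llama3-custom && echo "model present"`.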
Common failures
- `COPY failed: file not found`: The model file is not in `models/` relative to the build context. Verify with `ls models/` from the build directory.
- `connection refused` when querying the API: The container failed to start or the server is not ready yet. Run `docker logs <container-id>` to inspect the startup error.
- `exec format error`: Architecture mismatch between the image and the host. Verify with `docker inspect <image> | grep Architecture`.
- Model not listed after build: Ollama does not auto-import copied GGUF files. Run `ollama create` inside the container with a Modelfile whose `FROM` line points at the GGUF file. The `OLLAMA_MODELS` env var only changes the directory where Ollama keeps its model store; it does not import a raw GGUF file.
- Image too large: The model file bloats the Docker image. Use a `.dockerignore` to exclude source files from the build context and multi-stage builds to minimize layers.
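For the "model not listed" case, a minimal sketch of the `ollama create` route is shown below. The helper name `write_modelfile`, the model name `llama3-custom`, and the `/tmp/Modelfile` path are assumptions for illustration; only the `FROM <path>` Modelfile line and the `ollama create <name> -f <file>` command come from Ollama itself.

```shell
# Sketch: write a one-line Modelfile that points at the copied GGUF file.
# write_modelfile is a hypothetical helper, not an Ollama command.
write_modelfile() {
  local gguf_path="$1"   # GGUF path as seen inside the container
  local out="$2"         # where to write the Modelfile on the host
  printf 'FROM %s\n' "$gguf_path" > "$out"
}
```

Usage, with the container from the run step still running: `write_modelfile /root/.ollama/models/llama3-8b-instruct-q4_K_M.gguf Modelfile`, then `docker cp Modelfile ollama-custom:/tmp/Modelfile`, then `docker exec ollama-custom ollama create llama3-custom -f /tmp/Modelfile`. A model created this way lives in the running container's filesystem rather than in the image, so it has to be recreated in each new container.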