Dockerfile Optimization — Production Local AI Deployment (Chapter 2)

Docker images are built up from layers, and each Dockerfile instruction that modifies the filesystem (RUN, COPY, ADD) adds one. Layer count and layer content directly affect build time, pull time, and storage costs. Optimizing a Dockerfile requires understanding how BuildKit caches layers, how to minimize image size, and how to order instructions for the highest cache hit rate.

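The first optimization involves .dockerignore. Every file sent to the builder as part of the build context should serve a purpose. Build artifacts, local development files, node_modules, and git directories waste bandwidth when transferred, can end up in the image through a broad COPY, and invalidate the cache when they change. Exclude them explicitly.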
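As a minimal sketch, a .dockerignore file placed at the root of the build context might look like the following. The specific entries are illustrative assumptions rather than a list taken from a particular project, and should be adjusted to what the build actually needs.

```dockerignore
# Version control metadata
.git
.gitignore

# Dependencies that get reinstalled inside the image
node_modules

# Local build artifacts and caches (hypothetical directory names)
build
dist
**/__pycache__
**/*.pyc

# Local development files
.env
.vscode
**/*.log
```

The `**/` prefix matches at any depth. A bare pattern such as `*.pyc` would only match files at the root of the context.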

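Instruction ordering matters for caching. Instructions whose inputs change frequently should appear toward the end. System packages change less frequently than application code. The usual pattern installs system packages and dependencies first, copies source code second, and compiles third. Cleanup commands belong in the same RUN instruction as the step that created the files, because a separate cleanup instruction at the end cannot shrink the layers that came before it.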

Cache invalidation occurs when an instruction, or any file it copies, changes: that layer and every layer after it are rebuilt. Copying package-lock.json before the source code means dependency installation stays cached when only source code changes. Copying the source first would invalidate the dependency layer on every code edit.
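The ordering rule can be sketched for a Node.js project as follows. The base image tag, the existence of a "build" script, and the dist/server.js entry point are assumptions made for illustration.

```dockerfile
FROM node:20-slim
WORKDIR /app

# Dependency manifests change rarely, so copy them on their own first
COPY package.json package-lock.json ./

# This layer is reused from cache until one of the manifests changes
RUN npm ci

# Source code changes often, so it comes after dependency installation
COPY . .

# Assumes package.json defines a "build" script (hypothetical)
RUN npm run build

# Assumed entry point produced by the build step
CMD ["node", "dist/server.js"]
```

With this order, editing a source file rebuilds only the last COPY and the build step. Moving `COPY . .` above `RUN npm ci` would rerun the install on every source change.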

Minimizing image size reduces attack surface and pull times. Base images with minimal packages reduce vulnerability exposure. Package manager caches, build dependencies, and temporary files should be removed within the same layer that creates them. Files deleted in a later layer are only hidden, and still occupy space in the earlier layer.
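A minimal sketch of same-layer cleanup, assuming a Debian-based Python image and a requirements.txt with at least one dependency that needs a compiler to install:

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .

# Install the compiler, install the Python dependencies, then remove the
# compiler and the apt package lists in one RUN so none of them persist
# in any layer
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && pip install --no-cache-dir -r requirements.txt \
    && apt-get purge -y --auto-remove build-essential \
    && rm -rf /var/lib/apt/lists/*
```

Splitting the purge and the `rm -rf` into their own RUN instructions would produce the same final filesystem but a larger image, since the install layer would still contain the removed files.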

Distroless and Alpine base images have smaller footprints than Ubuntu or Debian. Application images rarely need shell access, package managers, or system debugging tools. Distroless variants contain only the language runtime and its required system libraries, to which the application is added.
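A hedged sketch of a distroless runtime image follows. It assumes the application uses only the Python standard library, that the image tag shown is available, and that the image's entrypoint is already the Python interpreter. All three are worth verifying against the distroless project's documentation before relying on them.

```dockerfile
# Assumption: no third-party packages are needed, so there is no pip step
# (the image ships without pip or a shell, so RUN instructions cannot work)
FROM gcr.io/distroless/python3-debian12:nonroot
WORKDIR /app
COPY inference_engine/ ./inference_engine/

# Assumption: the image's entrypoint is the Python interpreter,
# so CMD supplies only its arguments
CMD ["-m", "inference_engine.server"]
```

Because nothing can be installed inside such an image, third-party dependencies are typically prepared elsewhere and copied in.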

Multi-stage builds address size concerns by separating build-time dependencies from the runtime environment. The build stage holds compilers, build tools, and source files. The runtime stage copies only the compiled artifacts it needs. The final image shrinks considerably because compilers and intermediate files never reach it.

    # Stage 1: Build dependencies
    FROM python:3.11-slim AS builder
    WORKDIR /build

    # Install dependencies first for maximum cache hit rate
    COPY requirements.txt .
    RUN pip install --user --no-cache-dir -r requirements.txt

    # Copy source after dependencies
    COPY inference_engine/ ./inference_engine/

    # Stage 2: Runtime image
    FROM python:3.11-slim
    WORKDIR /app

    # Copy only the artifacts needed at runtime, using absolute paths
    # inside the builder stage
    COPY --from=builder /root/.local/lib /app/lib
    COPY --from=builder /build/inference_engine/ /app/inference_engine/

    # Make the copied site-packages directory importable
    ENV PYTHONPATH=/app/lib/python3.11/site-packages

    # Non-root user for security
    RUN useradd --create-home appuser && chown -R appuser /app
    USER appuser

    CMD ["python", "-m", "inference_engine.server"]