20. Production Deployment

Chapter 20 of 24 · 15 min

Deploying multi-agent systems to production requires infrastructure that supports stateful agents, manages deployment versioning, and provides operational guarantees that development environments lack.

Deployment Architecture

Production deployments separate concerns into three layers: the orchestration layer manages workflows, the agent pool layer executes agent tasks, and the infrastructure layer provides compute and networking. Because the layers are decoupled, each can scale independently and a failure in one is less likely to cascade into the others.

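# deployment/orchestrator.py
import hashlib
import time
from dataclasses import dataclass
from typing import Any

@dataclass
class DeploymentConfig:
    version: str
    agent_definitions: dict[str, dict]
    resource_limits: dict[str, dict]
    scaling_policy: dict[str, int]
    environment: str

class DeploymentManager:
    def __init__(self, container_registry: str, config_store: Any):
        self.container_registry = container_registry
        self.config_store = config_store

    async def deploy(self, config: DeploymentConfig) -> str:
        deployment_id = self._generate_deployment_id(config.version)

        # Roll out every agent image before recording the deployment.
        for agent_name, agent_def in config.agent_definitions.items():
            await self._deploy_agent_image(
                agent_name,
                agent_def,
                deployment_id
            )

        # Persist every field needed to rebuild this DeploymentConfig later.
        await self.config_store.save_deployment(
            deployment_id,
            {
                "version": config.version,
                "agents": config.agent_definitions,
                "limits": config.resource_limits,
                "scaling": config.scaling_policy,
                "environment": config.environment
            }
        )

        # Shift traffic only after images and configuration are in place.
        await self._update_load_balancer(deployment_id, config)

        return deployment_id

    def _generate_deployment_id(self, version: str) -> str:
        # Hash version plus timestamp so redeploying a version yields a new ID.
        return hashlib.sha256(
            f"{version}_{time.time()}".encode()
        ).hexdigest()[:16]

    async def _deploy_agent_image(
        self, agent_name: str, agent_def: dict, deployment_id: str
    ) -> None:
        # Placeholder: platform-specific image rollout goes here.
        raise NotImplementedError

    async def _update_load_balancer(
        self, deployment_id: str, config: DeploymentConfig
    ) -> None:
        # Placeholder: platform-specific traffic switch goes here.
        raise NotImplementedError

    async def rollback(self, target_deployment_id: str) -> str:
        # Records are keyed by deployment ID in deploy(), so look up by ID.
        record = await self.config_store.get_deployment(target_deployment_id)
        return await self.deploy(DeploymentConfig(
            version=record["version"],
            agent_definitions=record["agents"],
            resource_limits=record["limits"],
            scaling_policy=record["scaling"],
            environment=record["environment"]
        ))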

Rollout Strategies

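Multi-agent deployments use canary releases and blue-green deployments to limit the impact of a bad release. A canary release routes a small percentage of traffic to the new version and monitors its error rate before promoting it to full traffic. A blue-green deployment runs the old and new versions side by side in two complete environments and switches all traffic at once, which keeps the old environment available for an immediate rollback.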
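
The canary decision can be sketched as a weighted router plus a promotion check. This is a minimal illustration, not a production router: the class name CanaryState, the 5% starting weight, the 1% error threshold, and the 100-request minimum are all assumptions chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class CanaryState:
    """Tracks traffic share and outcomes for a canary version (hypothetical)."""
    canary_weight: float = 0.05  # assumed fraction of requests sent to the canary
    canary_requests: int = 0
    canary_errors: int = 0

    def choose_version(self, rng: random.Random) -> str:
        # Weighted coin flip: most requests stay on the stable version.
        return "canary" if rng.random() < self.canary_weight else "stable"

    def record(self, version: str, ok: bool) -> None:
        # Only canary outcomes feed the promotion decision.
        if version == "canary":
            self.canary_requests += 1
            if not ok:
                self.canary_errors += 1

    def decide(self, max_error_rate: float = 0.01, min_requests: int = 100) -> str:
        # Wait for enough samples before judging the canary.
        if self.canary_requests < min_requests:
            return "wait"
        error_rate = self.canary_errors / self.canary_requests
        return "promote" if error_rate <= max_error_rate else "rollback"
```

A real rollout would usually raise the weight in steps (for example 5%, 25%, 100%) and compare the canary against the stable version's error rate over the same window rather than against a fixed threshold.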

Health Checks and Readiness

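Agents expose health endpoints that the orchestration system polls. A readiness probe reports whether an agent can accept traffic, so an agent that is not ready is taken out of routing until it recovers. A liveness probe reports whether the agent is still functioning, and a failed liveness probe triggers a restart.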
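
The difference between the two probes can be sketched with a small state object whose methods return an HTTP-style status code and body. The names here (AgentHealth, model_loaded, queue_depth, the 30-second heartbeat timeout) are hypothetical; a real agent would wire these methods to whatever HTTP framework it already uses.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentHealth:
    """Health state behind an agent's probe endpoints (hypothetical)."""
    heartbeat_timeout: float = 30.0  # assumed seconds before the agent counts as stuck
    model_loaded: bool = False
    queue_depth: int = 0
    max_queue_depth: int = 100
    last_heartbeat: float = field(default_factory=time.monotonic)

    def beat(self) -> None:
        # Called from the agent's main loop to show it is still making progress.
        self.last_heartbeat = time.monotonic()

    def liveness(self, now: Optional[float] = None) -> tuple[int, dict]:
        # Failing liveness means "restart me": the main loop has stopped beating.
        now = time.monotonic() if now is None else now
        alive = (now - self.last_heartbeat) <= self.heartbeat_timeout
        return (200 if alive else 503), {"alive": alive}

    def readiness(self) -> tuple[int, dict]:
        # Failing readiness means "stop routing to me", not "restart me".
        ready = self.model_loaded and self.queue_depth < self.max_queue_depth
        return (200 if ready else 503), {"ready": ready}
```

Keeping the two checks separate matters for stateful agents: an overloaded agent should fail readiness and drain its queue, whereas restarting it would discard in-flight state.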

Secrets and Configuration Injection

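Production secrets are injected at runtime rather than baked into images or committed to source control, typically as environment variables or mounted files supplied by a secrets manager. Where possible, configuration changes propagate through the system without restarting agents.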
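
Both ideas can be sketched with the standard library alone. The example below assumes secrets arrive either as files in a mounted directory or as environment variables, and that configuration lives in a JSON file the agent can re-read; load_secret and ReloadableConfig are illustrative names, not part of any particular platform.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def load_secret(
    name: str,
    env: Mapping[str, str] = os.environ,
    secrets_dir: Optional[Path] = None,
) -> str:
    """Read a secret from a mounted file if present, else from the environment."""
    if secrets_dir is not None:
        path = secrets_dir / name
        if path.is_file():
            return path.read_text().strip()
    if name in env:
        return env[name]
    # Fail loudly at startup instead of running with a missing credential.
    raise KeyError(f"secret {name!r} was not provided")


class ReloadableConfig:
    """Re-reads a JSON config file whenever its modification time changes."""

    def __init__(self, path: Path):
        self.path = path
        self._mtime_ns = -1
        self._data: dict = {}

    def get(self) -> dict:
        mtime_ns = self.path.stat().st_mtime_ns
        if mtime_ns != self._mtime_ns:
            # The file changed on disk: pick up the new values without a restart.
            self._data = json.loads(self.path.read_text())
            self._mtime_ns = mtime_ns
        return self._data
```

Polling the modification time is the simplest approach and is only one option; a push-based mechanism such as a watch on the configuration store avoids the polling delay at the cost of more moving parts.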

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.

EXERCISE

Implement a deployment manifest generator that produces Kubernetes-style deployment configurations from a simplified agent specification format, including resource limits, health checks, and scaling policies.