16. CI/CD Pipeline

Chapter 16 of 24 · 20 min

KEY INSIGHT

Pipeline design must treat application code updates and model artifact updates as first-class citizens, with separate validation stages for each artifact type.

### GitHub Actions Pipeline

```yaml
# .github/workflows/inference-deploy.yml
name: Inference Model CI/CD

on:
  push:
    branches: [main]
    paths:
      - 'models/**'
      - 'src/**'
      - 'Dockerfile'
      - 'requirements.txt'
  pull_request:
    branches: [main]

env:
  REGISTRY: registry.internal
  IMAGE_NAME: inference-server
  # No trailing slash: the steps below append "/staging/" themselves
  MODEL_REGISTRY: s3://model-artifacts

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build application image
        uses: docker/build-push-action@v5
        with:
          context: ./src
          push: false
          # Load the image into the local Docker daemon so the test step can run it
          load: true
          tags: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:test
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          cache-from: type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest

      - name: Run unit tests
        run: |
          docker run --rm ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:test \
            pytest tests/unit -v

      - name: Push image for deployment
        # Assumes the runner is already authenticated to the registry
        if: github.event_name == 'push'
        run: |
          docker push ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}

  validate-model:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Download model artifacts
        run: |
          aws s3 sync ${{ env.MODEL_REGISTRY }}/staging/ ./models/

      - name: Validate model schema
        run: |
          python scripts/validate_model.py \
            --model-dir ./models \
            --expected-input "input_ids:float32[?,512]" \
            --expected-output "logits:float32[?,512,vocab_size]"

      - name: Benchmark model performance
        run: |
          python scripts/benchmark.py \
            --model ./models/model.pt \
            --batch-sizes 1,4,8,16 \
            --target-throughput 100

  deploy-staging:
    needs: [build, validate-model]
    # Deploy only on pushes to main, never from pull requests
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - name: Deploy to staging
        run: |
          kubectl set image deployment/inference-server \
            app=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} \
            --namespace=staging
```

### Model Registry Integration

Automate model promotion through stages based on validation results:

```python
# scripts/promote_model.py
import json

import boto3


def promote_model(model_name: str, from_stage: str, to_stage: str) -> None:
    s3 = boto3.client('s3')
    bucket = 'model-artifacts'

    # Get model metadata; get_object returns a response whose 'Body' is a stream
    metadata_key = f"{from_stage}/{model_name}/metadata.json"
    response = s3.get_object(Bucket=bucket, Key=metadata_key)
    metadata = json.loads(response['Body'].read())

    # Check validation results (keys must match what the validation jobs record)
    validation_passed = (
        metadata.get('Cors') == 'PASSED'
        and metadata.get('Benchmark') == 'PASSED'
    )
    if not validation_passed:
        raise ValueError(f"Model {model_name} validation incomplete")

    # Copy every object under the model's prefix to the target stage;
    # S3 has no directories, so a prefix cannot be copied in one call
    source_prefix = f"{from_stage}/{model_name}/"
    target_prefix = f"{to_stage}/{model_name}/"
    paginator = s3.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=source_prefix):
        for obj in page.get('Contents', []):
            target_key = target_prefix + obj['Key'][len(source_prefix):]
            s3.copy({'Bucket': bucket, 'Key': obj['Key']}, bucket, target_key)

    # Update latest pointer
    s3.put_object(
        Bucket=bucket,
        Key=f"latest/{model_name}",
        Body=f"{to_stage}/{model_name}".encode(),
    )
```
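The workflow above calls `scripts/validate_model.py` with specs such as `input_ids:float32[?,512]`, but the script itself is not shown. The following is a minimal sketch of how such a spec string might be parsed and compared against a tensor signature. The grammar (`name:dtype[dims]`, with `?` or a symbolic name meaning "any size") and the names `TensorSpec`, `parse_spec`, and `matches` are assumptions made for illustration, not part of the original script.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

# Assumed spec grammar: "<name>:<dtype>[<dim>,<dim>,...]", where a dim is an
# integer, "?" (any size), or a symbolic name such as "vocab_size".
_SPEC_RE = re.compile(r"^(?P<name>\w+):(?P<dtype>\w+)\[(?P<dims>[^\]]*)\]$")


@dataclass(frozen=True)
class TensorSpec:
    name: str
    dtype: str
    dims: Tuple[Optional[int], ...]  # None means "any size"


def parse_spec(spec: str) -> TensorSpec:
    """Parse a spec string such as 'input_ids:float32[?,512]'."""
    match = _SPEC_RE.match(spec.strip())
    if match is None:
        raise ValueError(f"Malformed tensor spec: {spec!r}")
    raw_dims = [d.strip() for d in match["dims"].split(",") if d.strip()]
    # Integers are fixed sizes; '?' and symbolic names are treated as wildcards
    dims = tuple(int(d) if d.isdigit() else None for d in raw_dims)
    return TensorSpec(match["name"], match["dtype"], dims)


def matches(spec: TensorSpec, name: str, dtype: str, shape: Sequence[int]) -> bool:
    """Return True if an actual tensor signature satisfies the expected spec."""
    if (spec.name, spec.dtype) != (name, dtype) or len(spec.dims) != len(shape):
        return False
    return all(want is None or want == got for want, got in zip(spec.dims, shape))
```

A real validation script would obtain the actual name, dtype, and shape from the loaded model; here they are passed in directly so the comparison logic can be tested without a model file.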

Continuous integration and deployment (CI/CD) pipelines ensure that model updates, container rebuilds, and configuration changes deploy consistently across environments, without the manual steps that introduce human error.
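Replacing manual checks means encoding each check as a gate that fails the job on its own. As one illustration, the workflow's benchmark step passes `--batch-sizes` and `--target-throughput` to `scripts/benchmark.py`; a rough sketch of the gating logic such a script might contain is below. The function names, the placeholder inputs, and the pass criterion (the best batch size must reach the target) are all assumptions, since the original script is not shown.

```python
import time
from typing import Callable, Dict, Iterable, List


def measure_throughput(
    predict: Callable[[List[int]], object],
    batch_size: int,
    iterations: int = 20,
) -> float:
    """Return samples processed per second for one batch size."""
    batch = [0] * batch_size  # placeholder inputs; a real script would build tensors
    start = time.perf_counter()
    for _ in range(iterations):
        predict(batch)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a very fast predict function
    return (batch_size * iterations) / max(elapsed, 1e-9)


def benchmark_gate(
    predict: Callable[[List[int]], object],
    batch_sizes: Iterable[int],
    target_throughput: float,
) -> Dict[int, float]:
    """Measure every batch size and raise if the best result misses the target."""
    results = {bs: measure_throughput(predict, bs) for bs in batch_sizes}
    best = max(results.values())
    if best < target_throughput:
        raise RuntimeError(
            f"Best throughput {best:.1f} samples/s is below target {target_throughput}"
        )
    return results
```

Raising an exception makes the script exit with a non-zero status, which is what causes the CI step, and therefore the dependent deploy job, to stop.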

EXERCISE

Configure a GitHub Actions pipeline that builds a Docker image on code changes, validates model artifacts on changes to the models directory, and deploys to a local Kubernetes cluster on main branch merges. Include separate jobs for build, validation, and deployment stages with appropriate dependencies.
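Before writing the workflow, it can help to state the exercise's trigger rules as plain logic. The sketch below maps changed file paths to the jobs that should run. The pattern table, the `plan_jobs` name, and the rule that deployment follows any triggered job on a push to `main` are assumptions for illustration; in the workflow itself this logic would be expressed with `paths` filters and `needs` dependencies.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# Hypothetical mapping from pipeline job to the path patterns that trigger it.
# fnmatch's "*" also matches "/" characters, so "src/*" covers nested files.
STAGE_PATTERNS = {
    "build": ["src/*", "Dockerfile", "requirements.txt"],
    "validate-model": ["models/*"],
}


def plan_jobs(changed_paths: Iterable[str], branch: str, event: str) -> List[str]:
    """Return the jobs to run, in order, for a set of changed paths."""
    paths = list(changed_paths)  # materialize so the paths can be scanned per job
    jobs = [
        stage
        for stage, patterns in STAGE_PATTERNS.items()
        if any(fnmatchcase(path, pattern) for path in paths for pattern in patterns)
    ]
    # Deploy only after a triggered job, and only for pushes (merges) to main
    if jobs and branch == "main" and event == "push":
        jobs.append("deploy")
    return jobs
```

For example, `plan_jobs(["models/bert/model.pt"], "feature", "pull_request")` selects only the validation job, while the same change pushed to `main` also schedules the deployment.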