11. Model Management Automation

Chapter 11 of 20 · 20 min

Automating model management reduces manual effort and ensures consistent configurations across environments. Scripts can pull models on schedule, clean up old versions, and deploy custom Modelfiles.

Pulling Models with Retry Logic

Network interruptions can corrupt downloads. Implement retry logic in scripts:

#!/bin/bash
MODEL=${1:-llama3.2:1b}
MAX_RETRIES=3
RETRY_DELAY=10

for i in $(seq 1 $MAX_RETRIES); do
    echo "Attempt $i: Pulling $MODEL"
    if ollama pull "$MODEL"; then
        echo "Success"
        exit 0
    fi
    echo "Failed. Retrying in $RETRY_DELAY seconds..."
    sleep $RETRY_DELAY
    RETRY_DELAY=$((RETRY_DELAY * 2))
done

echo "Failed after $MAX_RETRIES attempts"
exit 1

Save as pull-model.sh and run with ./pull-model.sh codellama:7b.

Automating Modelfile Deployment

import subprocess
import os

def deploy_model(model_name: str, modelfile_path: str):
    """Create or update a model from a Modelfile."""
    if not os.path.exists(modelfile_path):
        raise FileNotFoundError(f"Modelfile not found: {modelfile_path}")
    
    # Pull base model first
    base_model = extract_base_model(modelfile_path)
    subprocess.run(["ollama", "pull", base_model], check=True)
    
    # Create custom model
    result = subprocess.run(
        ["ollama", "create", model_name, "-f", modelfile_path],
        capture_output=True,
        text=True
    )
    if result.returncode != 0:
        print(f"Error: {result.stderr}")
        raise RuntimeError("Model creation failed")
    print(f"Model {model_name} deployed successfully")

def extract_base_model(modelfile_path: str) -> str:
    """Parse Modelfile to get the FROM instruction."""
    with open(modelfile_path, 'r') as f:
        for line in f:
            if line.startswith('FROM '):
                return line.replace('FROM ', '').strip()
    raise ValueError("No FROM instruction found in Modelfile")

Scheduled Cleanup

Use cron (Linux/macOS) or Task Scheduler (Windows) to remove unused models:

# Linux crontab - remove models not used in 30 days
0 2 * * * find ~/.ollama/models -type d -mtime +30 -exec ollama rm $(basename {}) \\; 2>/dev/null

# Windows PowerShell scheduled task
$cutoff = (Get-Date).AddDays(-30)
Get-ChildItem ~/.ollama/models -Directory | Where-Object { $_.LastWriteTime -lt $cutoff } | ForEach-Object {
    ollama rm $_.Name
}

Health Check Script

#!/bin/bash
# health-check.sh - Verify Ollama is operational

MODEL=${OLLAMA_CHECK_MODEL:-llama3.2:1b}
TIMEOUT=30

# Check if server is running
if ! curl -s --max-time 5 http://localhost:11434 > /dev/null; then
    echo "ERROR: Ollama server not responding"
    exit 1
fi

# Check if model exists
if ! ollama list | grep -q "^$MODEL"; then
    echo "WARNING: Model $MODEL not found, pulling..."
    ollama pull "$MODEL"
fi

# Test generation
start=$(date +%s)
response=$(curl -s --max-time $TIMEOUT http://localhost:11434/api/generate \\
    -d "{\\"model\\":\\"$MODEL\\",\\"prompt\\":\\"test\\",\\"stream\\":false}")
end=$(date +%s)

if echo "$response" | grep -q '"error"'; then
    echo "ERROR: Generation failed - $response"
    exit 1
fi

duration=$((end - start))
echo "OK: Ollama operational, generation took ${duration}s"

Run this from a monitoring system to alert on failures.

EXERCISE

Write a Python script that checks if a specific model is installed, and if not, pulls it before returning success. Handle subprocess.CalledProcessError for network failures.