What this does

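Different tasks benefit from different models. This guide builds a scripted workflow that routes each prompt to a model chosen for its task type (code, math, writing, or summary). The task type is passed explicitly as a keyword, so the routing is only as good as the keyword you supply.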

Steps

Create a routing script that selects a model by task keyword.

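#!/bin/bash
# switch_model.sh: route a prompt to a model chosen by task keyword
# Usage: ./switch_model.sh <task> "<prompt>"
TASK="$1"
PROMPT="$2"
case "$TASK" in
    code)    MODEL="deepseek-coder" ;;
    math)    MODEL="deepseek-r1:7b" ;;
    writing) MODEL="llama3.2:latest" ;;
    summary) MODEL="mistral:latest" ;;
    *)       MODEL="llama3.2:latest" ;;  # fallback for unknown task types
esac
# Build the JSON body with jq so quotes and newlines in the prompt are escaped
BODY=$(jq -cn --arg model "$MODEL" --arg prompt "$PROMPT" \
    '{model: $model, prompt: $prompt, stream: false}')
curl -s http://localhost:11434/api/generate -d "$BODY" \
    | jq -r '.response'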
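
The routing table can be separated from the HTTP call so it can be checked without a running server. The following is a minimal sketch, assuming jq is installed; the function names pick_model and build_body are hypothetical and are not part of the script above.

```shell
#!/bin/bash
# Hypothetical helpers: pick_model and build_body are illustrative names.

# Map a task keyword to a model name (same table as switch_model.sh).
pick_model() {
    case "$1" in
        code)    echo "deepseek-coder" ;;
        math)    echo "deepseek-r1:7b" ;;
        writing) echo "llama3.2:latest" ;;
        summary) echo "mistral:latest" ;;
        *)       echo "llama3.2:latest" ;;  # fallback for unknown tasks
    esac
}

# Print the JSON request body without sending it (dry run).
# jq escapes quotes and newlines in the prompt.
build_body() {
    jq -cn --arg model "$(pick_model "$1")" --arg prompt "$2" \
        '{model: $model, prompt: $prompt, stream: false}'
}
```

Running build_body code 'Print "hi"' shows the exact body that would be posted, which makes quoting problems visible before any model is involved.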

Make the script executable and test each task type.

chmod +x switch_model.sh
./switch_model.sh code "Write a Python function to reverse a linked list"
./switch_model.sh math "Solve for x: 2x^2 + 5x - 3 = 0"
./switch_model.sh writing "Write a haiku about autumn"
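
Testing each route by hand can be wrapped in a small smoke test. This is a sketch under assumptions: ROUTER and check_routes are hypothetical names, and the check only confirms that every route returns some text, not that the text is good. A route whose model is missing typically yields the string "null", because the error reply has no .response field for jq to print.

```shell
#!/bin/bash
# Hypothetical smoke test: ROUTER and check_routes are illustrative names.
ROUTER="${ROUTER:-./switch_model.sh}"

# Send a short prompt through every task route.
# Returns non-zero if any route produces an empty or "null" reply.
check_routes() {
    local failed=0 task reply
    for task in code math writing summary; do
        reply=$("$ROUTER" "$task" "Reply with the word ok")
        if [ -z "$reply" ] || [ "$reply" = "null" ]; then
            echo "FAIL $task: empty response" >&2
            failed=1
        else
            echo "ok   $task"
        fi
    done
    return "$failed"
}
```

Call check_routes after chmod +x; a non-zero exit status points at the route that needs attention.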

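Set up a reverse proxy for API-based switching. Using nginx, expose one URL path per task and map each path to Ollama's generate endpoint. A plain proxy_pass does not modify the request body, so the client still sets "model" in the JSON it sends:

server {
    listen 11435;
    # Requests to /v1/code are forwarded to /api/generate on Ollama
    location /v1/code {
        proxy_pass http://localhost:11434/api/generate;
    }
    # Requests to /v1/writing are forwarded the same way
    location /v1/writing {
        proxy_pass http://localhost:11434/api/generate;
    }
}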

For parallel serving, run one llama-server instance per model, each on its own port, so both models stay loaded at the same time.

# Start both servers in the background
./llama-server -m code-model.gguf --port 8080 &
./llama-server -m writing-model.gguf --port 8081 &
# Create wrapper functions; jq builds the JSON body so the prompt is escaped safely
code() { jq -cn --arg p "$1" '{prompt: $p}' | curl -s http://localhost:8080/completion -d @-; }
write() { jq -cn --arg p "$1" '{prompt: $p}' | curl -s http://localhost:8081/completion -d @-; }
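
Servers started in the background need time to load their models, so a wrapper called immediately may fail. The sketch below polls until a server answers; wait_ready is a hypothetical helper, and it assumes the llama-server build exposes a /health endpoint that returns an error status while the model is still loading.

```shell
#!/bin/bash
# Hypothetical helper: wait_ready polls a URL until it returns a success status.
# Arguments: URL, optional maximum number of one-second attempts (default 30).
wait_ready() {
    local url="$1"
    local tries="${2:-30}"
    local i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl exit non-zero on HTTP error statuses
        if curl -sf -o /dev/null "$url"; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1  # gave up: the server never reported ready
}
```

For example, wait_ready http://localhost:8080/health && code "Write a hello world in C" only sends the prompt once the code server responds.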

Verification

# Run the same prompt through different task routes and compare outputs
./switch_model.sh code "Explain HTTP"
./switch_model.sh writing "Explain HTTP"
# Expected (outputs vary by model and run): the code route tends toward implementation detail; the writing route tends toward conceptual explanation

Common failures

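Cold-start latency: Ollama unloads a model after it has been idle (five minutes by default), so the next request to that model pays the reload time. Run ollama run <model> once to pre-load it, and set the OLLAMA_KEEP_ALIVE environment variable or the keep_alive request field (for example "30m") to keep it resident longer.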
Task detection errors: Keyword-based routing is brittle. For production, use a lightweight classifier (e.g., zero-shot) to detect task type.
Port conflicts with proxy: Ensure the reverse proxy port differs from the Ollama default (11434).
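
One way to approximate the zero-shot detection mentioned above is to ask a small local model for a one-word label and then normalize its reply. This is a sketch, not a tuned classifier: CLASSIFIER_MODEL, normalize_task, and detect_task are hypothetical names, the prompt wording is an assumption, and it relies on jq and a running Ollama instance on the default port.

```shell
#!/bin/bash
# Hypothetical sketch: zero-shot task detection using a local model.
# CLASSIFIER_MODEL and both function names are assumptions, not from the guide.
CLASSIFIER_MODEL="${CLASSIFIER_MODEL:-llama3.2:latest}"

# Reduce a free-form model reply to one known task label.
# Falls back to "writing", which uses the same model as the script's default route.
normalize_task() {
    local lowered
    lowered=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$lowered" in
        *code*)    echo "code" ;;
        *math*)    echo "math" ;;
        *summary*) echo "summary" ;;
        *)         echo "writing" ;;
    esac
}

# Ask the model for a one-word label, then normalize it.
detect_task() {
    local question body reply
    question="Classify the request as exactly one word: code, math, writing, or summary. Request: $1"
    body=$(jq -cn --arg model "$CLASSIFIER_MODEL" --arg prompt "$question" \
        '{model: $model, prompt: $prompt, stream: false}')
    reply=$(curl -s http://localhost:11434/api/generate -d "$body" | jq -r '.response')
    normalize_task "$reply"
}
```

The detected label can then replace the hand-typed keyword, as in ./switch_model.sh "$(detect_task "$PROMPT")" "$PROMPT". The extra model call adds latency to every request, so this trades speed for less brittle routing.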
