HOW-TO · INF
How to set up a model switching workflow for different tasks
PREREQUISITES
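Ollama installed, with each model used below already pulled; curl and jq available on the PATH. The optional proxy and parallel-serving steps also need nginx and llama.cpp's llama-server.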
What this does
Different tasks benefit from different models. This guide builds a small script that routes each prompt to a model chosen by a task keyword you pass on the command line, then shows two optional variants: a reverse proxy and parallel servers.
Steps
Create a routing script that selects a model by task keyword.
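#!/bin/bash
# switch_model.sh: pick an Ollama model by task keyword, then send the prompt
TASK="$1"
PROMPT="$2"

case "$TASK" in
  code)    MODEL="deepseek-coder" ;;
  math)    MODEL="deepseek-r1:7b" ;;
  writing) MODEL="llama3.2:latest" ;;
  summary) MODEL="mistral:latest" ;;
  *)       MODEL="llama3.2:latest" ;;   # fallback for unknown task keywords
esac

# Build the JSON body with jq so quotes and newlines in the prompt are escaped
jq -cn --arg model "$MODEL" --arg prompt "$PROMPT" \
  '{model: $model, prompt: $prompt, stream: false}' \
  | curl -s http://localhost:11434/api/generate -d @- \
  | jq -r '.response'
Make the script executable and test each task type.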
chmod +x switch_model.sh
./switch_model.sh code "Write a Python function to reverse a linked list"
./switch_model.sh math "Solve for x: 2x^2 + 5x - 3 = 0"
./switch_model.sh writing "Write a haiku about autumn"
Set up a reverse proxy for API-based switching. Using nginx, route by URL path:
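# Place inside the http { } block of nginx.conf
server {
    listen 11435;

    # Both paths forward to the same Ollama instance. The trailing /v1/ rewrites
    # the prefix, so /v1/code/chat/completions reaches Ollama's /v1/chat/completions.
    # Ollama still picks the model from the "model" field in the request body.
    location /v1/code/ {
        proxy_pass http://localhost:11434/v1/;
    }
    location /v1/writing/ {
        proxy_pass http://localhost:11434/v1/;
    }
}
For parallel serving, keep both models loaded on different ports.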
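# Start both servers in the background
./llama-server -m code-model.gguf --port 8080 &
./llama-server -m writing-model.gguf --port 8081 &

# Wrapper functions, one per server; jq escapes quotes in the prompt.
# Note: these names shadow any existing `code` or `write` commands in the shell.
code()  { jq -cn --arg p "$1" '{prompt: $p}' | curl -s http://localhost:8080/completion -d @-; }
write() { jq -cn --arg p "$1" '{prompt: $p}' | curl -s http://localhost:8081/completion -d @-; }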
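Servers started in the background may still be loading their models when the first wrapper call arrives. The sketch below polls each server until it reports ready. The function name `wait_for_server` and its arguments are hypothetical, and it assumes llama-server exposes a `/health` endpoint that returns an error status while the model is loading.

```shell
#!/bin/bash
# wait_for_server is a hypothetical helper: poll a server's /health endpoint
# until it answers with success or the attempts run out.
# Usage: wait_for_server <base_url> [attempts] [delay_seconds]
wait_for_server() {
  local url attempts delay i
  url="$1"
  attempts="${2:-30}"
  delay="${3:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors, such as 503 while the model loads
    if curl -sf "$url/health" > /dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "server at $url not ready after $attempts attempts" >&2
  return 1
}
```

After starting both servers, a call such as `wait_for_server http://localhost:8080 && wait_for_server http://localhost:8081` blocks until both are ready, or returns non-zero if either one never comes up.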
Verification
# Run the same prompt through different task routes and compare outputs
./switch_model.sh code "Explain HTTP"
./switch_model.sh writing "Explain HTTP"
# Expected: Code output focuses on implementation; writing output focuses on conceptual explanation
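The routing table itself can be checked without a running Ollama server. This is a minimal sketch, assuming the case statement from switch_model.sh is moved into a function; the name `pick_model` is hypothetical and does not appear in the original script.

```shell
#!/bin/bash
# pick_model is a hypothetical helper: the same case table as switch_model.sh,
# wrapped in a function so routing can be checked without calling Ollama.
pick_model() {
  case "$1" in
    code)    echo "deepseek-coder" ;;
    math)    echo "deepseek-r1:7b" ;;
    writing) echo "llama3.2:latest" ;;
    summary) echo "mistral:latest" ;;
    *)       echo "llama3.2:latest" ;;  # fallback for unknown task keywords
  esac
}

# Print the model each task keyword resolves to, including an unknown one
for task in code math writing summary unknown; do
  printf '%-8s -> %s\n' "$task" "$(pick_model "$task")"
done
```

Keeping the table in one function means the script and any dry-run check read the same mapping, so a changed model name only has to be edited once.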
Common failures
- Cold-start latency: Models not kept in memory reload on each request. Use ollama run <model> once to pre-load.
- Task detection errors: Keyword-based routing is brittle. For production, use a lightweight classifier (e.g., zero-shot) to detect task type.
- Port conflicts with proxy: Ensure the reverse proxy port differs from the Ollama default (11434).
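For the cold-start case, pre-loading can also be scripted through the API instead of an interactive session. The sketch below sends a request with no prompt, which asks Ollama to load the model, and sets `keep_alive` to -1 to keep it loaded. The names `preload_payload`, `preload_all`, and the `PRELOAD` and `OLLAMA_URL` variables are hypothetical; the model list mirrors the routing table in switch_model.sh.

```shell
#!/bin/bash
# Sketch: pre-load every model in the routing table so the first real request
# does not pay the load cost. Assumes Ollama listens on its default port.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
MODELS="deepseek-coder deepseek-r1:7b llama3.2:latest mistral:latest"

# Build a request body with no prompt: Ollama loads the model and returns.
# keep_alive -1 asks Ollama to keep the model in memory indefinitely.
preload_payload() {
  printf '{"model":"%s","keep_alive":-1}' "$1"
}

# Send one load request per model; warn instead of aborting on failure
preload_all() {
  for model in $MODELS; do
    echo "Pre-loading $model"
    curl -s "$OLLAMA_URL/api/generate" -d "$(preload_payload "$model")" > /dev/null \
      || echo "warning: could not reach Ollama for $model" >&2
  done
}

# Run only when PRELOAD=1 is set, so sourcing this file has no side effects
if [ "${PRELOAD:-0}" = "1" ]; then
  preload_all
fi
```

Keeping several models resident only helps if the machine has enough memory for all of them; otherwise Ollama may unload one model to make room for another, and the cold start returns.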
Related guides