RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Function Calling for Local Models

07. Multi-Tool Orchestration

Chapter 7 of 18 · 20 min
KEY INSIGHT

Multi-tool orchestration is an iteration loop with message accumulation: each tool result is appended to the conversation and informs the model's next decision.

Multi-tool orchestration manages sequences where one tool call depends on another's output. The model decides which tools to call and in what order based on the user goal.

Implement a loop that continues until the model produces a text response:

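import json

# ToolRegistry, call_ollama_with_tools, extract_tool_calls, and
# execute_with_retry are assumed to be defined in earlier chapters.

def multi_tool_orchestration(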
    model: str,
    messages: list[dict],
    tools: list[dict],
    registry: ToolRegistry,
    max_iterations: int = 10
) -> str:
    iteration = 0
    
    while iteration < max_iterations:
        iteration += 1
        
        # Get model response
        response = call_ollama_with_tools(model, messages, tools)
        
        # Extract tool calls
        tool_calls = extract_tool_calls(response)
        
        if not tool_calls:
            # Model returned natural language response
            return response["message"]["content"]
        
        # Execute each tool call
        for index, call in enumerate(tool_calls):
            schema = registry.get_schema(call["name"])
            
            # Include the call's position so that two calls to the same tool
            # in one iteration do not share an ID
            call_id = f"call_{call['name']}_{iteration}_{index}"
            
            # Record the call
            messages.append({
                "role": "assistant",
                "content": None,
                "tool_calls": [{
                    "id": call_id,
                    "type": "function",
                    "function": {
                        "name": call["name"],
                        "arguments": json.dumps(call["arguments"])
                    }
                }]
            })
            
            # Execute and record result under the same ID
            result = execute_with_retry(call["name"], call["arguments"], schema)
            
            messages.append({
                "role": "tool",
                "tool_call_id": call_id,
                "content": json.dumps(result)
            })
    
    return "Maximum iterations reached without completion"

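The max_iterations guard prevents infinite loops when a model keeps calling tools without making progress. When the limit is hit, the function returns a fixed message instead of a model response. Set the limit based on typical task complexity: simple queries usually finish in 2-3 iterations, while complex reasoning may need 8-10.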
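An iteration cap only bounds the damage; it does not tell you why the loop stalled. One possible complement, sketched here under assumptions, is to stop early when the model repeats the exact same call several times in a row. The function name `is_stalled`, the `window` parameter, and the history shape (a list of dicts with "tool" and "arguments" keys) are hypothetical and not part of the chapter's code.

```python
import json


def is_stalled(execution_history: list[dict], window: int = 3) -> bool:
    """Return True if the last `window` tool calls are identical.

    Assumes each entry is a dict with "tool" and "arguments" keys
    (a hypothetical shape chosen for this sketch).
    """
    if window < 2 or len(execution_history) < window:
        return False

    recent = execution_history[-window:]
    # Serialize arguments with sorted keys so key order does not matter
    signatures = {
        (entry["tool"], json.dumps(entry["arguments"], sort_keys=True))
        for entry in recent
    }
    return len(signatures) == 1


history = [
    {"tool": "search", "arguments": {"query": "llama.cpp"}},
    {"tool": "search", "arguments": {"query": "llama.cpp"}},
    {"tool": "search", "arguments": {"query": "llama.cpp"}},
]
print(is_stalled(history))  # True
```

A check like this could run at the top of each iteration, letting the loop bail out with a clearer error than the generic iteration-limit message.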

State management matters for multi-tool sequences. Track tool call history and intermediate results:

class OrchestrationState:
    def __init__(self):
        self.messages = []
        self.execution_history = []
        self.final_result = None
    
    def add_tool_result(self, name: str, arguments: dict, result: dict):
        self.execution_history.append({
            "tool": name,
            "arguments": arguments,
            "result": result,
            "success": result.get("success", False)
        })
    
    def get_successful_tools(self) -> list[str]:
        return [h["tool"] for h in self.execution_history if h["success"]]
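Once history is recorded in this shape, it can be inspected after a run. The following is a minimal sketch of one way to do that; `summarize_history` is a hypothetical helper, and it assumes entries carry the same "tool" and "success" keys that add_tool_result stores.

```python
from collections import Counter


def summarize_history(execution_history: list[dict]) -> dict:
    """Count calls and failures per tool from recorded history entries."""
    calls = Counter(entry["tool"] for entry in execution_history)
    failures = Counter(
        entry["tool"] for entry in execution_history if not entry["success"]
    )
    return {
        "total_calls": len(execution_history),
        "calls_per_tool": dict(calls),
        "failures_per_tool": dict(failures),
    }


# Sample entries in the shape add_tool_result records (values are made up)
history = [
    {"tool": "web_search", "arguments": {"query": "GGUF"},
     "result": {"success": True}, "success": True},
    {"tool": "summarize", "arguments": {"text": "..."},
     "result": {"success": False}, "success": False},
    {"tool": "summarize", "arguments": {"text": "..."},
     "result": {"success": True}, "success": True},
]
summary = summarize_history(history)
print(summary["failures_per_tool"])  # {'summarize': 1}
```

A per-tool failure count makes it easier to tell a flaky tool from a model that is passing bad arguments.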

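Dependency tracking handles tools that require previous tool outputs. String arguments that start with "$" are treated as references of the form $<name>[<index>].<field>: the index selects an entry in the execution history, and the field selects a key in that entry's result (defaulting to "result" when no field is given):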

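import re

# Matches references such as "$history[0].url": an index into the execution
# history and an optional field of that entry's result. The name between
# "$" and "[" is not interpreted.
_REFERENCE = re.compile(r"\$\w*\[(\d+)\](?:\.(\w+))?")


def resolve_argument_dependencies(
    arguments: dict,
    execution_history: list[dict]
) -> dict:
    resolved = {}
    
    for key, value in arguments.items():
        match = _REFERENCE.fullmatch(value) if isinstance(value, str) else None
        
        if match is None:
            # Literal value, including strings like "$5.00" that merely
            # start with "$" but are not references
            resolved[key] = value
            continue
        
        # Reference to previous tool output
        tool_index = int(match.group(1))
        field = match.group(2) or "result"
        
        if tool_index >= len(execution_history):
            raise ValueError(f"Invalid tool reference: {value}")
        
        resolved[key] = execution_history[tool_index]["result"].get(field)
    
    return resolved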
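The resolver above only discovers a bad reference at the moment it tries to resolve it. If a sequence of calls is known in advance, the references can be checked before anything runs. The sketch below assumes a hypothetical planned list of steps (each a dict with "tool" and "arguments" keys) and the same "$<name>[<index>]" reference shape; `find_forward_references` is not part of the chapter's code.

```python
import re

# Same reference shape the resolver parses: "$<name>[<index>]..."
REFERENCE = re.compile(r"\$\w*\[(\d+)\]")


def find_forward_references(steps: list[dict]) -> list[str]:
    """Describe references that point at a step that has not run yet."""
    problems = []
    for position, step in enumerate(steps):
        for key, value in step["arguments"].items():
            if not isinstance(value, str):
                continue
            match = REFERENCE.match(value)
            # A step may only reference steps strictly before it
            if match and int(match.group(1)) >= position:
                problems.append(
                    f"step {position} ({step['tool']}): {key}={value!r}"
                )
    return problems


plan = [
    {"tool": "web_search", "arguments": {"query": "local LLM quantization"}},
    {"tool": "summarize", "arguments": {"text": "$history[0].content"}},
    {"tool": "summarize", "arguments": {"text": "$history[5].content"}},
]
print(find_forward_references(plan))
```

Here the second step is valid because it refers to step 0, while the third is flagged because step 5 does not exist at that point.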
EXERCISE

Build a multi-tool orchestration system with a web search tool and a content summarizer. Create a query requiring both tools in sequence. Track execution history and verify the model uses the first tool's output in the second call.
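As a possible starting point, the two tools can be stubbed so the loop can be tested offline before wiring in a real search backend. Everything below is a hypothetical placeholder: the function names, the canned content, and the "success" key convention are assumptions for illustration. The schemas use the common JSON function-calling layout (type, function name, description, JSON Schema parameters).

```python
# Hypothetical stand-ins for the two tools; replace with real implementations
def web_search(query: str) -> dict:
    """Return a canned result so orchestration can be tested offline."""
    return {
        "success": True,
        "content": f"Placeholder article text about {query}.",
    }


def summarize(text: str, max_words: int = 20) -> dict:
    """Naive summarizer: keep only the first `max_words` words."""
    words = text.split()
    return {"success": True, "summary": " ".join(words[:max_words])}


TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web and return page content.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "summarize",
            "description": "Summarize a block of text.",
            "parameters": {
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "max_words": {"type": "integer"},
                },
                "required": ["text"],
            },
        },
    },
]

# The sequence the model is expected to produce: search first, then
# summarize the search output
first = web_search("GGUF quantization")
second = summarize(first["content"], max_words=3)
print(second["summary"])  # Placeholder article text
```

To verify the dependency in your own run, compare the "text" argument of the recorded summarize call against the "content" field of the recorded web_search result.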
