RUNLOCALAIv38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.coIndependently operated
Function Calling for Local Models

06. Single Tool Execution

Chapter 6 of 18 · 20 min
KEY INSIGHT

Single tool execution requires careful message formatting: append both the tool call and its result to the message history before the next model call.

Single tool execution is the basic building block of a function calling system. The flow takes one user request, extracts one tool call from the model's response, executes it, and passes the result back to the model for a final answer.

Build the execution loop:

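import json

# ToolRegistry, call_ollama_with_tools, extract_tool_calls, and execute_tool
# are not defined here; they are assumed to be in scope already.

def single_tool_execution_loop(
    model: str,
    messages: list[dict],
    tools: list[dict],
    registry: ToolRegistry
) -> str:
    """Run one request through at most one tool call and return the final text.

    Note: appends to `messages` in place.
    """
    # Call model with current context
    response = call_ollama_with_tools(model, messages, tools)

    # Extract tool calls
    tool_calls = extract_tool_calls(response)

    if not tool_calls:
        # No tool call, return model text directly
        return response["message"]["content"]

    # Execute the tool (single tool case: only the first call is handled)
    call = tool_calls[0]
    schema = registry.get_schema(call["name"])

    # Validate and execute
    execution_result = execute_tool(
        call["name"],
        call["arguments"],
        schema
    )

    # One id shared by the call and its result so the two messages link up
    call_id = f"call_{call['name']}"

    # Add the assistant's tool call to message history.
    # OpenAI-style layout: arguments are serialized as a JSON string. If your
    # helper talks to an endpoint that expects a dict here, pass
    # call["arguments"] directly instead.
    messages.append({
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": call_id,
            "type": "function",
            "function": {
                "name": call["name"],
                "arguments": json.dumps(call["arguments"])
            }
        }]
    })

    # Add the tool result to message history
    messages.append({
        "role": "tool",
        "tool_call_id": call_id,
        "content": json.dumps(execution_result)
    })

    # Get final response with tool result
    final_response = call_ollama_with_tools(model, messages, tools)
    return final_response["message"]["content"]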

The message format follows the tool-use chat template. The assistant message carries the tool call, and the tool result arrives in a separate message that follows it. The result's tool_call_id must match the id of the call it answers; that shared value is what links the two.
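To make the linking concrete, here is a minimal sketch of the two messages appended for one tool call. The helper name `build_tool_messages`, the `get_weather` tool, and the uuid-based id are assumptions made for illustration, not part of the chapter's code. A random id is used here because an id derived only from the tool name would repeat if the same tool were called twice in one conversation.

```python
import json
import uuid


def build_tool_messages(name: str, arguments: dict, result: dict) -> list[dict]:
    """Return the assistant tool-call message and its matching tool-result message.

    Hypothetical helper; the uuid-based id scheme is an assumption.
    """
    call_id = f"call_{uuid.uuid4().hex[:8]}"
    assistant_msg = {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": call_id,
            "type": "function",
            "function": {"name": name, "arguments": json.dumps(arguments)},
        }],
    }
    tool_msg = {
        "role": "tool",
        "tool_call_id": call_id,  # same id as the call above
        "content": json.dumps(result),
    }
    return [assistant_msg, tool_msg]


# Message history after one user turn and one completed tool call
history = [{"role": "user", "content": "What's the weather in Oslo?"}]
history.extend(build_tool_messages(
    "get_weather",
    {"city": "Oslo"},
    {"success": True, "result": {"temp_c": 4}},
))
```

After this runs, `history` holds three messages in the order user, assistant, tool, which is the state the model sees on the second call.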

Handle execution failures:

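def handle_tool_error(tool_name: str, error: Exception) -> dict:
    """Convert an exception into a compact, JSON-serializable error payload."""
    return {
        "success": False,
        "tool": tool_name,
        "error": str(error),
        "error_type": type(error).__name__,
        # Timeouts and connection problems are treated as worth retrying
        "recoverable": isinstance(error, (TimeoutError, ConnectionError))
    }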

# In execution loop
try:
    result = registry.execute(call["name"], call["arguments"])
    execution_result = {"success": True, "result": result}
except Exception as e:
    execution_result = handle_tool_error(call["name"], e)

Error messages returned to the model should be descriptive enough for the model to attempt correction but not so detailed that they consume context. Include the error type and a brief description.
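One way to keep error payloads short is to truncate the description before it goes back to the model. The sketch below is illustrative only: the function name `compact_error` and the 200-character limit are assumptions, not values taken from the chapter.

```python
def compact_error(error: Exception, max_chars: int = 200) -> dict:
    """Build a short error payload: the exception type plus a truncated description."""
    # Collapse newlines and repeated spaces so tracebacks or multi-line
    # messages do not waste context.
    text = " ".join(str(error).split())
    if len(text) > max_chars:
        text = text[: max_chars - 3] + "..."
    return {
        "success": False,
        "error_type": type(error).__name__,
        "error": text,
    }


# A long error message is cut down to the limit
long_error = ValueError("city not found: " + "x" * 500)
payload = compact_error(long_error)
```

The model still sees the error type and the start of the message, which is usually the part that says what went wrong.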

Retry logic handles transient failures:

import time

def execute_with_retry(
    name: str,
    arguments: dict,
    schema: dict,
    max_retries: int = 3
) -> dict:
    for attempt in range(max_retries):
        result = execute_tool(name, arguments, schema)

        if result.get("success"):
            return result

        # Stop immediately on errors that a retry cannot fix
        if not result.get("recoverable", True):
            return result

        # Exponential backoff (1s, 2s, ...); skip the wait after the last attempt
        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)

    return {"success": False, "error": "Max retries exceeded"}
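The retry behavior can be checked without a real tool or real delays. The following is a self-contained sketch under stated assumptions: `retry_call`, `flaky_tool`, and the injectable `sleep` parameter are hypothetical names introduced for this example. Unlike the function above, this variant returns the last failed result when attempts run out, so the final error stays visible.

```python
import time
from typing import Callable


def retry_call(
    fn: Callable[[], dict],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fn until it succeeds, reports a non-recoverable error, or attempts run out."""
    result: dict = {"success": False, "error": "No attempts made"}
    for attempt in range(max_retries):
        result = fn()
        if result.get("success") or not result.get("recoverable", True):
            return result
        if attempt < max_retries - 1:
            sleep(2 ** attempt)  # exponential backoff between attempts
    return result


# Stub tool that fails with a recoverable error twice, then succeeds
attempts = {"count": 0}


def flaky_tool() -> dict:
    attempts["count"] += 1
    if attempts["count"] < 3:
        return {"success": False, "error": "timeout", "recoverable": True}
    return {"success": True, "result": "ok"}


# Record the requested delays instead of actually sleeping
delays: list[float] = []
outcome = retry_call(flaky_tool, sleep=delays.append)
```

Passing the sleep function as a parameter keeps tests fast and lets you assert on the backoff schedule directly.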
EXERCISE

Build a single tool execution system with a weather lookup tool. Test with valid inputs, invalid city names, and missing required arguments. Verify error messages reach the model correctly.
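As a possible starting point for the exercise, here is a stub weather tool that covers the three test cases. Everything in it is hypothetical: the `get_weather` name, the schema layout, the error type labels, and the in-memory `FAKE_WEATHER` data stand in for whatever tool and data source you choose.

```python
# Hypothetical in-memory data so the example runs without a network call
FAKE_WEATHER = {"oslo": {"temp_c": 4, "conditions": "rain"}}

# Assumed schema shape: a JSON-Schema-style parameter description
WEATHER_SCHEMA = {
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(arguments: dict) -> dict:
    """Look up weather for a city, returning a success or error payload."""
    required = WEATHER_SCHEMA["parameters"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        return {
            "success": False,
            "error_type": "MissingArgument",
            "error": f"Missing required argument(s): {', '.join(missing)}",
        }

    city = str(arguments["city"]).strip().lower()
    if city not in FAKE_WEATHER:
        return {
            "success": False,
            "error_type": "UnknownCity",
            "error": f"No weather data for city '{arguments['city']}'",
        }

    return {"success": True, "result": FAKE_WEATHER[city]}
```

Serialize each returned payload with `json.dumps` into the tool message content, then check that the model's final answer reflects the error rather than inventing a forecast.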
