Ollama Function Calling — Function Calling for Local Models (Chapter 4)

Ollama exposes function calling through the /api/chat endpoint. The request carries a tools array alongside the usual messages, and the model's reply may contain structured tool calls instead of, or in addition to, plain text.

Function calling only works with models that were trained for tool use. Models like Llama 3.1, Mistral, and Command-R have function calling trained into their instruction following. Check the model's documentation, or send a simple tool call and inspect the response, to verify support.
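A support check along these lines can be sketched as follows. The `probe_tool_support` helper, the `echo` probe tool, and the three result labels are hypothetical names introduced for illustration. The `send` argument is assumed to be any callable that posts to /api/chat and returns the parsed response, and the sketch assumes the server raises an error for models without tool support, which may vary by version.

```python
from typing import Callable

# Hypothetical no-op tool used only for probing; not part of Ollama.
PROBE_TOOL = {
    "type": "function",
    "function": {
        "name": "echo",
        "description": "Return the given text unchanged.",
        "parameters": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
}


def probe_tool_support(send: Callable[[str, list, list], dict], model: str) -> str:
    """Classify a model as 'calls_tools', 'no_tool_call', or 'rejected'."""
    messages = [
        {"role": "user", "content": "Use the echo tool to echo the word hello."}
    ]
    try:
        response = send(model, messages, [PROBE_TOOL])
    except Exception:
        # Assumption: an unsupported model surfaces as an error from `send`.
        return "rejected"
    message = response.get("message") or {}
    return "calls_tools" if message.get("tool_calls") else "no_tool_call"
```

A result of "no_tool_call" is not conclusive: a capable model may still choose to answer in plain text, so it is worth repeating the probe with a different prompt before ruling a model out.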

Define tools in the request payload:

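import requests

def call_ollama_with_tools(model: str, messages: list, tools: list) -> dict:
    """Send a non-streaming chat request that advertises the given tools."""
    # Default address of a locally running Ollama server.
    url = "http://localhost:11434/api/chat"

    payload = {
        "model": model,
        "messages": messages,
        "tools": tools,
        "stream": False,  # return one complete JSON object
    }

    # A timeout keeps a stalled server from blocking the caller forever.
    response = requests.post(url, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()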

The tools parameter accepts an array of tool definitions. Each definition names a function, describes it, and specifies its parameters as a JSON Schema object. Ollama handles the formatting instructions internally and guides the model to produce valid calls.
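As a minimal sketch of one such definition, the following shows a weather tool. The function name `get_weather`, its description, and its parameters are hypothetical and not tied to any real weather API; only the overall layout (a "function" entry whose "parameters" field is a JSON Schema object) is the point.

```python
# Hypothetical tool definition: the name, description, and parameters
# are illustrative only.
weather_tool_schema = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, for example Boston",
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                },
            },
            # Only the city is mandatory; the unit is optional.
            "required": ["city"],
        },
    },
}
```

Short descriptions and a small set of parameters tend to be easier for smaller local models to follow than deeply nested schemas.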

Extract tool calls from the response:

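def extract_tool_calls(response: dict) -> list[dict]:
    """Return the name and arguments of each tool call in a chat response."""
    # "message" is a single object (dict), not a list of messages.
    message = response.get("message") or {}

    tool_calls = []
    # "tool_calls" is absent when the model answers in plain text.
    for call in message.get("tool_calls") or []:
        function = call["function"]
        tool_calls.append({
            "name": function["name"],
            "arguments": function["arguments"],
        })

    return tool_calls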

# Example response parsing
# (weather_tool_schema is assumed to be a tool definition dict defined elsewhere)
response = call_ollama_with_tools(
    "llama3.1:8b",
    [{"role": "user", "content": "What's the weather in Boston?"}],
    [weather_tool_schema]
)

tool_calls = extract_tool_calls(response)
for call in tool_calls:
    print(f"Tool: {call['name']}, Args: {call['arguments']}")

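Ollama returns tool calls in the tool_calls field of the message object. Each call contains the function name and its arguments. On the native /api/chat endpoint the arguments arrive as an already-parsed JSON object, whereas OpenAI-style responses carry them as a JSON string that must be parsed before execution. In either case, validate the arguments before running the tool.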
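One way to act on the extracted calls is sketched below. The `run_tool_calls` helper, the `registry` mapping, and the stub `get_weather` function are hypothetical names introduced for illustration. The sketch accepts arguments as either a parsed object or a JSON string, and it assumes results are sent back as messages with the "tool" role in a follow-up request.

```python
import json
from typing import Any, Callable


def run_tool_calls(
    tool_calls: list[dict], registry: dict[str, Callable[..., Any]]
) -> list[dict]:
    """Execute each requested tool and return 'tool' role messages."""
    results = []
    for call in tool_calls:
        func = registry.get(call["name"])
        if func is None:
            content = json.dumps({"error": f"unknown tool: {call['name']}"})
        else:
            try:
                arguments = call["arguments"]
                if isinstance(arguments, str):
                    # Some responses carry arguments as a JSON string.
                    arguments = json.loads(arguments)
                content = json.dumps(func(**arguments))
            except Exception as exc:
                # Report the failure to the model rather than crashing.
                content = json.dumps({"error": str(exc)})
        results.append({"role": "tool", "content": content})
    return results


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Stub tool returning made-up data, for demonstration only."""
    return {"city": city, "temperature": 20, "unit": unit}
```

The returned messages can be appended to the conversation, after the assistant message that requested the tools, and sent in a second chat request so the model can phrase a final answer.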

Streaming function calls work with the stream: true option, though the response then arrives as a sequence of chunks, and parsing it requires handling partial JSON. Start with non-streaming requests, and move to streaming only after the non-streaming path has been tested thoroughly.
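A minimal sketch of assembling a streamed reply follows. The `collect_stream` helper is a hypothetical name. The sketch assumes the stream is newline-delimited JSON in which every line is a complete object and each tool call arrives whole within a single chunk; if a server emits arguments in fragments, additional buffering would be needed.

```python
import json
from typing import Iterable, Union


def collect_stream(lines: Iterable[Union[bytes, str]]) -> dict:
    """Merge streamed chat chunks into one assistant message dict."""
    content_parts = []
    tool_calls = []
    for raw in lines:
        if not raw:
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        message = chunk.get("message") or {}
        # Text arrives in pieces; tool calls are assumed to arrive whole.
        content_parts.append(message.get("content") or "")
        tool_calls.extend(message.get("tool_calls") or [])
        if chunk.get("done"):
            break
    return {
        "role": "assistant",
        "content": "".join(content_parts),
        "tool_calls": tool_calls,
    }
```

With the requests library, the lines could come from `response.iter_lines()` on a request made with `stream=True`, which already splits the body on newlines so that each item is one chunk.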

Common failures with Ollama function calling include model incompatibility (some models lack tool-use training), malformed JSON in arguments (models may struggle with complex schemas), and context length overflow (large tool definitions consume prompt space). Monitor these failure modes and implement appropriate fallback strategies.
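As one possible guard against malformed arguments, the sketch below checks a call against the parameters section of its tool definition before execution. The `check_arguments` helper is a hypothetical name, and the checks cover only required keys, unexpected keys, and top-level types; it is not a full JSON Schema validator.

```python
import json
from typing import Union

# Maps JSON Schema type names to Python types for a shallow check.
# Limitation: Python treats bool as an int, so True passes as "integer".
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_arguments(
    arguments: Union[str, dict], parameters: dict
) -> tuple[dict, list[str]]:
    """Return (parsed_arguments, problems); an empty list means no problems found."""
    if isinstance(arguments, str):
        try:
            arguments = json.loads(arguments)
        except json.JSONDecodeError as exc:
            return {}, [f"arguments are not valid JSON: {exc.msg}"]
    if not isinstance(arguments, dict):
        return {}, ["arguments must be a JSON object"]

    problems = []
    for name in parameters.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")

    properties = parameters.get("properties", {})
    for name, value in arguments.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = _JSON_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return arguments, problems
```

When problems are reported, reasonable fallbacks include sending the problem list back to the model and asking it to retry, or answering without the tool. Which is appropriate depends on the application.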