05. Function Calling in Ollama

Chapter 5 of 16 · 20 min

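Ollama supports tool calling through its chat API: you pass a tools parameter that describes the functions the model may use. When the model decides a tool is needed, it returns structured tool calls instead of, or alongside, a text response. Ollama itself executes nothing. The client is responsible for dispatching each call and sending the result back to the model.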
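To make the round trip concrete, here is a rough sketch of what the message history can look like after one tool call. The message contents are illustrative rather than captured output, and the tool_name key on the tool message is an assumption about the client version in use.

```python
# Illustrative message history for one tool-calling round trip.
# Contents are made up for this example, not recorded from a real model.
conversation = [
    # 1. The user asks a question.
    {"role": "user", "content": "What is 17 * 23?"},
    # 2. The model replies with a structured tool call instead of text.
    {"role": "assistant", "content": "", "tool_calls": [
        {"function": {"name": "calculator",
                      "arguments": {"expression": "17 * 23"}}}
    ]},
    # 3. The client runs the tool and reports the result
    #    ("tool_name" is an assumed key; check your client version).
    {"role": "tool", "tool_name": "calculator", "content": "391"},
    # 4. The model turns the result into a final answer.
    {"role": "assistant", "content": "17 * 23 = 391"},
]

for turn in conversation:
    print(turn["role"])
```

Steps 2 and 3 are the part the client has to implement. The rest of this chapter covers them in turn.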

Basic Ollama tool calling

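# Requires the ollama Python package and a running Ollama server
import ollama

def chat_with_tools(model: str, messages: list, tools: list):
    """Send one non-streaming chat request that advertises the given tools."""
    response = ollama.chat(
        model=model,
        messages=messages,
        tools=tools,
        stream=False
    )
    return response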

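# Each tool is described with a JSON Schema for its parameters
tools = [
    {
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Evaluate an arithmetic expression written in Python syntax and return the result.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "A Python arithmetic expression, e.g. '17 * 23'"
                    }
                },
                "required": ["expression"]
            }
        }
    }
]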

messages = [{"role": "user", "content": "What is 17 * 23?"}]
response = chat_with_tools("llama3.2", messages, tools)

print(response.message.tool_calls)
# Typical output (exact form depends on the model and client version):
# [ToolCall(function=Function(name='calculator', arguments={'expression': '17 * 23'}))]

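When the model decides to call a tool, response.message.tool_calls contains a list of calls, each with a function name and a dictionary of arguments. When the model answers directly, this field is None and the reply is in response.message.content.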
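Because the field is either a list or None, it can be convenient to normalize it before looping. The helper below is a minimal sketch of that idea. get_tool_calls is a hypothetical name, and the SimpleNamespace objects only mimic the shape of a response message so the example runs without a server.

```python
from types import SimpleNamespace

def get_tool_calls(message):
    """Return the message's tool calls as a list (empty if there are none)."""
    calls = getattr(message, "tool_calls", None)
    return list(calls) if calls else []

# Stand-ins that mimic the shape of a response message (illustrative only).
plain = SimpleNamespace(content="391", tool_calls=None)
with_call = SimpleNamespace(
    content="",
    tool_calls=[
        SimpleNamespace(function=SimpleNamespace(
            name="calculator", arguments={"expression": "17 * 23"}))
    ],
)

print(len(get_tool_calls(plain)))                   # 0
print(get_tool_calls(with_call)[0].function.name)   # calculator
```

With this in place, calling code can write a plain for loop over get_tool_calls(response.message) without a separate None check.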

Executing and feeding back results

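if response.message.tool_calls:
    # Record the assistant turn once, including every tool call it made
    messages.append(response.message)

    for call in response.message.tool_calls:
        fn = call.function
        if fn.name == "calculator":
            # WARNING: eval() runs arbitrary code. Acceptable for a local demo,
            # never for untrusted input.
            result = eval(fn.arguments["expression"])
        else:
            result = f"Error: unknown tool {fn.name}"

        # Tool calls from the ollama client expose only .function (no id),
        # so the result is identified by tool name. Older client versions
        # may ignore the tool_name key.
        messages.append({
            "role": "tool",
            "tool_name": fn.name,
            "content": str(result)
        })

    # Continue the conversation with results
    follow_up = ollama.chat(model="llama3.2", messages=messages, tools=tools)
    print(follow_up.message.content)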
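The eval call in the example above is fine for a local experiment but unsafe once the expression comes from a model you do not fully control. One possible alternative, sketched here, walks the expression's syntax tree and permits only numeric literals and a small set of arithmetic operators. safe_calculate, TOOL_REGISTRY, and dispatch are hypothetical names introduced for this sketch, not part of Ollama.

```python
import ast
import operator

# Only these operators are allowed; everything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,  # note: huge exponents can still be slow
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_calculate(expression: str):
    """Evaluate an arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")
    return _eval(ast.parse(expression, mode="eval"))

# Map tool names to handlers so new tools do not need another if-branch.
TOOL_REGISTRY = {
    "calculator": lambda args: safe_calculate(args["expression"]),
}

def dispatch(name: str, arguments: dict) -> str:
    """Run the named tool and return its result (or an error) as a string."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return f"Error: unknown tool {name!r}"
    try:
        return str(handler(arguments))
    except Exception as exc:
        # Report failures as text so the model can see and react to them.
        return f"Error: {exc}"

print(dispatch("calculator", {"expression": "17 * 23"}))  # 391
```

Returning errors as strings rather than raising is a design choice: the error text goes back to the model as the tool result, which gives it a chance to retry with a corrected expression.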

Models that support tool calling

Tool calling in Ollama requires a model that was fine-tuned for it. The Llama 3.1 and Llama 3.2 models have this capability. Some quantized variants may perform poorly at it, so test with the full-precision model before experimenting with smaller variants. That way you know whether a failure comes from your tool definitions or from the model.
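One rough way to compare variants is to run the same prompts against each and count how often a tool call comes back. The sketch below assumes a chat function that accepts model, messages, and tools keyword arguments, as ollama.chat does. tool_call_rate is a hypothetical helper, and fake_chat with its "model-a" and "model-b" names is a stand-in so the example runs without a server.

```python
from types import SimpleNamespace

def tool_call_rate(chat_fn, model, prompts, tools):
    """Fraction of prompts for which the model emitted at least one tool call."""
    if not prompts:
        raise ValueError("prompts must not be empty")
    hits = 0
    for prompt in prompts:
        response = chat_fn(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            tools=tools,
        )
        if response.message.tool_calls:
            hits += 1
    return hits / len(prompts)

# Stand-in for ollama.chat (hypothetical behaviour): "model-a" always
# emits a tool call, every other model never does.
def fake_chat(model, messages, tools):
    calls = None
    if model == "model-a":
        calls = [SimpleNamespace(function=SimpleNamespace(
            name="calculator", arguments={"expression": "1 + 1"}))]
    return SimpleNamespace(message=SimpleNamespace(content="", tool_calls=calls))

prompts = ["What is 17 * 23?", "What is 144 / 12?"]
for name in ("model-a", "model-b"):
    print(name, tool_call_rate(fake_chat, name, prompts, tools=[]))
```

Against a real server, pass ollama.chat as chat_fn and the chapter's tools list as tools. The rate only shows whether calls were emitted, not whether their arguments were sensible, so it is a first filter rather than a full evaluation.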

Streaming with tool calls

Ollama supports streaming responses even with tool calls:

stream = ollama.chat(
    model="llama3.2",
    messages=messages,
    tools=tools,
    stream=True
)

for chunk in stream:
    if chunk.message.tool_calls:
        print("Tool call generated during stream")
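In practice you usually want to keep what the stream produced, not just notice it. A minimal sketch, assuming each chunk exposes message.content and message.tool_calls as in the loop above: collect_stream is a hypothetical helper, and fake_stream imitates chunks so the example runs offline.

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Accumulate streamed text and tool calls from an iterable of chunks."""
    text_parts, tool_calls = [], []
    for chunk in chunks:
        message = chunk.message
        if message.content:
            text_parts.append(message.content)   # partial text
        if message.tool_calls:
            tool_calls.extend(message.tool_calls)  # calls seen so far
    return "".join(text_parts), tool_calls

def _chunk(content="", tool_calls=None):
    # Builds an object shaped like a streamed chunk (illustrative only).
    return SimpleNamespace(
        message=SimpleNamespace(content=content, tool_calls=tool_calls))

fake_stream = [
    _chunk(content="Let me "),
    _chunk(content="compute that."),
    _chunk(tool_calls=[SimpleNamespace(function=SimpleNamespace(
        name="calculator", arguments={"expression": "17 * 23"}))]),
]

text, calls = collect_stream(fake_stream)
print(text)                      # Let me compute that.
print(calls[0].function.name)    # calculator
```

With a real stream, pass the iterator returned by ollama.chat(..., stream=True) directly to collect_stream, then dispatch the collected calls as in the non-streaming case.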

EXERCISE

Start the Ollama server, define the calculator tool, and test a multi-step calculation that requires chaining tool calls (e.g., "What is the square root of (15 + 10) multiplied by 7?").