Agent Loop — Custom Agent Frameworks (Chapter 3)

The agent loop is the heart of the runtime. It's deceptively simple: generate, execute, evaluate, repeat. But the details—how you handle partial outputs, tool failures, context overflow—determine whether your agent is reliable or brittle.

Loop phases:

async def step(self, iteration: int) -> LoopPhase:
    # Phase 1: Generate
    response = await self.llm.chat(
        messages=self.memory.get_recent(max_tokens=self.llm.context_window - 500),
        tools=self.tools.schemas()
    )
    
    # Phase 2: Execute (if tool calls present)
    if response.tool_calls:
        results = await self.tools.execute_batch(response.tool_calls)
        self.memory.add_tool_results(response.tool_calls, results)
        return LoopPhase.TOOLS_EXECUTED
    
    # Phase 3: Evaluate (no tools = final response or error)
    if response.finish_reason == "stop":
        final = response.content
        self.memory.add_message(role="assistant", content=final)
        return LoopPhase.COMPLETED
    
    return LoopPhase.ERROR

The loop terminates in three ways: the model signals completion (finish_reason="stop"), tools were executed and we loop again, or an error condition (empty response, API failure, context overflow).

Failure mode: context overflow. If memory accumulates too much context, the LLM input exceeds its context window. You must implement truncation strategy—typically keeping the most recent messages plus a system prompt, or implementing semantic compression. The code above uses get_recent with a token budget, but this loses important earlier context. Chapter 9 covers smarter approaches.

Failure mode: tool execution failures. If a tool times out or throws an exception, you need error handling that informs the next generation. Don't silently swallow failures:

async def execute_batch(self, calls: list[ToolCall]) -> list[ToolResult]:
    results = []
    for call in calls:
        try:
            result = await asyncio.wait_for(
                self.tools.get(call.name)(**call.arguments),
                timeout=30.0
            )
            results.append(ToolResult(success=True, output=result))
        except asyncio.TimeoutError:
            results.append(ToolResult(
                success=False,
                error="Tool execution timed out after 30 seconds"
            ))
        except Exception as e:
            results.append(ToolResult(
                success=False,
                error=f"Tool execution failed: {type(e).__name__}: {str(e)}"
            ))
    return results

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.