Supervisor Agent — LangGraph for Local Agents (Chapter 10)

The supervisor agent is a specialized node that uses a language model to make routing decisions instead of hard-coded if/else logic. This makes the team more flexible: a model-based supervisor can read the task description and the worker outputs, then decide dynamically which worker to call next, without you pre-programming every branch.

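from langchain_ollama import ChatOllama, OllamaLLM
from langgraph.prebuilt import create_react_agent

# create_react_agent binds tools to the model, so the worker needs a chat model
# with tool-calling support (ChatOllama), not the text-completion OllamaLLM.
llm = ChatOllama(model="llama3.2:latest", temperature=0.3)
# The supervisor only returns text, so a completion model at temperature 0 is enough.
llm_for_supervisor = OllamaLLM(model="llama3.2:latest", temperature=0)

# search_web, read_file, and write_code are assumed to be tools defined elsewhere.
worker_tools = [search_web, read_file, write_code]
worker_app = create_react_agent(model=llm, tools=worker_tools)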

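def model_supervisor(state: SupervisorState) -> SupervisorState:
    # Build the routing prompt from the shared state
    prompt = build_supervisor_prompt(state)
    # OllamaLLM.invoke returns the completion as a plain string
    response = llm_for_supervisor.invoke(prompt)
    # Parse response to extract routing decision
    decision = parse_routing_decision(response)
    # Return a partial state update; the routing function reads next_worker
    return {"next_worker": decision}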

The prompt must include enough context from the shared state for the supervisor to make an informed decision. Include the original task, completed worker results, pending assignments, and a description of available workers. Without this context, the supervisor model guesses rather than reasons.
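
One way to assemble that context is sketched below. The state field names (task, results, pending) and the WORKER_DESCRIPTIONS mapping are assumptions made for illustration, not a fixed schema; adapt them to your own SupervisorState.

```python
from typing import TypedDict


class SupervisorState(TypedDict, total=False):
    # Hypothetical field names; align them with your actual state schema.
    task: str
    results: dict[str, str]  # worker name -> output produced so far
    pending: list[str]       # assignments not yet completed
    next_worker: str


# Hypothetical descriptions shown to the supervisor model.
WORKER_DESCRIPTIONS = {
    "researcher": "Searches the web and reads files to gather facts.",
    "coder": "Writes or edits code based on gathered facts.",
    "done": "Choose this when the task is complete.",
}


def build_supervisor_prompt(state: SupervisorState) -> str:
    """Assemble a routing prompt from the shared state."""
    results = state.get("results", {})
    pending = state.get("pending", [])
    # Use explicit placeholders so the model sees that a section is empty.
    result_lines = [f"- {name}: {output}" for name, output in results.items()] or ["- (none yet)"]
    pending_lines = [f"- {item}" for item in pending] or ["- (none)"]
    worker_lines = [f"- {name}: {desc}" for name, desc in WORKER_DESCRIPTIONS.items()]
    return "\n".join([
        "You are a supervisor routing work between workers.",
        f"Original task: {state.get('task', '')}",
        "Completed worker results:",
        *result_lines,
        "Pending assignments:",
        *pending_lines,
        "Available workers:",
        *worker_lines,
        "Reply with exactly one worker name from the list above and nothing else.",
    ])
```

Ending the prompt with a strict output instruction keeps parsing simple, but small local models may still add extra words, so the reply should be validated regardless.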

The main failure mode is a hallucinated routing decision: the model returns a worker name that does not exist in the graph. Validate the decision against the set of valid worker names in the routing function:

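from typing import Literal

VALID_WORKERS = {"researcher", "coder", "done"}

def safe_route(state: SupervisorState) -> Literal["researcher", "coder", "done"]:
    # Read the decision the supervisor node already wrote to the state.
    # Calling model_supervisor here would return a dict (not a name) and
    # trigger a second model call. Assumes a dict-like (TypedDict) state.
    decision = state.get("next_worker")
    if decision not in VALID_WORKERS:
        return "done"  # fallback to termination rather than crash
    return decision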
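
Validation is easier when the raw model text is normalized first. The following is a minimal sketch of what parse_routing_decision could look like, assuming the response is a plain string (as returned by OllamaLLM) and that picking the first valid name mentioned is acceptable. The fallback parameter is a hypothetical addition for illustration.

```python
import re

VALID_WORKERS = {"researcher", "coder", "done"}


def parse_routing_decision(response: str, fallback: str = "done") -> str:
    """Return the first valid worker name found in the response, else the fallback."""
    text = response.strip().lower()
    # Fast path: the model followed instructions and returned only a name.
    if text in VALID_WORKERS:
        return text
    # Otherwise scan whole words in order of appearance, ignoring punctuation.
    for word in re.findall(r"[a-z_]+", text):
        if word in VALID_WORKERS:
            return word
    return fallback
```

The first-mention heuristic can misfire on replies such as "not done yet, send to coder", which would resolve to "done". If that risk matters, accept only exact matches and treat everything else as invalid.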

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.