What this does

A multi-agent supervisor workflow is a hierarchical system in which a supervisor agent delegates tasks to specialized sub-agents and validates their outputs. The supervisor analyzes each incoming request, routes it to the appropriate agent (for example a researcher, coder, or analyst), collects the responses, and synthesizes a final answer. The architecture is intended to improve output quality by matching tasks to purpose-built agents and adding a quality-control layer.

Steps

Define the sub-agents. Write each sub-agent as a class or function with its own system prompt and tool set. For example, create ResearcherAgent with web search and document retrieval tools, CoderAgent with code execution and file system access, and AnalystAgent with data processing tools.
Build the supervisor's routing function. It takes the user query and returns a structured decision such as { "next_agent": "researcher", "instructions": "Search for recent papers on topic X" }.
Implement the orchestration loop. The supervisor receives the query → selects an agent → the agent executes and returns its result → the supervisor evaluates it → the loop repeats or finalizes.
Add a quality check. The supervisor validates each output against the requirements before presenting it.
Share state between steps. Implement an AgentState object with the fields messages: List[Message], current_agent: str, results: Dict[str, Any], and iteration: int.
Cap the iterations. Set a maximum iteration count (default 10) to prevent infinite delegation loops.
Express the workflow with the framework's graph builder. In LangGraph, define a node for each agent and edges for the routing logic.
Add a human-in-the-loop checkpoint. The supervisor pauses for approval on high-stakes decisions.
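
The routing function, shared state, quality check, and iteration cap described above can be sketched in plain Python before committing to a framework. This is a minimal sketch, not a LangGraph implementation: the stub agents, the keyword-based route function, the "FINISH" sentinel, and passes_quality_check are all hypothetical stand-ins, and plain strings replace the Message type. A real supervisor would ask a model for the routing decision.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

MAX_ITERATIONS = 10  # default cap named in the steps


@dataclass
class AgentState:
    messages: List[str] = field(default_factory=list)  # strings stand in for Message
    current_agent: str = ""
    results: Dict[str, Any] = field(default_factory=dict)
    iteration: int = 0


# Hypothetical stub agents; real ones would call a model with their tools.
def researcher(instructions: str) -> str:
    return f"[research notes] {instructions}"


def coder(instructions: str) -> str:
    return f"[code] {instructions}"


def analyst(instructions: str) -> str:
    return f"[analysis] {instructions}"


AGENTS: Dict[str, Callable[[str], str]] = {
    "researcher": researcher,
    "coder": coder,
    "analyst": analyst,
}


def route(query: str, state: AgentState) -> Dict[str, str]:
    """Toy router: pick the first keyword-matched agent that has no result yet."""
    keywords = [("research", "researcher"), ("write", "coder"), ("analyze", "analyst")]
    lowered = query.lower()
    for keyword, agent in keywords:
        if keyword in lowered and agent not in state.results:
            return {"next_agent": agent, "instructions": query}
    return {"next_agent": "FINISH", "instructions": ""}


def passes_quality_check(output: str) -> bool:
    """Explicit validation rule only (assumed here: output must be non-empty)."""
    return bool(output.strip())


def run_supervisor(query: str, max_iterations: int = MAX_ITERATIONS) -> AgentState:
    state = AgentState(messages=[query])
    while True:
        decision = route(query, state)
        if decision["next_agent"] == "FINISH":
            break
        if state.iteration >= max_iterations:
            # Force termination with a fallback message instead of looping forever.
            state.messages.append("Stopped: iteration limit reached.")
            break
        state.current_agent = decision["next_agent"]
        output = AGENTS[state.current_agent](decision["instructions"])
        state.iteration += 1
        if passes_quality_check(output):
            state.results[state.current_agent] = output
            state.messages.append(output)
    return state
```

Calling run_supervisor with a query that mentions research, writing, and analysis visits the three stub agents in that order. An output that fails the check is not stored, so the router selects the same agent again until the cap stops the loop.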

Confirm the local starting state. Print the active binary, package version, model name, or configuration path before changing the workflow.
Run the smallest complete path. Execute the minimum command or script that proves the guide works end to end on the local machine.
Compare against expected output. Check the final line, status code, generated artifact, or model response against the verification section before expanding the setup.
Record the local run evidence. Save the exact command, runtime or package version, model name if applicable, and observed output so the result can be reproduced later.
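
One way to capture the run evidence described in these steps is a small helper that gathers the interpreter path, Python version, package versions, and observed output into a JSON record. This is a sketch under assumptions: the function names, the record layout, and the placeholder command and model name are illustrative, and langgraph is listed only because the steps mention it.

```python
import json
import platform
import sys
from datetime import datetime, timezone
from importlib import metadata


def package_version(name: str) -> str:
    """Return the installed version of a package, or a marker if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def collect_run_evidence(command, model_name, observed_output, packages=("langgraph",)):
    """Bundle what is needed to reproduce a run into one JSON-serializable dict."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "command": command,
        "python_executable": sys.executable,  # shows which environment is active
        "python_version": platform.python_version(),
        "packages": {name: package_version(name) for name in packages},
        "model_name": model_name,
        "observed_output": observed_output,
    }


# Placeholder values; substitute the real command, model, and output.
evidence = collect_run_evidence("python run_supervisor.py", "example-model", "ok")
print(json.dumps(evidence, indent=2))
```

Saving the printed record next to the workflow code makes it easier to tell a version mismatch from a genuine routing bug later.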

Verification

Send a complex query that requires multiple agent skills: "Research the latest Python async libraries, write a benchmark script, and analyze the results." Verify that the supervisor routes to the researcher first, then the coder, then the analyst. Check that the final output contains research citations, executable code, and a numerical analysis. Run five diverse queries and confirm that the routing logic selects appropriate agents each time. Inspect the AgentState at completion: all fields should be populated, with no error entries.
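
The routing-order and final-state checks can be automated with two small helpers. This sketch assumes the state is available as a plain dict with the four AgentState fields, and that an "error entry" is a result stored as a dict containing an "error" key. The source does not define the error format, so that convention is hypothetical.

```python
from typing import Any, Dict, List

EXPECTED_ORDER = ["researcher", "coder", "analyst"]
REQUIRED_FIELDS = ("messages", "current_agent", "results", "iteration")


def routing_matches(visited: List[str], expected: List[str] = EXPECTED_ORDER) -> bool:
    """True when agents were visited in exactly the expected sequence."""
    return visited == expected


def state_problems(state: Dict[str, Any]) -> List[str]:
    """List every problem found in a final state; an empty list means it passed."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not state.get(name):
            problems.append(f"field '{name}' is empty or missing")
    for agent, result in (state.get("results") or {}).items():
        # Assumed convention: a failed agent stores {"error": "..."} as its result.
        if isinstance(result, dict) and "error" in result:
            problems.append(f"agent '{agent}' reported an error: {result['error']}")
    return problems
```

Running state_problems on the state from each of the five queries gives a readable list of what to fix, rather than a single pass or fail flag.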

Common failures

Supervisor routes to wrong agent - Improve the routing prompt with few-shot examples of correct routing decisions.
Agents produce incompatible output formats - Standardize all agent outputs to a common schema with type and content fields.
Infinite delegation loop - Check the iteration counter in the supervisor loop and force termination with a fallback message.
State corruption between concurrent runs - Use thread-safe data structures or isolate state per session ID.
Supervisor overrides correct agent output - Adjust the quality check threshold so that it flags only outputs that fail explicit validation rules, not outputs judged on subjective quality.

Version mismatch - The installed package or runtime differs from the command shown; check the version first and rerun the smallest verification command.
Local environment drift - Another service, virtual environment, model, or path is being used; print the active binary path and configuration before changing the guide steps.
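
Two of the fixes above, a common output schema and per-session state isolation, can be sketched together. The names AgentOutput, normalize_output, and SessionStore are hypothetical, and the per-session state layout is an assumption. The lock protects only the session lookup, so the sketch assumes each session is driven by one thread at a time.

```python
import threading
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class AgentOutput:
    type: str     # e.g. "research", "code", "analysis"
    content: str


def normalize_output(raw: Any, default_type: str) -> AgentOutput:
    """Coerce whatever an agent returned into the common type/content schema."""
    if isinstance(raw, AgentOutput):
        return raw
    if isinstance(raw, dict) and "type" in raw and "content" in raw:
        return AgentOutput(type=str(raw["type"]), content=str(raw["content"]))
    return AgentOutput(type=default_type, content=str(raw))


class SessionStore:
    """Keeps one independent state dict per session ID."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._states: Dict[str, Dict[str, Any]] = {}

    def get(self, session_id: str) -> Dict[str, Any]:
        # The lock makes create-or-fetch atomic across threads.
        with self._lock:
            if session_id not in self._states:
                self._states[session_id] = {"messages": [], "results": {}, "iteration": 0}
            return self._states[session_id]
```

Normalizing at the boundary where an agent returns keeps the supervisor's quality check simple, because it only ever sees one shape of output.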

Related guides

implement-human-in-the-loop-ai-agents
build-langgraph-agent-scratch
setup-agent-tool-use-function-calling