Multi-Agent Research Team Project — LangGraph for Local Agents (Chapter 18)

This final chapter assembles everything from the previous 17 chapters into a single production-style multi-agent research team. The ResearchTeam consists of a model-based supervisor, two specialized workers (web_searcher and report_writer), persistent checkpointing to SQLite, human-in-the-loop approval before report finalization, and structured streaming output.

START → supervisor → [conditional routing]
                        ├── web_searcher → supervisor (loop)
                        ├── report_writer → supervisor (loop)
                        ├── finalize [human approval] → END
                        └── done → END

The supervisor uses an Ollama model to decide which worker to invoke next based on the shared state. Each worker is a ReAct subgraph with its own tools. Checkpointing to SQLite enables session resumption. The interrupt before finalization pauses for human approval.

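from operator import add
from typing import Annotated, Literal, TypedDict

from langchain_core.messages import BaseMessage
from langchain_ollama import ChatOllama
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import StateGraph, END, START
from langgraph.graph.message import add_messages
from langgraph.managed import RemainingSteps
from langgraph.prebuilt import create_react_agent
from langgraph.types import Command

class ResearchState(TypedDict):
    query: str
    findings: Annotated[list[str], add]  # new findings are appended, not overwritten
    draft_report: str | None
    approved: bool | None
    next_action: Literal["web_searcher", "report_writer", "finalize", "done"]
    # add_messages merges by message ID, so messages handed back by a worker
    # subgraph are not appended a second time.
    messages: Annotated[list[BaseMessage], add_messages]
    # Recent LangGraph releases require this key when create_react_agent
    # is given a custom state_schema.
    remaining_steps: RemainingSteps

# create_react_agent needs a chat model with tool-calling support,
# so use ChatOllama rather than the completion-style OllamaLLM.
llm = ChatOllama(model="llama3.2:latest", temperature=0)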

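# Workers. The tools passed below (web_search, save_findings, write_file,
# read_file) are assumed to be defined elsewhere.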
searcher = create_react_agent(
    model=llm,
    tools=[web_search, save_findings],
    state_schema=ResearchState
)
writer = create_react_agent(
    model=llm,
    tools=[write_file, read_file],
    state_schema=ResearchState
)

# Supervisor prompt
SUPERVISOR_PROMPT = """You are the research team supervisor.

Task: {query}
Completed findings: {findings}
Draft: {draft_report}

Decide the next action. Reply with exactly one of these words:
- "web_searcher" if more research is needed
- "report_writer" if you have enough to draft or revise the report
- "finalize" if the report is complete and ready for review
- "done" if no more work is needed
"""

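def supervisor_node(state: ResearchState) -> dict:
    prompt = SUPERVISOR_PROMPT.format(
        query=state["query"],
        findings=state["findings"],
        draft_report=state["draft_report"] or "No draft yet"
    )
    response = llm.invoke(prompt)
    # Chat models return a message object; completion models return a plain string.
    text = getattr(response, "content", response)
    # parse_action is a helper (not shown here) that extracts the action name.
    decision = parse_action(text)
    # Fall back to "done" on an unrecognized reply so the graph cannot loop forever.
    if decision not in {"web_searcher", "report_writer", "finalize", "done"}:
        decision = "done"
    # Return only the key this node updates; LangGraph merges it into the state.
    return {"next_action": decision}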

def approval_router(state: ResearchState) -> Literal["done", "revise"]:
    # Runs after finalize. The returned keys must match the routing map below:
    # an approved report ends the run, a rejected one returns to the supervisor.
    return "done" if state.get("approved") else "revise"

builder = StateGraph(ResearchState)
builder.add_node("supervisor", supervisor_node)
builder.add_node("web_searcher", searcher)
builder.add_node("report_writer", writer)
builder.add_node("finalize", finalize_node)  # finalize_node is assumed to be defined elsewhere

# Routing
builder.add_edge(START, "supervisor")
builder.add_conditional_edges(
    "supervisor",
    lambda s: s["next_action"],
    {
        "web_searcher": "web_searcher",
        "report_writer": "report_writer",
        "finalize": "finalize",
        "done": END
    }
)
builder.add_edge("web_searcher", "supervisor")  # Loop back
builder.add_edge("report_writer", "supervisor")  # Loop back
builder.add_conditional_edges(
    "finalize",
    approval_router,
    {"done": END, "revise": "supervisor"}
)

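# Recent langgraph-checkpoint-sqlite releases make from_conn_string a context
# manager, so build the saver from an explicit connection instead.
import sqlite3
conn = sqlite3.connect("./research_team.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)
app = builder.compile(checkpointer=checkpointer, interrupt_before=["finalize"])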
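
The supervisor node calls parse_action, which the listing does not define. The following is a minimal sketch of one way to write it, assuming the model names the action somewhere in its reply; the name VALID_ACTIONS and the whole-word matching strategy are illustrative choices, not taken from the original project.

```python
import re

# Hypothetical constant: the action names the supervisor prompt offers.
VALID_ACTIONS = ("web_searcher", "report_writer", "finalize", "done")

def parse_action(text: str) -> str:
    """Return the first valid action named in the model's reply, or "done"."""
    # Local models often wrap the answer in quotes or a full sentence, so
    # search for whole-word action names instead of comparing the entire reply.
    pattern = r"\b(" + "|".join(VALID_ACTIONS) + r")\b"
    match = re.search(pattern, text.lower())
    # An unparseable reply ends the run rather than looping indefinitely.
    return match.group(1) if match else "done"
```

Matching the first action name keeps the parser tolerant of chatty replies, at the cost of misreading a reply that mentions a rejected option before the chosen one. Asking the model for a single word, or for JSON, reduces that risk.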

To run the team:

config = {"configurable": {"thread_id": "research_session_1"}}
initial_state = {
    "query": "What are the latest developments in local LLM inference optimization?",
    "findings": [],
    "draft_report": None,
    "approved": None,
    "next_action": "web_searcher",
    "messages": []
}

for step in app.stream(initial_state, config=config, stream_mode="updates"):
    print(f"Step: {list(step.keys())}")

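# The run pauses before finalize. interrupt_before is a static breakpoint, so
# after human review, record the decision in the state and resume with None.
# (Command(resume=...) pairs with interrupt() calls made inside a node.)
app.update_state(config, {"approved": True})
app.invoke(None, config=config)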
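
Before approving, the reviewer needs to see what the paused run is waiting on. The sketch below formats a state snapshot for review. It relies on the `next` and `values` attributes of the snapshot returned by `app.get_state(config)`. The function name summarize_pause and the stand-in snapshot are hypothetical, used so the example runs without a compiled graph.

```python
from types import SimpleNamespace

def summarize_pause(snapshot) -> str:
    """Describe where a paused run stopped and what the reviewer should read."""
    # snapshot.next lists the nodes that will run on resume;
    # snapshot.values holds the current graph state.
    pending = ", ".join(snapshot.next) or "nothing (run finished)"
    draft = snapshot.values.get("draft_report") or "(no draft yet)"
    return f"Pending: {pending}\nDraft:\n{draft}"

# Stand-in for app.get_state(config); a real snapshot exposes the same attributes.
fake_snapshot = SimpleNamespace(
    next=("finalize",),
    values={"draft_report": "Draft text"},
)
print(summarize_pause(fake_snapshot))
```

With the real graph, the call would be summarize_pause(app.get_state(config)), made after the stream loop stops and before the run is resumed.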

The assembled project demonstrates checkpointed multi-agent loops, model-based routing, subgraph workers, human-in-the-loop interrupts, and structured streaming, all running against local Ollama models with no cloud dependency. Extend this architecture by adding specialized workers (code reviewer, fact-checker, citation generator), replacing SQLite with Postgres for multi-machine deployment, or wrapping the app in a FastAPI service with SSE streaming.
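
As a starting point for the SSE extension, the sketch below turns one streamed update into a Server-Sent Events frame. It assumes the `stream_mode="updates"` shape used in this chapter, where each step maps a node name to the partial state that node returned. The function name to_sse is hypothetical.

```python
import json

def to_sse(step: dict) -> str:
    """Format one stream update as a Server-Sent Events frame."""
    # Each update is keyed by the node that produced it.
    node = next(iter(step), "unknown")
    # default=str keeps non-JSON values (such as message objects) from raising.
    payload = json.dumps(step.get(node), default=str)
    # An SSE frame is a set of "field: value" lines terminated by a blank line.
    return f"event: {node}\ndata: {payload}\n\n"

print(to_sse({"supervisor": {"next_action": "web_searcher"}}), end="")
```

In a web service, a generator that yields to_sse(step) for each step of app.stream(...) could be returned as a streaming response with the text/event-stream media type.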