18. Multi-Agent Research Team Project
This final chapter assembles everything from the previous 17 chapters into a single production-style multi-agent research team. The architecture is a ResearchTeam with a model-based supervisor, two specialized workers (web_searcher and report_writer), persistent checkpointing to SQLite, human-in-the-loop approval before report finalization, and structured streaming output.
START → supervisor → [conditional routing]
├── web_searcher → supervisor (loop)
├── report_writer → supervisor (loop)
├── finalize (pauses for human approval) → END
└── done → END
The supervisor uses an Ollama model to decide which worker to invoke next based on the shared state. Each worker is a ReAct subgraph with its own tools. Checkpointing to SQLite enables session resumption. The interrupt before finalization pauses for human approval.
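The control flow described above can be sketched in plain Python before any LangGraph code is involved. This is only an illustration under stated assumptions: `rule_based_supervisor` is a hypothetical stand-in for the model call, the two workers are stubs, and `MAX_STEPS` is an invented safety cap, none of which come from the project itself.

```python
from typing import Callable

MAX_STEPS = 10  # hypothetical cap so a confused supervisor cannot loop forever


def rule_based_supervisor(state: dict) -> str:
    """Stand-in for the model call: pick the next action from the shared state."""
    if not state["findings"]:
        return "web_searcher"
    if state["draft_report"] is None:
        return "report_writer"
    return "finalize"


def fake_searcher(state: dict) -> dict:
    # Stub worker: appends one placeholder finding.
    return {**state, "findings": state["findings"] + ["stub finding"]}


def fake_writer(state: dict) -> dict:
    # Stub worker: drafts a report from whatever findings exist.
    return {**state, "draft_report": f"Report on {len(state['findings'])} finding(s)"}


WORKERS: dict[str, Callable[[dict], dict]] = {
    "web_searcher": fake_searcher,
    "report_writer": fake_writer,
}


def run_team(state: dict) -> tuple[dict, list[str]]:
    """Loop supervisor -> worker -> supervisor until finalize/done or the step cap."""
    visited: list[str] = []
    for _ in range(MAX_STEPS):
        action = rule_based_supervisor(state)
        visited.append(action)
        if action in ("finalize", "done"):
            break
        state = WORKERS[action](state)
    return state, visited


final_state, path = run_team({"query": "demo", "findings": [], "draft_report": None})
print(path)
```

The graph built in this chapter has the same shape: every worker hands control back to the supervisor, and only the supervisor's decision can leave the loop.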
from operator import add
from typing import Annotated, Literal, TypedDict

from langchain_core.messages import BaseMessage
from langchain_ollama import ChatOllama
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import END, START, StateGraph
from langgraph.prebuilt import create_react_agent
from langgraph.types import Command

class ResearchState(TypedDict):
    query: str
    findings: Annotated[list[str], add]  # reducer: updates are appended, not overwritten
    draft_report: str | None
    approved: bool | None
    next_action: Literal["web_searcher", "report_writer", "finalize", "done"]
    messages: Annotated[list[BaseMessage], add]

# create_react_agent needs a chat model with tool calling, so use ChatOllama rather than OllamaLLM
llm = ChatOllama(model="llama3.2:latest", temperature=0)
# Workers (web_search, save_findings, write_file, and read_file are tools assumed to be defined elsewhere)
searcher = create_react_agent(
    model=llm,
    tools=[web_search, save_findings],
    state_schema=ResearchState,
)
writer = create_react_agent(
    model=llm,
    tools=[write_file, read_file],
    state_schema=ResearchState,
)
# Supervisor prompt
SUPERVISOR_PROMPT = """You are the research team supervisor.
Task: {query}
Completed findings: {findings}
Draft: {draft_report}
Decide next action:
- "web_searcher" if more research is needed
- "report_writer" if you have enough to draft or revise the report
- "finalize" if the report is complete and ready for review
- "done" if no more work is needed
"""
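def supervisor_node(state: ResearchState) -> dict:
    prompt = SUPERVISOR_PROMPT.format(
        query=state["query"],
        findings=state["findings"],
        draft_report=state["draft_report"] or "No draft yet",
    )
    response = llm.invoke(prompt)
    # Chat models return a message object; plain LLMs return a string
    text = getattr(response, "content", response)
    decision = parse_action(text)
    # Fall back to "done" on anything unrecognized so the graph cannot route to a missing node
    if decision not in {"web_searcher", "report_writer", "finalize", "done"}:
        decision = "done"
    # Return only the changed key; LangGraph merges it into the shared state
    return {"next_action": decision}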
def approval_router(state: ResearchState) -> Literal["supervisor", "done"]:
    # Runs after finalize: end on approval, otherwise hand back to the supervisor for revision
    return "done" if state.get("approved") else "supervisor"

builder = StateGraph(ResearchState)
builder.add_node("supervisor", supervisor_node)
builder.add_node("web_searcher", searcher)
builder.add_node("report_writer", writer)
builder.add_node("finalize", finalize_node)

# Routing
builder.add_edge(START, "supervisor")
builder.add_conditional_edges(
    "supervisor",
    lambda s: s["next_action"],
    {
        "web_searcher": "web_searcher",
        "report_writer": "report_writer",
        "finalize": "finalize",
        "done": END,
    },
)
builder.add_edge("web_searcher", "supervisor")  # Loop back
builder.add_edge("report_writer", "supervisor")  # Loop back
builder.add_conditional_edges(
    "finalize",
    approval_router,
    {"supervisor": "supervisor", "done": END},  # keys must match approval_router's return values
)
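import sqlite3

# Build the saver from an explicit connection; check_same_thread=False lets the graph use it across threads
conn = sqlite3.connect("./research_team.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)
app = builder.compile(checkpointer=checkpointer, interrupt_before=["finalize"])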
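The supervisor node calls `parse_action`, which the listing above does not define. A minimal sketch of one possible implementation follows, assuming the helper receives the model's reply as plain text and should return the first valid action it names. The name `VALID_ACTIONS` and the "first match wins" rule are assumptions made for illustration.

```python
import re

# Assumed to mirror the four routes the supervisor can choose
VALID_ACTIONS = ("web_searcher", "report_writer", "finalize", "done")


def parse_action(response: str) -> str:
    """Return the first valid action named in the model's reply, or "done"."""
    pattern = r"\b(" + "|".join(VALID_ACTIONS) + r")\b"
    match = re.search(pattern, response.lower())
    # Falling back to "done" matches the supervisor's own default for bad output
    return match.group(1) if match else "done"
```

Because the first match wins, a chatty reply such as "I am done thinking, use web_searcher" would resolve to "done". Instructing the model to answer with a single word reduces that risk.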
To run the team:
config = {"configurable": {"thread_id": "research_session_1"}}
initial_state = {
    "query": "What are the latest developments in local LLM inference optimization?",
    "findings": [],
    "draft_report": None,
    "approved": None,
    "next_action": "web_searcher",
    "messages": [],
}
for step in app.stream(initial_state, config=config, stream_mode="updates"):
    print(f"Step: {list(step.keys())}")

# The run pauses before finalize. After human review, record the decision and resume.
# Passing None as the input continues from the saved checkpoint for this thread_id.
app.update_state(config, {"approved": True})
app.invoke(None, config=config)
The assembled project demonstrates checkpointed multi-agent loops, model-based routing, subgraph workers, human-in-the-loop interrupts, and structured streaming, all running against local Ollama models with no cloud dependency. Extend this architecture by adding specialized workers (code reviewer, fact-checker, citation generator), replacing SQLite with Postgres for multi-machine deployment, or wrapping the app in a FastAPI service with SSE streaming.
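For the SSE extension, the per-step dictionaries produced in "updates" streaming mode would need to be encoded as Server-Sent Events frames. The sketch below shows only that encoding step with the standard library. The helper name `to_sse` and the closing `end` event are assumptions, not part of the project.

```python
import json
from typing import Iterable, Iterator


def to_sse(updates: Iterable[dict]) -> Iterator[str]:
    """Encode each update dict as one Server-Sent Events frame."""
    for update in updates:
        # default=str keeps non-JSON values (such as message objects) from raising
        payload = json.dumps(update, default=str)
        yield f"data: {payload}\n\n"
    # Hypothetical terminator so a client knows the run has finished
    yield "event: end\ndata: {}\n\n"
```

In a FastAPI service, a generator like this could be handed to a streaming response with the media type `text/event-stream`, with the graph's stream as the `updates` argument.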
Run the research team with the query "What are the trade-offs between GGUF and EXL2 quantizations?". Confirm that the supervisor loops between the searcher and the writer, that the interrupt fires before finalize, and that execution resumes correctly after human approval. Then open research_team.db, the database file the checkpointer writes, to inspect the stored checkpoint history.
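One way to start inspecting the checkpoint database is to list its tables and row counts with the standard `sqlite3` module. This sketch makes no assumption about the schema the checkpointer creates; it reads whatever tables exist. The helper name `summarize_sqlite` is hypothetical.

```python
import sqlite3


def summarize_sqlite(path: str) -> dict[str, int]:
    """Map each table in the database to its row count."""
    conn = sqlite3.connect(path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names come from sqlite_master itself, so quoting them is sufficient
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in tables
        }
    finally:
        conn.close()
```

Calling `summarize_sqlite("./research_team.db")` after a run should show the row counts growing as the supervisor loop takes more steps.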