RUNLOCALAI · v38
Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
LangGraph for Local Agents

18. Multi-Agent Research Team Project

Chapter 18 of 18 · 20 min
KEY INSIGHT

This final architecture, a supervisor dispatching workers with persistent memory and human approval gates, is a production-style pattern for local multi-agent systems. Every component from the previous 17 chapters earns its place here.

This final chapter assembles the techniques from the previous 17 chapters into a single production-style multi-agent research team. The architecture is a ResearchTeam graph with a model-based supervisor, two specialized workers (web_searcher and report_writer), persistent checkpointing to SQLite, human-in-the-loop approval before report finalization, and structured streaming output.

START → supervisor → [conditional routing]
                        ├── web_searcher → supervisor (loop)
                        ├── report_writer → supervisor (loop)
                        ├── finalize (pauses for human approval) → END
                        └── done → END

The supervisor uses an Ollama model to decide which worker to invoke next based on the shared state. Each worker is a ReAct subgraph with its own tools. Checkpointing to SQLite enables session resumption. The interrupt before finalization pauses for human approval.

from operator import add
from typing import Annotated, Literal, TypedDict

from langchain_core.messages import BaseMessage
from langchain_ollama import ChatOllama
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import StateGraph, END, START
from langgraph.prebuilt import create_react_agent
from langgraph.types import Command

class ResearchState(TypedDict):
    query: str
    # Annotated with add so each worker's findings are appended, not overwritten.
    findings: Annotated[list[str], add]
    draft_report: str | None
    approved: bool | None
    next_action: Literal["web_searcher", "report_writer", "finalize", "done"]
    messages: Annotated[list[BaseMessage], add]

# create_react_agent needs a chat model that supports tool calling, so use ChatOllama.
llm = ChatOllama(model="llama3.2:latest", temperature=0)

# Workers (web_search, save_findings, write_file, and read_file are tools this listing does not define)
searcher = create_react_agent(
    model=llm,
    tools=[web_search, save_findings],
    state_schema=ResearchState
)
writer = create_react_agent(
    model=llm,
    tools=[write_file, read_file],
    state_schema=ResearchState
)

# Supervisor prompt
SUPERVISOR_PROMPT = """You are the research team supervisor.

Task: {query}
Completed findings: {findings}
Draft: {draft_report}

Decide next action:
- "web_searcher" if more research is needed
- "report_writer" if you have enough to draft or revise the report
- "finalize" if the report is complete and ready for review
- "done" if no more work is needed
"""

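def supervisor_node(state: ResearchState) -> dict:
    # Ask the model which worker should run next, given the shared state.
    prompt = SUPERVISOR_PROMPT.format(
        query=state["query"],
        findings=state["findings"],
        draft_report=state["draft_report"] or "No draft yet"
    )
    response = llm.invoke(prompt)
    # Chat models return a message object, plain LLMs return a string; handle both.
    text = getattr(response, "content", response)
    # parse_action is a helper this listing does not define.
    decision = parse_action(text)
    # Fall back to stopping when the model returns an unrecognized action.
    if decision not in {"web_searcher", "report_writer", "finalize", "done"}:
        decision = "done"
    # Return only the key this node updates.
    return {"next_action": decision}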

def approval_router(state: ResearchState) -> Literal["finalize", "done"]:
    # Return a key of the routing map passed to add_conditional_edges, not END itself.
    return "finalize" if state.get("approved") else "done"

builder = StateGraph(ResearchState)
builder.add_node("supervisor", supervisor_node)
builder.add_node("web_searcher", searcher)
builder.add_node("report_writer", writer)
builder.add_node("finalize", finalize_node)

# Routing
builder.add_edge(START, "supervisor")
builder.add_conditional_edges(
    "supervisor",
    lambda s: s["next_action"],
    {
        "web_searcher": "web_searcher",
        "report_writer": "report_writer",
        "finalize": "finalize",
        "done": END
    }
)
builder.add_edge("web_searcher", "supervisor")  # Loop back
builder.add_edge("report_writer", "supervisor")  # Loop back
builder.add_conditional_edges(
    "finalize",
    approval_router,
    {"finalize": "finalize", "done": END}
)

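import sqlite3

# Pass an open connection; check_same_thread=False lets the saver use it from other threads.
checkpointer = SqliteSaver(sqlite3.connect("./research_team.db", check_same_thread=False))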
app = builder.compile(checkpointer=checkpointer, interrupt_before=["finalize"])
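
The supervisor node calls parse_action, which the listing above does not define. The sketch below is one hypothetical way to write it, assuming the model replies in free text that mentions one of the four action names. The name VALID_ACTIONS and the choice to return an empty string on failure are assumptions, not part of the course code; an empty string simply falls through to the supervisor's "done" fallback.

```python
import re

# Action names the supervisor prompt allows (assumed to mirror the routing map).
VALID_ACTIONS = ("web_searcher", "report_writer", "finalize", "done")


def parse_action(text: str) -> str:
    """Return the earliest allowed action name mentioned in the reply, or "" if none."""
    lowered = text.lower()
    best_pos, best = len(lowered) + 1, ""
    for action in VALID_ACTIONS:
        # Word boundaries stop "done" from matching inside words like "abandoned".
        match = re.search(rf"\b{action}\b", lowered)
        if match and match.start() < best_pos:
            best_pos, best = match.start(), action
    return best
```

Small local models often wrap the answer in a sentence or quotes, so matching the first known name is more forgiving than requiring an exact string. A stricter alternative would be to ask the model for JSON and validate it.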

To run the team:

config = {"configurable": {"thread_id": "research_session_1"}}
initial_state = {
    "query": "What are the latest developments in local LLM inference optimization?",
    "findings": [],
    "draft_report": None,
    "approved": None,
    "next_action": "web_searcher",
    "messages": []
}

for step in app.stream(initial_state, config=config, stream_mode="updates"):
    print(f"Step: {list(step.keys())}")

# The run pauses before finalize. After human review, record the decision and resume:
app.update_state(config, {"approved": True})
app.invoke(None, config=config)

The assembled project demonstrates checkpointed multi-agent loops, model-based routing, subgraph workers, human-in-the-loop interrupts, and structured streaming—all running against local Ollama models with no cloud dependency. Extend this architecture by adding additional specialized workers (code reviewer, fact-checker, citation generator), replacing SQLite with Postgres for multi-machine deployment, or wrapping the app in a FastAPI service with SSE streaming.
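
When adding workers, the supervisor prompt and the routing map both need the new name. The sketch below is one hypothetical way to derive both from a single registry so they cannot drift apart. WORKERS, build_route_map, build_action_lines, and the fact_checker entry are illustrative names, not part of the course code, and END is hard-coded here only to keep the example free of dependencies.

```python
# In real code, import END from langgraph.graph; its value is the string "__end__".
END = "__end__"

# Hypothetical registry: worker name -> when the supervisor should pick it.
WORKERS = {
    "web_searcher": "if more research is needed",
    "report_writer": "if you have enough to draft or revise the report",
    "fact_checker": "if claims in the draft need verification",  # example addition
}


def build_route_map(workers: dict[str, str]) -> dict[str, str]:
    """Build the mapping passed to add_conditional_edges for the supervisor."""
    route_map = {name: name for name in workers}
    route_map["finalize"] = "finalize"
    route_map["done"] = END
    return route_map


def build_action_lines(workers: dict[str, str]) -> str:
    """Build the 'Decide next action' bullet list for the supervisor prompt."""
    lines = [f'- "{name}" {when}' for name, when in workers.items()]
    lines.append('- "finalize" if the report is complete and ready for review')
    lines.append('- "done" if no more work is needed')
    return "\n".join(lines)
```

Each new worker would still need its own add_node call and an edge back to the supervisor; the registry only keeps the prompt and the routing keys in sync.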

EXERCISE

Run the research team with the query "What are the trade-offs between GGUF and EXL2 quantizations?". Confirm that the supervisor loops between searcher and writer, that the interrupt fires before finalize, and that the run resumes correctly after human approval. Then read research_team.db, the SQLite file the checkpointer writes to, to inspect the stored checkpoint history.
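
For the last step of the exercise, a minimal sketch using only the standard library is shown below. It makes no assumption about the checkpointer's table names, which are an internal detail; it lists whatever tables exist and counts their rows. The function name summarize_sqlite is hypothetical.

```python
import sqlite3


def summarize_sqlite(path: str) -> dict[str, int]:
    """Map each table in the SQLite file at `path` to its row count."""
    conn = sqlite3.connect(path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Quote table names so unusual identifiers do not break the query.
        return {
            table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            for table in tables
        }
    finally:
        conn.close()
```

Calling summarize_sqlite("./research_team.db") after a run should show the row counts growing with each step. To read the checkpoints as graph states instead of raw rows, app.get_state_history(config) iterates over the saved snapshots for a thread.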
