Research Agent Team — Multi-Agent Systems (Chapter 22)

Research teams use multi-agent architectures to conduct thorough investigations. Agents specialize in different research phases (query formulation, source discovery, data extraction, synthesis, and citation management) and coordinate to produce well-documented research outputs.

Specialized Roles

Query Architect: Refines research questions into optimal search strategies.

Source Navigator: Discovers and prioritizes relevant information sources.

Data Extractor: Pulls relevant facts and figures from documents.

Synthesis Engine: Combines extracted information into coherent insights.

Citation Manager: Formats references and maintains source attribution.
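
Each role can be given an explicit interface so that implementations are swappable. The following is a minimal sketch using typing.Protocol. The method names mirror the calls made by the orchestrator in research_team/team.py; the parameter and return types are assumptions, and EchoArchitect is a hypothetical stub included only to show that a plain class satisfies the protocol.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class QueryArchitect(Protocol):
    async def refine(self, query: Any) -> list[Any]: ...


@runtime_checkable
class SourceNavigator(Protocol):
    async def find_sources(self, subquery: Any) -> list[Any]: ...


@runtime_checkable
class DataExtractor(Protocol):
    async def extract(self, source: Any, query: Any) -> list[Any]: ...


@runtime_checkable
class SynthesisEngine(Protocol):
    async def synthesize(self, facts: list[Any], query: Any) -> Any: ...


@runtime_checkable
class CitationManager(Protocol):
    async def index_sources(self, sources: list[Any]) -> None: ...
    async def format_citations(self, synthesis: Any) -> Any: ...


class EchoArchitect:
    """Hypothetical stub: returns the query unchanged as a single subquery."""

    async def refine(self, query: Any) -> list[Any]:
        return [query]


# runtime_checkable only verifies that the methods exist, not their signatures.
print(isinstance(EchoArchitect(), QueryArchitect))
```

Protocols use structural typing, so an agent does not need to inherit from anything; it only needs to provide the expected methods.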

# research_team/team.py
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any

@dataclass
class ResearchQuery:
    question: str
    scope: list[str]  # topics to cover
    depth: str  # "surface", "standard", "deep"
    constraints: dict = field(default_factory=dict)

@dataclass
class Source:
    url: str
    title: str
    credibility_score: float
    retrieved_at: datetime
    content_snippet: str

@dataclass
class ExtractedFact:
    source: Source
    claim: str
    supporting_evidence: str
    extracted_at: datetime

class ResearchOrchestrator:
    def __init__(
        self,
        query_architect: Any,
        source_navigator: Any,
        data_extractor: Any,
        synthesis_engine: Any,
        citation_manager: Any
    ):
        self.query_architect = query_architect
        self.source_navigator = source_navigator
        self.data_extractor = data_extractor
        self.synthesis_engine = synthesis_engine
        self.citation_manager = citation_manager
    
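    async def conduct_research(self, query: ResearchQuery) -> dict:
        """Run the full pipeline: refine, discover, extract, synthesize, cite."""
        # Phase 1: break the research question into searchable subqueries
        refined_queries = await self.query_architect.refine(query)

        all_sources = []
        all_facts = []

        # Phases 2 and 3: find sources per subquery, then extract facts from each
        for subquery in refined_queries:
            sources = await self.source_navigator.find_sources(subquery)
            all_sources.extend(sources)

            for source in sources:
                # Extraction is guided by the original query, not the subquery
                facts = await self.data_extractor.extract(source, query)
                all_facts.extend(facts)

        # Register every source before synthesis so citations can be resolved
        await self.citation_manager.index_sources(all_sources)

        # Phase 4: combine the extracted facts into findings
        synthesis = await self.synthesis_engine.synthesize(
            facts=all_facts,
            query=query
        )

        # Phase 5: format references for the synthesized findings
        citations = await self.citation_manager.format_citations(synthesis)

        return {
            "query": query.question,
            "findings": synthesis,
            "sources": [s.url for s in all_sources],
            "citations": citations,
            "confidence": self._calculate_confidence(all_sources)
        }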
    
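    def _calculate_confidence(self, sources: list[Source]) -> float:
        """Mean source credibility plus a small source-count bonus, capped at 1.0."""
        if not sources:
            return 0.0
        avg_credibility = sum(s.credibility_score for s in sources) / len(sources)
        # 0.01 per source, capped at 0.1; this counts sources, not distinct domains
        diversity_bonus = min(0.1, len(sources) * 0.01)
        return min(1.0, avg_credibility + diversity_bonus)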

Parallel Research Streams

When subqueries are independent, their research threads can run concurrently rather than one after another, as they do in the conduct_research loop above. The orchestrator then merges the results, deduplicating overlapping findings and combining insights from different angles.
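
A minimal sketch of this pattern uses asyncio.gather to run the streams concurrently and a merge step to drop duplicates. The names Finding, run_stream, merge_findings, and research_in_parallel are hypothetical, and run_stream is a stand-in for the real source discovery and extraction calls. Deduplication here is by exact match on normalized claim text and source URL, which is an assumption; a production system would likely need fuzzier matching.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    claim: str
    source_url: str


async def run_stream(subquery: str) -> list[Finding]:
    """Hypothetical stand-in for one research stream (discovery + extraction)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    shared = Finding("Agents can be composed into teams", "https://example.org/a")
    own = Finding(f"Finding specific to {subquery}", f"https://example.org/{subquery}")
    return [shared, own]


def merge_findings(streams: list[list[Finding]]) -> list[Finding]:
    """Flatten stream results, dropping duplicates and keeping first-seen order."""
    seen: set[tuple[str, str]] = set()
    merged: list[Finding] = []
    for stream in streams:
        for finding in stream:
            key = (finding.claim.strip().lower(), finding.source_url)
            if key not in seen:
                seen.add(key)
                merged.append(finding)
    return merged


async def research_in_parallel(subqueries: list[str]) -> list[Finding]:
    # gather() runs the streams concurrently and returns results in input order
    results = await asyncio.gather(*(run_stream(q) for q in subqueries))
    return merge_findings(list(results))


findings = asyncio.run(research_in_parallel(["x", "y"]))
print(len(findings))  # 3: the shared finding is kept once
```

Because gather preserves input order, the merged list is deterministic even though the streams interleave at run time.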

Source Credibility Scoring

Sources receive credibility scores based on domain authority, publication recency, and citation patterns. Higher-credibility sources carry more weight during synthesis.
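
One way to combine the three signals is a weighted sum clamped to [0, 1]. The sketch below is illustrative only. The SourceSignals type, the weights (0.5, 0.3, 0.2), the one-year recency half-life, and the citation cap of 100 are all assumptions chosen for the example, not values taken from the text. The synthesis_weights helper shows one possible reading of "synthesis weighting": normalizing scores so they sum to 1.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SourceSignals:
    domain_authority: float  # assumed to be pre-normalized to [0, 1]
    published_at: datetime
    citation_count: int


def credibility_score(
    signals: SourceSignals,
    now: datetime,
    half_life_days: float = 365.0,  # assumption: recency halves every year
    citation_cap: int = 100,  # assumption: citations saturate at 100
    weights: tuple[float, float, float] = (0.5, 0.3, 0.2),  # assumed weights
) -> float:
    """Weighted sum of authority, recency, and citations, clamped to [0, 1]."""
    age_days = max(0.0, (now - signals.published_at).total_seconds() / 86400)
    recency = 0.5 ** (age_days / half_life_days)  # exponential decay with age
    citations = min(signals.citation_count, citation_cap) / citation_cap
    w_auth, w_rec, w_cit = weights
    score = w_auth * signals.domain_authority + w_rec * recency + w_cit * citations
    return max(0.0, min(1.0, score))


def synthesis_weights(scores: list[float]) -> list[float]:
    """Normalize credibility scores into weights that sum to 1."""
    if not scores:
        return []
    total = sum(scores)
    if total == 0:
        return [1 / len(scores)] * len(scores)  # fall back to equal weights
    return [s / total for s in scores]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
fresh = SourceSignals(domain_authority=1.0, published_at=now, citation_count=100)
stale = SourceSignals(
    domain_authority=0.0, published_at=now - timedelta(days=365), citation_count=0
)
scores = [credibility_score(fresh, now), credibility_score(stale, now)]
print(scores, synthesis_weights(scores))
```

The resulting score is in the same range as the credibility_score field on Source, so it could populate that field directly.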