22. Research Agent Team
Research teams use multi-agent architectures to conduct thorough investigations. Agents specialize in different research phases (query formulation, source discovery, data extraction, synthesis, and citation management) and coordinate to produce well-documented research outputs.
Specialized Roles
Query Architect: Refines research questions into optimal search strategies.
Source Navigator: Discovers and prioritizes relevant information sources.
Data Extractor: Pulls relevant facts and figures from documents.
Synthesis Engine: Combines extracted information into coherent insights.
Citation Manager: Formats references and maintains source attribution.
# research_team/team.py
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any


@dataclass
class ResearchQuery:
    question: str
    scope: list[str]  # topics to cover
    depth: str  # "surface", "standard", "deep"
    constraints: dict = field(default_factory=dict)


@dataclass
class Source:
    url: str
    title: str
    credibility_score: float
    retrieved_at: datetime
    content_snippet: str


@dataclass
class ExtractedFact:
    source: Source
    claim: str
    supporting_evidence: str
    extracted_at: datetime


class ResearchOrchestrator:
    def __init__(
        self,
        query_architect: Any,
        source_navigator: Any,
        data_extractor: Any,
        synthesis_engine: Any,
        citation_manager: Any,
    ):
        self.query_architect = query_architect
        self.source_navigator = source_navigator
        self.data_extractor = data_extractor
        self.synthesis_engine = synthesis_engine
        self.citation_manager = citation_manager

    async def conduct_research(self, query: ResearchQuery) -> dict:
        # Phase 1: break the research question into targeted subqueries.
        refined_queries = await self.query_architect.refine(query)

        all_sources = []
        all_facts = []

        # Phases 2 and 3: find sources for each subquery, then extract facts
        # from every source with the original query as context.
        for subquery in refined_queries:
            sources = await self.source_navigator.find_sources(subquery)
            all_sources.extend(sources)
            for source in sources:
                facts = await self.data_extractor.extract(source, query)
                all_facts.extend(facts)

        # Index sources before synthesis so citations can be resolved later.
        await self.citation_manager.index_sources(all_sources)

        # Phase 4: combine the extracted facts into findings.
        synthesis = await self.synthesis_engine.synthesize(
            facts=all_facts,
            query=query,
        )

        # Phase 5: format citations for the synthesized findings.
        citations = await self.citation_manager.format_citations(synthesis)

        return {
            "query": query.question,
            "findings": synthesis,
            "sources": [s.url for s in all_sources],
            "citations": citations,
            "confidence": self._calculate_confidence(all_sources),
        }

    def _calculate_confidence(self, sources: list[Source]) -> float:
        if not sources:
            return 0.0
        # Mean credibility plus a small bonus for source count, capped at 1.0.
        avg_credibility = sum(s.credibility_score for s in sources) / len(sources)
        diversity_bonus = min(0.1, len(sources) * 0.01)
        return min(1.0, avg_credibility + diversity_bonus)
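The orchestrator's constructor does not constrain what each role object must provide. One way to make the expected interface explicit is a set of typing.Protocol classes, sketched below. The method names are taken from the calls in conduct_research; the Protocol class names, the parameter and return types, and the EchoQueryArchitect stub are assumptions made for illustration.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class QueryArchitect(Protocol):
    async def refine(self, query: Any) -> list[Any]: ...


@runtime_checkable
class SourceNavigator(Protocol):
    async def find_sources(self, subquery: Any) -> list[Any]: ...


@runtime_checkable
class DataExtractor(Protocol):
    async def extract(self, source: Any, query: Any) -> list[Any]: ...


@runtime_checkable
class SynthesisEngine(Protocol):
    async def synthesize(self, facts: list[Any], query: Any) -> Any: ...


@runtime_checkable
class CitationManager(Protocol):
    async def index_sources(self, sources: list[Any]) -> None: ...

    # Return type is an assumption; the source does not specify it.
    async def format_citations(self, synthesis: Any) -> list[str]: ...


class EchoQueryArchitect:
    """Hypothetical stub: returns the query unchanged as the only subquery."""

    async def refine(self, query: Any) -> list[Any]:
        return [query]


architect = EchoQueryArchitect()
subqueries = asyncio.run(architect.refine("What drives coral bleaching?"))
```

Because the protocols are structural, a stub like this satisfies QueryArchitect without inheriting from it, which makes it easy to swap in test doubles for each role. Note that runtime isinstance checks against a Protocol only confirm that the methods exist, not their signatures.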
Parallel Research Streams
Independent research threads execute in parallel when queries don't overlap. The orchestrator merges results, deduplicating overlapping findings and combining insights from different angles.
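A minimal sketch of parallel streams with a merge step, using asyncio.gather. The function names (research_in_parallel, merge_findings, fake_find_facts), the dictionary shape of a finding, and the rule of deduplicating by normalized claim text are all assumptions for illustration; a real system would likely need a fuzzier notion of overlap.

```python
import asyncio
from typing import Awaitable, Callable

Finding = dict  # assumed shape: {"claim": str, "sources": list[str]}


def merge_findings(streams: list[list[Finding]]) -> list[Finding]:
    """Merge per-stream findings, deduplicating by normalized claim text."""
    merged: dict[str, Finding] = {}
    for findings in streams:
        for finding in findings:
            # Normalize case and whitespace so trivially different wordings match.
            key = " ".join(finding["claim"].lower().split())
            if key in merged:
                combined = set(merged[key]["sources"]) | set(finding["sources"])
                merged[key]["sources"] = sorted(combined)
            else:
                merged[key] = {
                    "claim": finding["claim"],
                    "sources": sorted(set(finding["sources"])),
                }
    return list(merged.values())


async def research_in_parallel(
    subqueries: list[str],
    find_facts: Callable[[str], Awaitable[list[Finding]]],
) -> list[Finding]:
    """Run one research stream per subquery concurrently, then merge."""
    streams = await asyncio.gather(*(find_facts(q) for q in subqueries))
    return merge_findings(list(streams))


async def fake_find_facts(subquery: str) -> list[Finding]:
    """Hypothetical stand-in for a source-discovery plus extraction stream."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    canned = {
        "causes": [
            {"claim": "Heat stress causes bleaching", "sources": ["https://example.com/1"]},
        ],
        "mechanisms": [
            {"claim": "heat stress  causes BLEACHING", "sources": ["https://example.org/2"]},
            {"claim": "Corals expel symbiotic algae", "sources": ["https://example.org/2"]},
        ],
    }
    return canned[subquery]


merged = asyncio.run(research_in_parallel(["causes", "mechanisms"], fake_find_facts))
```

Here the two streams report the same claim in different casing, so the merged result holds two findings, and the shared one lists both sources.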
Source Credibility Scoring
Sources receive credibility scores based on domain authority, publication recency, and citation patterns. Higher-credibility sources carry more weight during synthesis.
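One way to combine the three signals is a weighted sum, sketched below. The weights, the one-year half-life for recency, the citation saturation point, and all function names are assumptions chosen for illustration, not values given in the text.

```python
from datetime import datetime


def recency_score(published_at: datetime, now: datetime,
                  half_life_days: float = 365.0) -> float:
    """Exponential decay: the score halves every half_life_days (assumed)."""
    age_days = max(0.0, (now - published_at).total_seconds() / 86400)
    return 0.5 ** (age_days / half_life_days)


def citation_score(citation_count: int, saturation: int = 50) -> float:
    """Linear in citation count, capped at 1.0 once saturation is reached."""
    return min(1.0, max(0, citation_count) / saturation)


def credibility_score(
    domain_authority: float,  # assumed to be pre-normalized to [0, 1]
    published_at: datetime,
    citation_count: int,
    now: datetime,
    weights: tuple[float, float, float] = (0.5, 0.3, 0.2),  # assumed weights
) -> float:
    """Weighted sum of authority, recency, and citations, clamped to [0, 1]."""
    w_authority, w_recency, w_citations = weights
    score = (
        w_authority * domain_authority
        + w_recency * recency_score(published_at, now)
        + w_citations * citation_score(citation_count)
    )
    return max(0.0, min(1.0, score))


now = datetime(2024, 1, 1)
one_year_old = datetime(2023, 1, 1)  # exactly 365 days before `now`
score = credibility_score(0.8, one_year_old, citation_count=25, now=now)
```

With these assumed parameters, a one-year-old source with authority 0.8 and 25 citations scores 0.5 * 0.8 + 0.3 * 0.5 + 0.2 * 0.5 = 0.65. The result could populate Source.credibility_score, which the orchestrator averages when computing confidence.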
Implement a consensus detection system that identifies findings supported by multiple independent sources and flags contradictory claims for further investigation.
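A minimal sketch of such a consensus check follows. It assumes each extracted fact has already been normalized into a topic and a value (the hypothetical Claim type below), and it treats sources as independent when their URLs have different domains. Both are simplifying assumptions; real independence and contradiction checks would need more than string comparison.

```python
from collections import defaultdict
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Claim:
    """Hypothetical normalized fact: what it is about, what it asserts, and where from."""
    topic: str
    value: str
    source_url: str


def detect_consensus(claims: list[Claim], min_sources: int = 2) -> dict[str, list[str]]:
    """Classify each topic as consensus, contradictory, or unverified."""
    # topic -> value -> set of distinct source domains asserting that value
    domains: dict[str, dict[str, set[str]]] = defaultdict(lambda: defaultdict(set))
    for claim in claims:
        domain = urlparse(claim.source_url).netloc.lower()
        domains[claim.topic][claim.value].add(domain)

    report: dict[str, list[str]] = {
        "consensus": [], "contradictory": [], "unverified": [],
    }
    for topic, values in domains.items():
        if len(values) > 1:
            # Sources disagree on the value: flag for further investigation.
            report["contradictory"].append(topic)
        elif len(next(iter(values.values()))) >= min_sources:
            # One value, backed by enough distinct domains.
            report["consensus"].append(topic)
        else:
            report["unverified"].append(topic)
    return report


claims = [
    Claim("boiling_point_c", "100", "https://a.example/water"),
    Claim("boiling_point_c", "100", "https://b.example/physics"),
    Claim("founding_year", "1998", "https://a.example/history"),
    Claim("founding_year", "1999", "https://c.example/about"),
    Claim("employee_count", "250", "https://a.example/team"),
    Claim("employee_count", "250", "https://a.example/jobs"),
]
report = detect_consensus(claims)
```

In this example the boiling point is agreed on by two domains, the founding year is contradicted, and the employee count stays unverified because both supporting pages share one domain.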