Agent Tracing — Multi-Agent Systems (Chapter 14)

Distributed tracing provides end-to-end visibility across agent interactions. Unlike simple logging, traces capture causality chains—showing exactly how one agent's output influenced another agent's behavior.

Trace Context Propagation

When an agent spawns a sub-agent or calls another agent, trace context must propagate. This creates a tree structure where each span represents an agent operation and edges represent causal relationships.

# tracing/context.py
from contextvars import ContextVar
from dataclasses import dataclass, field
from typing import Optional
import uuid

trace_id: ContextVar[str] = ContextVar('trace_id', default="")
span_id: ContextVar[str] = ContextVar('span_id', default="")
parent_span_id: ContextVar[str] = ContextVar('parent_span_id', default="")

@dataclass
class TraceContext:
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]
    tags: dict = field(default_factory=dict)
    
    @classmethod
    def new(cls) -> 'TraceContext':
        tid = trace_id.get() or str(uuid.uuid4())
        current_span = span_id.get()
        
        return cls(
            trace_id=tid,
            span_id=str(uuid.uuid4()),
            parent_span_id=current_span or None
        )
    
    def inject(self) -> dict:
        return {
            "trace_id": self.trace_id,
            "span_id": self.span_id,
            "parent_span_id": self.parent_span_id
        }
    
    @classmethod
    def extract(cls, headers: dict) -> 'TraceContext':
        return cls(
            trace_id=headers.get("trace_id", str(uuid.uuid4())),
            span_id=str(uuid.uuid4()),
            parent_span_id=headers.get("span_id")
        )

class TracedAgent:
    def __init__(self, agent, exporter):
        self.agent = agent
        self.exporter = exporter
    
    def invoke(self, input_data: dict, context: TraceContext) -> dict:
        with self.exporter.start_span(context) as span:
            try:
                result = self.agent.invoke(input_data)
                span.set_attribute("status", "ok")
                return result
            except Exception as e:
                span.set_attribute("status", "error")
                span.set_attribute("error.message", str(e))
                raise
            finally:
                span.end()

Span Annotation

Traces become valuable when annotated with semantic information. Agents should annotate spans with tool invocations, token counts, and intermediate reasoning steps.

Trace Sampling

High-volume systems require intelligent sampling. The tracing infrastructure captures full traces for error cases while sampling baseline operations to control storage costs.

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.