What this does

This guide implements distributed tracing across multiple AI agent services that collaborate on a single workflow. When one agent delegates a subtask to another, the trace context (trace ID, span ID, and trace flags) propagates via HTTP headers or message queue metadata. The result is a single end-to-end trace showing every agent's contribution, including inter-agent latency and error attribution.
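To make the propagated fields concrete, here is a minimal sketch that parses a W3C `traceparent` header value into its trace ID, span ID, and sampled flag. It assumes the version 00 layout (`version-traceid-parentid-flags`); `TraceParent` and `parse_traceparent` are hypothetical names used only for illustration, not part of the OpenTelemetry API.

```python
import re
from typing import NamedTuple, Optional

# Version 00 layout: 2 hex chars, 32 hex chars, 16 hex chars, 2 hex chars.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Return the parsed fields, or None if the value is malformed."""
    match = _TRACEPARENT.match(value.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version "ff" and all-zero IDs are invalid per the W3C specification.
    if version == "ff" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceParent(version, trace_id, span_id, bool(int(flags, 16) & 0x01))
```

Every span in the workflow shares the same `trace_id`; the `span_id` in the header identifies the caller's span, which becomes the parent of the receiving agent's span.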

Steps

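Install the OpenTelemetry packages in every agent service. The W3C Trace Context propagator used throughout this guide ships with opentelemetry-api and works for both HTTP headers and message-queue metadata:
```
pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp
```
Install opentelemetry-propagator-b3 only if a service must interoperate with systems that emit B3 headers.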

Configure the global propagator in every agent service's startup:

from opentelemetry.propagate import set_global_textmap
from opentelemetry.propagators.composite import CompositePropagator
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator

# CompositePropagator replaces the deprecated CompositeHTTPPropagator alias.
set_global_textmap(CompositePropagator([TraceContextTextMapPropagator()]))

In the orchestrator agent, create a parent span for the workflow and inject context into outbound HTTP calls:

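import requests
from opentelemetry import propagate, trace

tracer = trace.get_tracer(__name__)
# tasks: the list of subtask payloads the orchestrator has already planned
with tracer.start_as_current_span("multi-agent-workflow") as workflow_span:
    workflow_span.set_attribute("workflow.subtask_count", len(tasks))
    for subtask in tasks:
        headers = {}
        propagate.inject(headers)  # writes traceparent for the current span
        response = requests.post("http://worker-agent:8000/execute",
                                 json={"task": subtask}, headers=headers)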

In the worker agent, extract the propagated context and create child spans:

from fastapi import FastAPI, Request
from opentelemetry import propagate, trace

app = FastAPI()
tracer = trace.get_tracer(__name__)

@app.post("/execute")
async def execute(request: Request):
    ctx = propagate.extract(dict(request.headers))
    with tracer.start_as_current_span("worker-execute", context=ctx) as span:
        body = await request.json()  # Request.json() is a coroutine and must be awaited
        result = await process_task(body["task"])
        span.set_attribute("worker.result_size", len(str(result)))
        return {"result": result}

For message-queue propagation, inject context into message metadata fields. With Redis Pub/Sub:

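import json
import redis
from opentelemetry import propagate

client = redis.Redis()  # connection settings depend on your deployment
carrier = {}
propagate.inject(carrier)
client.publish("agent-tasks", json.dumps({
    "task": task_data,
    "trace_context": carrier
}))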

On the consumer side, extract from the message carrier and restore the parent-child span relationship.
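A minimal sketch of that consumer step, assuming the message envelope published above (a JSON object with `task` and `trace_context` keys). The names `decode_envelope` and `handle_message`, the `process` callback, and the span name `worker-consume` are hypothetical; only `propagate.extract` and `start_as_current_span` come from OpenTelemetry.

```python
import json


def decode_envelope(raw):
    """Split a published message into (task, carrier)."""
    message = json.loads(raw)
    # A producer without tracing sends no carrier; an empty dict makes
    # extract() find no parent, so the consumer span starts a new trace.
    return message["task"], message.get("trace_context") or {}


def handle_message(raw, process):
    """Run process(task) inside a span parented to the producer's span."""
    # Imported lazily so decode_envelope can be used without OpenTelemetry.
    from opentelemetry import propagate, trace

    task, carrier = decode_envelope(raw)
    ctx = propagate.extract(carrier)
    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span("worker-consume", context=ctx):
        return process(task)
```

With Redis Pub/Sub, `raw` would be the `data` field of each message received on the `agent-tasks` channel. Passing `context=ctx` is what restores the parent-child link; without it the consumer span becomes the root of a separate trace.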
Deploy all services with identical OTLP exporter configuration:
```
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
exporter = OTLPSpanExporter(endpoint="http://jaeger-collector:4317", insecure=True)
```
Expected: spans from both orchestrator and worker appear under one trace ID in the tracing UI.
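The exporter only ships spans once it is registered on a tracer provider, and the service name that the tracing UI filters on comes from the provider's resource. A minimal sketch of that startup wiring, assuming the gRPC OTLP exporter and the endpoint shown above; `configure_tracing` is a hypothetical helper name, and the batching processor is one reasonable choice rather than a requirement.

```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor


def configure_tracing(service_name):
    """Register a tracer provider that exports spans over OTLP/gRPC."""
    # service.name is what the tracing backend shows as the service.
    resource = Resource.create({"service.name": service_name})
    provider = TracerProvider(resource=resource)
    exporter = OTLPSpanExporter(endpoint="http://jaeger-collector:4317", insecure=True)
    # BatchSpanProcessor buffers finished spans and exports them in the background.
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)
    return provider
```

Each service would call this once at startup, for example `configure_tracing("agent-orchestrator")` in the orchestrator, before creating any spans.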

Verification

curl -s "http://jaeger:16686/api/traces?service=agent-orchestrator&limit=1" | jq '.data[0].spans | length'

Expected output: an integer >= 2, confirming both orchestrator and worker spans exist in a single trace.
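A span count alone cannot tell whether the worker contributed, since the orchestrator may emit several spans itself. The sketch below lists the services present in the first returned trace. It assumes the query response has the shape `data[0].spans[].processID` plus a `data[0].processes` map with `serviceName` entries; `services_in_trace` is a hypothetical helper, and the response shape should be confirmed against your Jaeger version.

```python
def services_in_trace(response):
    """Return the set of service names with spans in the first trace."""
    traces = response.get("data") or []
    if not traces:
        return set()
    first = traces[0]
    processes = first.get("processes", {})
    # Each span points at a process entry, which carries the service name.
    return {
        processes[span["processID"]]["serviceName"]
        for span in first.get("spans", [])
        if span.get("processID") in processes
    }
```

Feeding it the parsed JSON from the curl command above should yield a set containing both the orchestrator's and the worker's service names when propagation works.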

Common failures

Spans appear as separate traces: the propagated context is not being extracted. Verify the traceparent header is present on the worker's incoming request: print(dict(request.headers).get("traceparent")).
Incomplete traces (missing worker spans): the worker's span exporter is not configured, or the OTLP endpoint is unreachable from the worker container. Check worker logs for exporter errors.
Mismatched propagation formats: the orchestrator injects W3C TraceContext but the consumer expects B3. Standardize on W3C across all services by setting OTEL_PROPAGATORS=tracecontext, and make sure no service overrides it by passing a different propagator to set_global_textmap.
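To tell the first and third failures apart, it can help to log which propagation format actually arrives at the worker. A minimal sketch, assuming the standard header names (`traceparent` for W3C, `b3` or `X-B3-TraceId` for B3); `detect_propagation_format` is a hypothetical helper.

```python
def detect_propagation_format(headers):
    """Classify incoming headers as 'w3c', 'b3', 'both', or 'none'."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    names = {name.lower() for name in headers}
    has_w3c = "traceparent" in names
    has_b3 = "b3" in names or "x-b3-traceid" in names
    if has_w3c and has_b3:
        return "both"
    if has_w3c:
        return "w3c"
    if has_b3:
        return "b3"
    return "none"
```

A result of `"none"` points at the caller not injecting context, while `"b3"` on a worker configured for W3C points at a format mismatch.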

How to implement distributed tracing for multi-agent workflows with trace context propagation
