
RFC: Native OpenTelemetry representation for Agent Trace #6

@gsiener

Summary

Agent Trace should be built on OpenTelemetry. OTel is the industry standard for observability telemetry, and code attribution is observability data. Defining Agent Trace using OTel's native primitives—spans, events, and attributes—means direct export to any OTel-compatible backend with no translation layers required.

Why OTel?

OpenTelemetry is where the industry has converged. Cloudflare (a contributor to this spec) supports OTel-native tracing. Honeycomb, Datadog, LangChain, and others all speak OTel natively. The GenAI semantic conventions are stabilizing, and there's active work on agentic systems conventions.

Building on OTel gives us:

  • Zero integration cost: Any OTel collector, SDK, or backend works out of the box
  • No translation layers: Data flows directly from agent → collector → backend
  • Correlation for free: Code attribution spans naturally join with LLM invocation spans, tool calls, and other GenAI telemetry
  • Mature tooling: Query, visualize, and alert using battle-tested observability infrastructure

Proposal

Agent Trace as OTel Spans + Attributes

A trace record becomes a set of OTel spans with well-defined semantic attributes:

Trace: code-attribution session
└── Span: file_attribution (src/utils/parser.ts)
    ├── Attributes:
    │   ├── agent_trace.version: "0.1"
    │   ├── agent_trace.file.path: "src/utils/parser.ts"
    │   ├── agent_trace.vcs.type: "git"
    │   ├── agent_trace.vcs.revision: "a1b2c3d..."
    │   ├── gen_ai.request.model: "anthropic/claude-opus-4-5-20251101"
    │   ├── gen_ai.provider.name: "anthropic"
    │   ├── gen_ai.conversation.id: "conv-12345"
    │   └── gen_ai.agent.name: "cursor"
    │
    └── Events:
        ├── agent_trace.range {start_line: 42, end_line: 67, contributor_type: "ai", content_hash: "murmur3:9f2e8a1b"}
        ├── agent_trace.range {start_line: 80, end_line: 95, contributor_type: "ai"}
        └── agent_trace.related {type: "session", url: "https://..."}
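The tree above can be sketched as a plain Python dict in the shape of an exported span. This is a minimal sketch, not SDK code: the attribute and event names follow the proposal, but the OTLP wire encoding (typed `stringValue`/`intValue` wrappers, resource/scope nesting) is simplified to plain values.

```python
# A file_attribution span as a simplified, OTLP-like dict.
# Attribute and event names are taken from the proposal above;
# the flat key/value encoding here is illustrative only.
span = {
    "name": "file_attribution",
    "attributes": {
        "agent_trace.version": "0.1",
        "agent_trace.file.path": "src/utils/parser.ts",
        "agent_trace.vcs.type": "git",
        "agent_trace.vcs.revision": "a1b2c3d...",
        "gen_ai.request.model": "anthropic/claude-opus-4-5-20251101",
        "gen_ai.provider.name": "anthropic",
        "gen_ai.conversation.id": "conv-12345",
        "gen_ai.agent.name": "cursor",
    },
    "events": [
        {"name": "agent_trace.range",
         "attributes": {"start_line": 42, "end_line": 67,
                        "contributor_type": "ai",
                        "content_hash": "murmur3:9f2e8a1b"}},
        {"name": "agent_trace.range",
         "attributes": {"start_line": 80, "end_line": 95,
                        "contributor_type": "ai"}},
        {"name": "agent_trace.related",
         "attributes": {"type": "session", "url": "https://..."}},
    ],
}
```

Each attributed line range is a span event rather than a separate span, so a single file's attribution stays one span with N range events.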

Semantic Attributes

Extend the gen_ai.* namespace or define agent_trace.* for code-attribution-specific concerns:

Attribute                        Type    Description
agent_trace.version              string  Spec version
agent_trace.file.path            string  Relative path from repo root
agent_trace.vcs.type             enum    git, jj, hg, svn
agent_trace.vcs.revision         string  Commit SHA / change ID
agent_trace.range.start_line     int     1-indexed start line
agent_trace.range.end_line       int     1-indexed end line
agent_trace.range.content_hash   string  Position-independent tracking
agent_trace.contributor.type     enum    human, ai, mixed, unknown

Reuse existing GenAI conventions where they apply:

  • gen_ai.request.model for model identification
  • gen_ai.provider.name for the AI provider
  • gen_ai.conversation.id for conversation tracking
  • gen_ai.agent.name / gen_ai.agent.id for tool identification
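A hypothetical validator makes the table's constraints concrete. The enum values come from the table above; the function name and error-reporting style are illustrative, not part of the proposal.

```python
# Hypothetical validation of agent_trace.* attributes against the
# constraints in the table above. Function name and error strings
# are illustrative.
VCS_TYPES = {"git", "jj", "hg", "svn"}
CONTRIBUTOR_TYPES = {"human", "ai", "mixed", "unknown"}

def validate_attrs(attrs: dict) -> list[str]:
    """Return a list of problems; an empty list means the attributes look valid."""
    errors = []
    vcs = attrs.get("agent_trace.vcs.type")
    if vcs is not None and vcs not in VCS_TYPES:
        errors.append("agent_trace.vcs.type must be one of: "
                      + ", ".join(sorted(VCS_TYPES)))
    ct = attrs.get("agent_trace.contributor.type")
    if ct is not None and ct not in CONTRIBUTOR_TYPES:
        errors.append("agent_trace.contributor.type must be one of: "
                      + ", ".join(sorted(CONTRIBUTOR_TYPES)))
    # Line ranges are 1-indexed integers per the table.
    for key in ("agent_trace.range.start_line", "agent_trace.range.end_line"):
        v = attrs.get(key)
        if v is not None and (not isinstance(v, int) or v < 1):
            errors.append(f"{key} must be a 1-indexed integer")
    return errors
```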

JSON as a Serialization Format

The current JSON schema can remain as a human-readable serialization (useful for .agent-trace files in repos), but the canonical representation is OTel. The JSON format is just one way to serialize OTel data for local storage.
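One way this could work in practice: project the canonical span back into a human-readable record for a `.agent-trace` file. The record shape below is a sketch, not the spec's current schema.

```python
import json

# Example span in the simplified dict shape used throughout this RFC.
span = {
    "name": "file_attribution",
    "attributes": {
        "agent_trace.version": "0.1",
        "agent_trace.file.path": "src/utils/parser.ts",
    },
    "events": [
        {"name": "agent_trace.range",
         "attributes": {"start_line": 42, "end_line": 67,
                        "contributor_type": "ai"}},
    ],
}

def span_to_agent_trace(span: dict) -> str:
    """Serialize a file_attribution span into an illustrative JSON record."""
    record = {
        "version": span["attributes"]["agent_trace.version"],
        "file": span["attributes"]["agent_trace.file.path"],
        "ranges": [e["attributes"] for e in span["events"]
                   if e["name"] == "agent_trace.range"],
    }
    return json.dumps(record, indent=2)

print(span_to_agent_trace(span))
```

The key property is that this is a lossless projection of the span's attribution data, so the JSON file and the OTel export never disagree.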

What This Enables

Direct pipeline integration:

Agent (Cursor / Claude Code / etc.)
    ↓ OTLP
OTel Collector
    ↓ OTLP
Honeycomb / Cloudflare / LangChain / etc.

Unified queries: "Show me all code attributed to Claude Opus 4.5 this week where token cost exceeded $1"—one query joining attribution and LLM telemetry.

Full traceability: Click on an expensive LLM call → see exactly which lines of code it produced. Click on attributed code → see the full trace of the AI interaction that generated it.
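The join behind that unified query can be sketched over exported span dicts. In practice this would be a backend query (Honeycomb, Datadog, etc.), but the correlation logic is the same: join on gen_ai.conversation.id. The span name `llm_invocation` and the `cost_usd` attribute are hypothetical stand-ins for whatever the backend records.

```python
# Sketch of the unified query: attribution spans joined with costly
# LLM-invocation spans via gen_ai.conversation.id. "llm_invocation"
# and "cost_usd" are hypothetical names, not part of the proposal.
def expensive_attributions(spans: list[dict], cost_threshold: float = 1.0) -> list[dict]:
    costly = {
        s["attributes"]["gen_ai.conversation.id"]
        for s in spans
        if s["name"] == "llm_invocation"
        and s["attributes"].get("cost_usd", 0.0) > cost_threshold
    }
    return [
        s for s in spans
        if s["name"] == "file_attribution"
        and s["attributes"].get("gen_ai.conversation.id") in costly
    ]

spans = [
    {"name": "llm_invocation",
     "attributes": {"gen_ai.conversation.id": "conv-12345", "cost_usd": 1.80}},
    {"name": "file_attribution",
     "attributes": {"gen_ai.conversation.id": "conv-12345",
                    "agent_trace.file.path": "src/utils/parser.ts"}},
    {"name": "file_attribution",
     "attributes": {"gen_ai.conversation.id": "conv-99999"}},
]
```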

Discussion

OTel feels like the right foundation for Agent Trace. We propose moving in this direction and want to hear if there are considerations we've missed:

  1. What would break? Are there use cases where OTel's span/event model falls short for code attribution?
  2. What are we missing? Are there code attribution concerns that don't map cleanly to OTel primitives?
  3. Who's already doing this? If you're instrumenting AI coding tools with OTel today, what's working and what isn't?
