Conversation

@vasantteja (Contributor) commented Dec 16, 2025

Description

Implements OpenTelemetry instrumentation for Anthropic's Messages.create API (sync, non-streaming). This adds automatic tracing for Anthropic SDK calls, capturing GenAI semantic convention attributes for observability.

P.S.: LLM assistance was used in writing the code.
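
For context, enabling the instrumentation in an application would look roughly like the sketch below; the instrumentor class name and import path follow the usual contrib naming and are assumptions here rather than code copied from this PR.

# Illustrative usage sketch; the class name and import path are assumed.
import anthropic

from opentelemetry.instrumentation.anthropic import AnthropicInstrumentor

AnthropicInstrumentor().instrument()

client = anthropic.Anthropic()
# This call is now traced: a CLIENT span named "chat <model>" is created
# and populated with the attributes listed below.
response = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=100,
    messages=[{"role": "user", "content": "Hello"}],
)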

What's Implemented

  • patch.py: Wrapper function for Messages.create that creates spans with request/response attributes
  • utils.py: Helper functions for attribute extraction, error handling, and content capture configuration
  • __init__.py: Wiring for patching/unpatching via wrapt (a sketch of this wiring follows the list)
  • Tests: Comprehensive test suite with VCR cassettes for API mocking
  • Updated the minimum anthropic version to 0.16.0; the instrumentation requires the modern SDK structure (the anthropic.resources.messages module), which does not exist in versions prior to 0.18.0.
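
A minimal sketch of what the wrapt-based wiring can look like, assuming the conventional BaseInstrumentor layout; the wrapper factory is collapsed to a stub here and names may differ from the actual PR code.

# Sketch only: the real wrapper in patch.py records the full attribute set.
from typing import Collection

from wrapt import wrap_function_wrapper

from opentelemetry.instrumentation.instrumentor import BaseInstrumentor
from opentelemetry.instrumentation.utils import unwrap
from opentelemetry.trace import SpanKind, get_tracer


def messages_create_wrapper(tracer):
    # wrapt passes (wrapped, instance, args, kwargs) to the wrapper.
    def wrapper(wrapped, instance, args, kwargs):
        with tracer.start_as_current_span("chat", kind=SpanKind.CLIENT):
            return wrapped(*args, **kwargs)

    return wrapper


class AnthropicInstrumentor(BaseInstrumentor):
    def instrumentation_dependencies(self) -> Collection[str]:
        return ("anthropic >= 0.16.0",)

    def _instrument(self, **kwargs):
        tracer = get_tracer(__name__, tracer_provider=kwargs.get("tracer_provider"))
        # Patch the sync, non-streaming entry point.
        wrap_function_wrapper(
            "anthropic.resources.messages",
            "Messages.create",
            messages_create_wrapper(tracer),
        )

    def _uninstrument(self, **kwargs):
        import anthropic.resources.messages

        unwrap(anthropic.resources.messages.Messages, "create")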

Semantic Convention Attributes Captured

Request:

  • gen_ai.operation.name (chat)
  • gen_ai.system (anthropic)
  • gen_ai.request.model
  • gen_ai.request.max_tokens
  • gen_ai.request.temperature
  • gen_ai.request.top_p
  • gen_ai.request.top_k
  • gen_ai.request.stop_sequences
  • server.address

Response (see the sketch after these lists):

  • gen_ai.response.id
  • gen_ai.response.model
  • gen_ai.response.finish_reasons
  • gen_ai.usage.input_tokens
  • gen_ai.usage.output_tokens

Error:

  • error.type (on exceptions)
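
A rough sketch of how the response and error attributes above map onto the Anthropic Message object and raised exceptions; the helper names match the ones described for utils.py, but the bodies are illustrative rather than the PR's actual code.

# Illustrative only; attribute constants come from the incubating
# opentelemetry-semantic-conventions package.
from opentelemetry.semconv._incubating.attributes import (
    gen_ai_attributes as GenAIAttributes,
)
from opentelemetry.semconv.attributes import error_attributes as ErrorAttributes
from opentelemetry.trace import StatusCode


def _set_response_attributes(span, result):
    span.set_attribute(GenAIAttributes.GEN_AI_RESPONSE_ID, result.id)
    span.set_attribute(GenAIAttributes.GEN_AI_RESPONSE_MODEL, result.model)
    if result.stop_reason:
        # Anthropic returns a single stop_reason; the semconv field is an array.
        span.set_attribute(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS, [result.stop_reason]
        )
    if result.usage:
        span.set_attribute(
            GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS, result.usage.input_tokens
        )
        span.set_attribute(
            GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS, result.usage.output_tokens
        )


def handle_span_exception(span, error):
    span.set_status(StatusCode.ERROR, str(error))
    span.set_attribute(ErrorAttributes.ERROR_TYPE, type(error).__qualname__)
    span.end()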

Fixes #3949

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Ran the full test suite with VCR cassettes for mocked API responses:

pytest instrumentation-genai/opentelemetry-instrumentation-anthropic/tests/ -v

Test cases:

  • Basic sync message creation with correct span attributes
  • All optional parameters (temperature, top_p, top_k, stop_sequences)
  • Token usage capture (input/output tokens)
  • Stop reason captured as finish_reasons array
  • Connection error handling (APIConnectionError)
  • API error handling (404 NotFoundError)
  • Uninstrumentation removes patching
  • Multiple instrument/uninstrument cycles

Results: 15 tests passed (8 new + 7 existing instrumentor tests)
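
For illustration, the kind of assertion the new tests make looks roughly like this; the anthropic_client and span_exporter fixtures and the VCR marker are assumptions about the test harness, not the PR's code.

# Sketch of a span-attribute assertion; anthropic_client and span_exporter
# are assumed fixtures backed by a VCR cassette and an in-memory exporter.
import pytest

from opentelemetry.semconv._incubating.attributes import (
    gen_ai_attributes as GenAIAttributes,
)


@pytest.mark.vcr()
def test_messages_create_records_span(anthropic_client, span_exporter):
    anthropic_client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=100,
        messages=[{"role": "user", "content": "Say hello"}],
    )

    (span,) = span_exporter.get_finished_spans()
    assert span.name == "chat claude-3-haiku-20240307"
    assert (
        span.attributes[GenAIAttributes.GEN_AI_REQUEST_MODEL]
        == "claude-3-haiku-20240307"
    )
    assert GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS in span.attributes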

Does This PR Require a Core Repo Change?

  • No.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@vasantteja removed their assignment Dec 16, 2025
@vasantteja closed this Dec 16, 2025
@vasantteja reopened this Dec 17, 2025
@vasantteja removed their assignment Dec 17, 2025
@aabmass (Member) left a comment

A few suggestions, but looks pretty good, thanks

Comment on lines +85 to +92
GenAIAttributes.GEN_AI_REQUEST_MODEL: kwargs.get("model"),
GenAIAttributes.GEN_AI_REQUEST_MAX_TOKENS: kwargs.get("max_tokens"),
GenAIAttributes.GEN_AI_REQUEST_TEMPERATURE: kwargs.get("temperature"),
GenAIAttributes.GEN_AI_REQUEST_TOP_P: kwargs.get("top_p"),
GenAIAttributes.GEN_AI_REQUEST_TOP_K: kwargs.get("top_k"),
GenAIAttributes.GEN_AI_REQUEST_STOP_SEQUENCES: kwargs.get(
    "stop_sequences"
),

Member

For extracting parameters from the untyped kwargs dict, can you define a function with a copied method signature from the Anthropic code being instrumented (and add a link to that code in a comment) and call it with *args, **kwargs? IMO it helps the reader and lets the type checker know the expected types.

Example from Vertex:

return GenerateContentParams(
    model=request.model,
    contents=request.contents,
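
For Anthropic, that suggestion could look something like the sketch below; the parameter list is a trimmed copy of Messages.create's keyword arguments and the helper name is hypothetical.

# Hypothetical helper mirroring part of Messages.create's signature so a
# type checker sees the expected kwarg types; see the Anthropic SDK's
# anthropic/resources/messages module for the full signature.
from typing import Any, Iterable, List, Union

from anthropic import NOT_GIVEN, NotGiven  # assumed top-level exports


def _extract_request_params(
    *,
    max_tokens: int,
    messages: Iterable[Any],
    model: str,
    stop_sequences: Union[List[str], NotGiven] = NOT_GIVEN,
    temperature: Union[float, NotGiven] = NOT_GIVEN,
    top_k: Union[int, NotGiven] = NOT_GIVEN,
    top_p: Union[float, NotGiven] = NOT_GIVEN,
    **_ignored: Any,
) -> dict:
    # Return only the parameters the instrumentation records.
    return {
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "stop_sequences": stop_sequences,
    }

get_llm_request_attributes would then call _extract_request_params(*args, **kwargs) instead of indexing the untyped kwargs dict directly.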


def get_llm_request_attributes(
    kwargs: dict[str, Any], client_instance: Any
) -> dict[str, Any]:

Member

Can you annotate the return type like this?

Comment on lines +43 to +63
span_attributes = {**get_llm_request_attributes(kwargs, instance)}

span_name = f"{span_attributes[GenAIAttributes.GEN_AI_OPERATION_NAME]} {span_attributes[GenAIAttributes.GEN_AI_REQUEST_MODEL]}"
with tracer.start_as_current_span(
    name=span_name,
    kind=SpanKind.CLIENT,
    attributes=span_attributes,
    end_on_exit=False,
) as span:
    try:
        result = wrapped(*args, **kwargs)

        if span.is_recording():
            _set_response_attributes(span, result)

        span.end()
        return result

    except Exception as error:
        handle_span_exception(span, error)
        raise

Member

@keith-decker can TelemetryHandler cover all of the common code in this PR? This seems like a great PR to start integrating GenAI util with
