@geffzhang geffzhang commented Oct 17, 2025

User description

Introduced OpenTelemetry-based model diagnostics with Langfuse integration, including new helper classes and activity tracing for agent and function execution. Added BotSharp.Plugin.GiteeAI with chat and embedding providers, and updated solution/project files to register the new plugin. Enhanced tracing in routing, executor, and controller logic for improved observability.


PR Type

Enhancement, Documentation


Description

  • Implemented OpenTelemetry-based model diagnostics with Langfuse integration for improved observability

  • Added GiteeAI plugin with chat completion and text embedding providers

  • Enhanced tracing in routing, executor, and controller logic for agent and function execution

  • Integrated activity tracking with semantic conventions for GenAI operations


Diagram Walkthrough

flowchart LR
  A["OpenTelemetry Setup"] --> B["ModelDiagnostics Helper"]
  B --> C["Activity Tracing"]
  C --> D["Chat Providers"]
  C --> E["Function Executors"]
  C --> F["Routing Service"]
  G["GiteeAI Plugin"] --> D
  H["Langfuse Config"] --> A

File Walkthrough

Relevant files
Configuration changes
6 files
Extensions.cs
Configure OpenTelemetry with Langfuse exporter                     
+49/-3   
BotSharp.Plugin.GiteeAI.csproj
Create GiteeAI plugin project file                                             
+31/-0   
BotSharp.sln
Register GiteeAI plugin in solution                                           
+11/-0   
WebStarter.csproj
Add GiteeAI plugin project reference                                         
+1/-0     
appsettings.json
Add Langfuse and GiteeAI model configurations                       
+85/-40 
Program.cs
Comment out MCP service configuration                                       
+2/-2     
Enhancement
17 files
LangfuseSettings.cs
Add Langfuse configuration settings model                               
+19/-0   
ModelDiagnostics.cs
Implement model diagnostics with semantic conventions       
+394/-0 
ActivityExtensions.cs
Add activity extension methods for tracing                             
+119/-0 
AppContextSwitchHelper.cs
Helper to read app context switch values                                 
+35/-0   
FunctionCallbackExecutor.cs
Add activity tracing to function execution                             
+16/-2   
MCPToolExecutor.cs
Add activity tracing to MCP tool execution                             
+34/-17 
RoutingService.InvokeAgent.cs
Add agent invocation activity tracing                                       
+4/-1     
RoutingService.InvokeFunction.cs
Import diagnostics for function invocation                             
+1/-0     
RoutingService.cs
Add System.Diagnostics import for tracing                               
+1/-0     
ConversationController.cs
Add activity tracing to conversation endpoints                     
+35/-19 
ChatCompletionProvider.cs
Integrate model diagnostics into chat completion                 
+78/-66 
ChatCompletionProvider.cs
Integrate model diagnostics into chat completion                 
+65/-52 
GiteeAiPlugin.cs
Create GiteeAI plugin with DI registration                             
+19/-0   
ChatCompletionProvider.cs
Implement GiteeAI chat completion provider                             
+496/-0 
TextEmbeddingProvider.cs
Implement GiteeAI text embedding provider                               
+73/-0   
ProviderHelper.cs
Add helper to create GiteeAI client instances                       
+16/-0   
Using.cs
Add global using statements for GiteeAI plugin                     
+15/-0   
Documentation
1 files
README.md
Add GiteeAI plugin documentation                                                 
+8/-0     
Formatting
1 files
Program.cs
Reorder using statements for clarity                                         
+2/-3     

geffzhang and others added 4 commits October 9, 2025 06:42
System.Diagnostics combined with Microsoft.Extensions.Telemetry integrated into Langfuse via OpenTelemetry (OTEL)
Removed the BotSharp.Langfuse project and related files, migrating LangfuseSettings to BotSharp.ServiceDefaults. Added the BotSharp.Plugin.GiteeAI plugin with chat and embedding providers. Enhanced OpenTelemetry integration with Langfuse support and improved diagnostics tagging in core executors and controllers. Updated solution and project files to reflect these changes.
Replaces previous GPT-4.1 model entries with updated GPT-3.5-turbo and gpt-35-turbo-instruct configurations, including new endpoints, versioning, and cost structure. Sensitive API keys have been removed from the configuration.
Introduced OpenTelemetry-based model diagnostics with Langfuse integration, including new helper classes and activity tracing for agent and function execution. Added BotSharp.Plugin.GiteeAI with chat and embedding providers, and updated solution/project files to register the new plugin. Enhanced tracing in routing, executor, and controller logic for improved observability.
qodo-code-review bot commented Oct 17, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Sensitive data in traces

Description: Diagnostic switches enable sensitive events which may capture and export user
prompts/messages and agent inputs/outputs in traces, risking PII exposure if telemetry
backend is not properly secured.
Extensions.cs [52-55]

Referred Code
// Enable model diagnostics with sensitive data.
AppContext.SetSwitch("BotSharp.Experimental.GenAI.EnableOTelDiagnostics", true);
AppContext.SetSwitch("BotSharp.Experimental.GenAI.EnableOTelDiagnosticsSensitive", true);
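One way to limit the exposure described above is to gate content-bearing tags on the sensitive switch and leave that switch off unless it is set explicitly. The sketch below is illustrative only: the switch name comes from the referred code, while `SensitiveTagging`, `TagPrompt`, and the `demo.*` tag names are hypothetical and not part of this PR.

```csharp
#nullable enable
using System;
using System.Diagnostics;

public static class SensitiveTagging
{
    // Switch name taken from the referred code in Extensions.cs.
    public const string SensitiveSwitch =
        "BotSharp.Experimental.GenAI.EnableOTelDiagnosticsSensitive";

    // AppContext.TryGetSwitch returns false when the switch was never set,
    // so the effective default here is "off".
    public static bool IsSensitiveEnabled() =>
        AppContext.TryGetSwitch(SensitiveSwitch, out var enabled) && enabled;

    // Hypothetical helper: always records the prompt length, but records the
    // prompt text only when the sensitive switch has been turned on.
    public static void TagPrompt(Activity? activity, string prompt)
    {
        if (activity is null) return;

        activity.SetTag("demo.prompt.length", prompt.Length);
        if (IsSensitiveEnabled())
        {
            activity.SetTag("demo.prompt.text", prompt);
        }
    }
}
```

With this shape, a deployment would opt in to sensitive capture through configuration rather than having it enabled unconditionally at startup.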
Insecure auth transport

Description: Constructs Basic Auth header for OTLP exporter to Langfuse and sends it over HTTP
Protobuf; if the host is misconfigured to non-TLS, credentials could be exposed.
Extensions.cs [139-151]

Referred Code
var publicKey = langfuseSection.GetValue<string>(nameof(LangfuseSettings.PublicKey)) ?? string.Empty;
var secretKey = langfuseSection.GetValue<string>(nameof(LangfuseSettings.SecretKey)) ?? string.Empty;
var host = langfuseSection.GetValue<string>(nameof(LangfuseSettings.Host)) ?? string.Empty;
var plainTextBytes = System.Text.Encoding.UTF8.GetBytes($"{publicKey}:{secretKey}");
string base64EncodedAuth = Convert.ToBase64String(plainTextBytes);

builder.Services.ConfigureOpenTelemetryTracerProvider(tracing => tracing.AddOtlpExporter(options =>
{
    options.Endpoint = new Uri(host);
    options.Protocol = OtlpExportProtocol.HttpProtobuf;
    options.Headers = $"Authorization=Basic {base64EncodedAuth}";
})
);
Ticket Compliance
🎫 No ticket provided
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
No custom compliance provided

Follow the guide to enable custom compliance check.

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

qodo-code-review bot commented Oct 17, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
High-level
Refactor providers to reduce duplication

The ChatCompletionProvider for the new GiteeAI plugin duplicates code from the
OpenAI and AzureOpenAI providers. To improve maintainability, refactor this by
creating a shared base class for providers with OpenAI-compatible APIs to house
the common logic.

Examples:

src/Plugins/BotSharp.Plugin.GiteeAI/Providers/Chat/ChatCompletionProvider.cs [15-100]
public class ChatCompletionProvider(
    ILogger<ChatCompletionProvider> logger,
    IServiceProvider services) : IChatCompletion
{
    protected string _model = string.Empty;

    public virtual string Provider => "gitee-ai";

    public string Model => _model;


 ... (clipped 76 lines)
src/Plugins/BotSharp.Plugin.OpenAI/Providers/Chat/ChatCompletionProvider.cs [35-119]
    public async Task<RoleDialogModel> GetChatCompletions(Agent agent, List<RoleDialogModel> conversations)
    {
        var contentHooks = _services.GetHooks<IContentGeneratingHook>(agent.Id);
        var convService = _services.GetService<IConversationStateService>();

        // Before chat completion hook
        foreach (var hook in contentHooks)
        {
            await hook.BeforeGenerating(agent, conversations);
        }

 ... (clipped 75 lines)

Solution Walkthrough:

Before:

// In GiteeAI/ChatCompletionProvider.cs
public class ChatCompletionProvider : IChatCompletion
{
    public async Task<RoleDialogModel> GetChatCompletions(...)
    {
        // ... setup ...
        var (prompt, messages, options) = PrepareOptions(agent, conversations);
        using (var activity = ModelDiagnostics.StartCompletionActivity(...))
        {
            // ... call API, process response, set tags ...
        }
    }
    protected (string, ...) PrepareOptions(...) { /* ... complex logic ... */ }
}

// In OpenAI/ChatCompletionProvider.cs
public class ChatCompletionProvider : IChatCompletion
{
    // ... Nearly identical implementation to GiteeAI ...
}

After:

// New Base Class
public abstract class OpenAiCompatibleChatCompletionProvider : IChatCompletion
{
    public abstract string Provider { get; }

    public async Task<RoleDialogModel> GetChatCompletions(Agent agent, List<RoleDialogModel> conversations)
    {
        // ... common setup ...
        var (prompt, messages, options) = PrepareOptions(agent, conversations);
        using (var activity = ModelDiagnostics.StartCompletionActivity(..., Provider, ...))
        {
            // ... common logic for API call, response processing, tags ...
        }
    }

    protected virtual (string, ...) PrepareOptions(...) { /* ... common complex logic ... */ }
}

// Refactored GiteeAI provider
public class GiteeAiChatCompletionProvider : OpenAiCompatibleChatCompletionProvider
{
    public override string Provider => "gitee-ai";
    // ... minimal overrides if needed ...
}
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies significant code duplication across the new GiteeAI provider and the existing OpenAI and AzureOpenAI providers, which this PR exacerbates, and proposes a valid refactoring that would greatly improve maintainability.

High
Possible issue
Fix incorrect null check

Replace the incorrect null check on langfuseSection with
langfuseSection.Exists() to correctly determine if the configuration section is
present.

src/BotSharp.ServiceDefaults/Extensions.cs [129-130]

 var langfuseSection = builder.Configuration.GetSection("Langfuse");
-var useLangfuse = langfuseSection != null;
+var useLangfuse = langfuseSection.Exists();
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies that GetSection never returns null, fixing a bug where useLangfuse would always be true, which would lead to a UriFormatException if the configuration section is missing.

Medium
Validate Langfuse configuration values

Add validation to ensure publicKey, secretKey, and host from the Langfuse
configuration are not empty before attempting to create and use them for
authentication.

src/BotSharp.ServiceDefaults/Extensions.cs [139-143]

 var publicKey = langfuseSection.GetValue<string>(nameof(LangfuseSettings.PublicKey)) ?? string.Empty;
 var secretKey = langfuseSection.GetValue<string>(nameof(LangfuseSettings.SecretKey)) ?? string.Empty;
 var host = langfuseSection.GetValue<string>(nameof(LangfuseSettings.Host)) ?? string.Empty;
+
+if (string.IsNullOrEmpty(publicKey) || string.IsNullOrEmpty(secretKey) || string.IsNullOrEmpty(host))
+{
+    return builder;
+}
+
 var plainTextBytes = System.Text.Encoding.UTF8.GetBytes($"{publicKey}:{secretKey}");
 string base64EncodedAuth = Convert.ToBase64String(plainTextBytes);
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly points out that missing configuration values will lead to a UriFormatException and adds necessary validation to prevent this runtime error, improving the code's robustness.

Medium
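The validation suggestion and the transport concern raised in the compliance guide could be handled in one place. The following is a minimal sketch, assuming a hypothetical helper `LangfuseAuth.TryBuildOtlpHeader` (not part of this PR) that refuses to build the Basic Auth header when a value is missing or when the host is not an absolute `https://` URI. Rejecting plain HTTP is an assumption of this sketch; a local development setup that serves Langfuse over `http://localhost` would need an explicit exception.

```csharp
#nullable enable
using System;
using System.Text;

public static class LangfuseAuth
{
    // Hypothetical helper: returns false instead of throwing when any setting
    // is missing or the host is not an absolute https:// URI.
    public static bool TryBuildOtlpHeader(
        string? publicKey, string? secretKey, string? host,
        out Uri? endpoint, out string headers)
    {
        endpoint = null;
        headers = string.Empty;

        if (string.IsNullOrWhiteSpace(publicKey) ||
            string.IsNullOrWhiteSpace(secretKey) ||
            string.IsNullOrWhiteSpace(host))
        {
            return false;
        }

        // Uri.TryCreate avoids the UriFormatException that new Uri("") throws.
        if (!Uri.TryCreate(host, UriKind.Absolute, out var parsed) ||
            parsed.Scheme != Uri.UriSchemeHttps)
        {
            return false;
        }

        // Same "publicKey:secretKey" Basic Auth encoding as the referred code.
        var token = Convert.ToBase64String(
            Encoding.UTF8.GetBytes($"{publicKey}:{secretKey}"));

        endpoint = parsed;
        headers = $"Authorization=Basic {token}";
        return true;
    }
}
```

The caller would register the OTLP exporter only when the helper returns true, and otherwise skip the Langfuse exporter rather than fail at startup.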
General
Remove duplicate tag assignment
Suggestion Impact:The commit removed the duplicated assignment of ModelDiagnosticsTags.OutputTokens, leaving a single SetTag call.

code diff:

             activity?.SetTag(ModelDiagnosticsTags.InputTokens, (tokenUsage?.InputTokenCount ?? 0) - (inputTokenDetails?.CachedTokenCount ?? 0));
-            activity?.SetTag(ModelDiagnosticsTags.OutputTokens, tokenUsage?.OutputTokenCount ?? 0);
-            activity?.SetTag(ModelDiagnosticsTags.OutputTokens, tokenUsage?.OutputTokenCount ?? 0);
+            activity?.SetTag(ModelDiagnosticsTags.OutputTokens, tokenUsage?.OutputTokenCount ?? 0); 
            

Remove the duplicate line that sets the OutputTokens tag.

src/Plugins/BotSharp.Plugin.AzureOpenAI/Providers/Chat/ChatCompletionProvider.cs [128-130]

 activity?.SetTag(ModelDiagnosticsTags.InputTokens, (tokenUsage?.InputTokenCount ?? 0) - (inputTokenDetails?.CachedTokenCount ?? 0));
 activity?.SetTag(ModelDiagnosticsTags.OutputTokens, tokenUsage?.OutputTokenCount ?? 0);
-activity?.SetTag(ModelDiagnosticsTags.OutputTokens, tokenUsage?.OutputTokenCount ?? 0);

[Suggestion processed]

Suggestion importance[1-10]: 3

__

Why: The suggestion correctly identifies a duplicated line of code that sets the OutputTokens tag twice and proposes removing the redundant line, which cleans up the code.

Low
Remove unnecessary semicolon
Suggestion Impact:The stray semicolon following the foreach block was removed in the commit.

code diff:

@@ -42,8 +57,6 @@
         {
             activity.SetTag(tag.Key, tag.Value);
         }
-        ;
-
         return activity;

Remove the unnecessary semicolon after the foreach loop's closing brace.

src/Infrastructure/BotSharp.Abstraction/Diagnostics/ActivityExtensions.cs [43-45]

 activity.SetTag(tag.Key, tag.Value);
 }
-;
 
 return activity;

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 2

__

Why: The suggestion correctly identifies and removes a stray semicolon, which is a minor code style and readability improvement.

Low

Member

Should this class be moved to the implementation project of BotSharp.Plugin.GiteeAI?

Collaborator Author

This class implements the OTel semantic conventions: https://github.com/open-telemetry/semantic-conventions

@geffzhang
Collaborator Author

This is the result of integrating OpenTelemetry with Langfuse.
The integration enables real-time tracing and streaming collection of inputs, outputs, execution states, and exceptions for each node in complex workflows, enhancing observability for debugging and performance optimization.

@iceljc
Collaborator

iceljc commented Nov 6, 2025

> This is the result of integrating OpenTelemetry with Langfuse. The integration enables real-time tracing and streaming collection of inputs, outputs, execution states, and exceptions for each node in complex workflows, enhancing observability for debugging and performance optimization.

Is there any setting to turn on/off the OpenTelemetry with Langfuse?

@geffzhang
Collaborator Author

> This is the result of integrating OpenTelemetry with Langfuse. The integration enables real-time tracing and streaming collection of inputs, outputs, execution states, and exceptions for each node in complex workflows, enhancing observability for debugging and performance optimization.
>
> Is there any setting to turn on/off the OpenTelemetry with Langfuse?

I will refactor this part of the code.

@geffzhang geffzhang assigned Copilot and unassigned yileicn and Copilot Nov 22, 2025
geffzhang commented Nov 22, 2025

High-Level Objectives

  1. Introduce generalized OpenTelemetry (OTel) model diagnostics with semantic conventions for GenAI operations.
  2. Integrate Langfuse tracing (refactoring away from the prior dedicated BotSharp.Diagnostics.Langfuse project into a unified ModelDiagnostics approach).
  3. Add a new plugin: BotSharp.Plugin.GiteeAI (chat completion + text embedding providers).
  4. Extend tracing coverage across:
    • ConversationController (incoming requests)
    • RoutingService (agent selection and function dispatch)
    • FunctionCallbackExecutor and MCPToolExecutor (tool/function execution)
    • Model providers (Azure OpenAI, OpenAI, and newly added GiteeAI) with activity spans for prompt/response lifecycles.
  5. Enhance configuration (appsettings.json) to include Langfuse and GiteeAI model entries.
  6. Provide initial plugin documentation in README.md.
  7. Prepare architecture diagrams and walkthrough to clarify the tracing and provider orchestration flow.

Architectural Evolution

Previously, Langfuse integration appears to have been isolated (e.g., via a specialized diagnostics project). This PR consolidates diagnostic responsibilities into:

  • ModelDiagnostics.cs: A single, extensible instrumentation layer
  • ActivityExtensions.cs: Shared helpers for span enrichment (e.g., tags for model name, input length, output length, latency, tool invocation context)
The removal/refactoring of BotSharp.Diagnostics.Langfuse results in a cleaner, provider-agnostic tracing path.

Core Added / Modified Components

Diagnostics & Tracing Foundation

  • ActivityExtensions.cs (≈119 additions): Adds extension methods to attach standardized attributes (e.g., gen_ai.system, gen_ai.request.model, usage tokens, error info).
  • AppContextSwitchHelper.cs: Utility for reading runtime feature toggles tied to diagnostics.
  • Extensions.cs: Registers OpenTelemetry, sets up exporters (Langfuse), and configures ActivitySource(s) to emit spans.

Routing & Execution Instrumentation

  • RoutingService (partial modifications in RoutingService.cs, InvokeAgent.cs, InvokeFunction.cs): Injects spans around agent resolution, function decision logic, and fallback/branching behaviors.
  • FunctionCallbackExecutor.cs / MCPToolExecutor.cs: Wrap function and MCP tool execution inside activities, capturing input/output payload summaries and potential exception traces.

Controller Layer

  • ConversationController.cs: Starts top-level request activities (entry points), enabling correlation IDs and trace continuity across chained model/tool operations.

Provider Enhancements

Instrumentation added (or refactored) to:

  • ChatCompletionProvider.cs (multiple instances for different providers; diffs show added tracing blocks wrapping send/receive phases)
  • GiteeAI ChatCompletionProvider.cs: Full new implementation (≈496 lines) with prompt assembly, response parsing, diagnostic events.
  • TextEmbeddingProvider.cs (GiteeAI): Embedding generation with trace spans for vectorization operations.
  • ProviderHelper.cs: Centralizes client instantiation (e.g., constructing a configured HTTP client for GiteeAI with retries or headers).

GiteeAI Plugin Infrastructure

  • BotSharp.Plugin.GiteeAI.csproj: New project definition.
  • GiteeAiPlugin.cs: Dependency injection (DI) registration of chat and embedding providers.
  • Using.cs: Global using directives for plugin namespace cohesion.
  • BotSharp.sln / WebStarter.csproj: Solution and startup integration of the plugin.

Configuration & Settings

  • appsettings.json: Adds Langfuse credentials/config blocks and GiteeAI model definitions (plus adjustments for existing OpenAI/Azure OpenAI entries). Expansions suggest enabling multi-provider side-by-side tracing comparisons.

Documentation

  • README.md: Introduces usage/registration details for the GiteeAI plugin (likely minimal initial documentation, flagged for future expansion).

Formatting / Minor Adjustments

  • Program.cs: Minor cleanup (reordered usings, commented MCP service block temporarily).

Mermaid Diagram Walkthrough

1. High-Level Data & Trace Flow

flowchart LR
  Request["HTTP Request /conversation"] --> CC["ConversationController"]
  CC --> RT["RoutingService"]
  RT --> AGT["Agent Selection"]
  AGT --> DEC{"Dispatch"}
  DEC --> CHAT["ChatCompletion Provider (Azure/OpenAI/GiteeAI)"]
  DEC --> FUNC["FunctionCallbackExecutor"]
  DEC --> MCP["MCPToolExecutor"]
  CHAT --> DIAG["ModelDiagnostics"]
  FUNC --> DIAG
  MCP --> DIAG
  DIAG --> OTEL["OpenTelemetry ActivitySource"]
  OTEL --> LANG["Langfuse Exporter"]
  OTEL --> OTHER["Other OTEL Exporters (e.g., Console, OTLP)"]

2. Detailed Sequence (Conversation -> Agent -> Provider -> Trace)

sequenceDiagram
  participant User
  participant Controller as ConversationController
  participant Router as RoutingService
  participant Agent as Agent/Policy
  participant Provider as Model Provider (Azure/OpenAI/GiteeAI)
  participant Exec as Function/MCP Executor
  participant Diag as ModelDiagnostics
  participant OTel as OpenTelemetry
  participant Lang as Langfuse

  User->>Controller: Chat/Task request
  Controller->>Diag: Start root Activity (request span)
  Controller->>Router: Route(message, context)
  Router->>Agent: Evaluate agent(s)
  Agent-->>Router: Selected agent + strategy
  Router->>Diag: Start agent invocation span
  Router->>Provider: Generate response (chat or embedding)
  Provider->>Diag: Start model operation span
  Provider-->>Diag: Emit attributes (model, tokens, latency, etc.)
  Diag->>OTel: Activity emission
  OTel->>Lang: Export spans (Langfuse sink)
  alt Needs tool/function
    Router->>Exec: Invoke function/MCP tool
    Exec->>Diag: Start tool execution span
    Exec-->>Diag: Output / error metadata
  end
  Provider-->>Router: Response
  Router-->>Controller: Aggregated result
  Controller-->>User: Final structured reply

File Walkthrough (Categorized)

Configuration & Bootstrapping

  • Extensions.cs: Central OTel + Langfuse registration logic
  • appsettings.json: Adds Langfuse settings block and multi-model provider configuration (Azure OpenAI, OpenAI, GiteeAI)
  • Program.cs: Minor startup adjustments (commented MCP service portion)
  • BotSharp.sln / WebStarter.csproj: Solution integration of GiteeAI plugin project

Diagnostics Core

  • ModelDiagnostics.cs: Heart of instrumentation (chat, embedding, function, agent spans)
  • ActivityExtensions.cs: Tagging helpers (model name, message counts, timings, error classification)
  • AppContextSwitchHelper.cs: Toggle-based feature enabling (e.g., conditional tracing intensity)

Routing & Execution

  • RoutingService.cs / InvokeAgent.cs / InvokeFunction.cs: Introduces or expands span boundaries around decision phases
  • FunctionCallbackExecutor.cs / MCPToolExecutor.cs: Wraps function and MCP tool invocation with structured activities

Providers

  • ChatCompletionProvider.cs (multiple provider contexts including Azure/OpenAI): Inject span start/stop, response metrics
  • GiteeAI:
    • ChatCompletionProvider.cs: New provider with full instrumentation
    • TextEmbeddingProvider.cs: Embedding generation + trace capture
    • ProviderHelper.cs: Client construction utilities (e.g., auth/config composition)
    • Using.cs & GiteeAiPlugin.cs: Plugin-level DI and global usings

Documentation

  • README.md: Adds initial GiteeAI plugin usage or description

Miscellaneous

  • Minor formatting (Program.cs) and import adjustments (System.Diagnostics added to several files)

Observability & Semantic Conventions

The tracing layer appears to align with emerging GenAI semantic conventions:

  • gen_ai.request.model / gen_ai.system: Identify the model and the provider
  • Usage metrics: token counts, input/output sizes
  • Latency: operation timing
  • Error classification: enabling downstream analytics
    This positions the system for cross-provider performance and reliability comparison, while Langfuse augments prompt/result auditing.
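As an illustration of the attributes listed above, the sketch below tags a .NET `Activity` with names published in the OpenTelemetry GenAI semantic conventions (`gen_ai.request.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `error.type`). The `GenAiTags` class and its methods are hypothetical; the tag constants actually used by `ModelDiagnostics.cs` in this PR may differ.

```csharp
#nullable enable
using System;
using System.Diagnostics;

public static class GenAiTags
{
    // Records the model name and token usage on the span.
    // The null-conditional calls make this a no-op when no activity was started.
    public static void RecordUsage(
        Activity? activity, string model, int inputTokens, int outputTokens)
    {
        activity?.SetTag("gen_ai.request.model", model);
        activity?.SetTag("gen_ai.usage.input_tokens", inputTokens);
        activity?.SetTag("gen_ai.usage.output_tokens", outputTokens);
    }

    // Marks the span as failed and records the exception type for later grouping.
    public static void RecordError(Activity? activity, Exception ex)
    {
        activity?.SetTag("error.type", ex.GetType().FullName);
        activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
    }
}
```

Because every provider would call the same helper, token counts and error types end up under identical keys regardless of which backend produced them, which is what makes cross-provider comparison possible.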

Commit-Level Summary (Conceptual)

While individual commit messages were not visible in the retrieved data, the 26 commits likely progressed through:

  1. Scaffolding diagnostics infrastructure (ActivitySources, settings).
  2. Introducing LangfuseSettings and configuration wiring.
  3. Instrumenting existing providers (Azure/OpenAI).
  4. Adding GiteeAI plugin project + chat provider + embedding provider.
  5. Enhancing routing and execution spans.
  6. Refactoring/removing prior Langfuse-specific project (unifying under ModelDiagnostics).
  7. Expanding appsettings.json for multi-provider support.
  8. Adding README plugin documentation.
  9. Final polish: minor formatting and controller tracing refinements.

Key Benefits

  • Unified tracing across heterogeneous model and tool operations.
  • Pluggable multi-model architecture (Azure OpenAI, OpenAI, GiteeAI) with consistent diagnostics.
  • Improved debuggability (span granularity at agent, function, model levels).
  • Extensibility: Future plugins can hook into ModelDiagnostics without bespoke exporter code.
  • Langfuse integration supports prompt/response observability and post-hoc analysis.

 <PackageVersion Include="NCrontab" Version="3.3.3" />
 <PackageVersion Include="Azure.AI.OpenAI" Version="2.5.0-beta.1" />
-<PackageVersion Include="OpenAI" Version="2.6.0" />
+<PackageVersion Include="OpenAI" Version="2.5.0" />
Member

Why was OpenAI downgraded?

Collaborator Author

OpenAI 2.6.0 conflicts with Azure.AI.OpenAI 2.5.0: there is a serialization bug, so we have to wait for Azure.AI.OpenAI to fix 53986.

RoleDialogModel responseMessage;

if (response.StopReason == "tool_use")
using (var activity = _telemetryService.StartCompletionActivity(null, _model, Provider, conversations, convService))
Member

If tracing is not turned on, will this code affect execution efficiency?

Member

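It will not significantly affect execution efficiency. Even when tracing is not turned on, OpenTelemetry is designed to keep the performance overhead very low.
When tracing is disabled, OpenTelemetry returns an "empty activity" or "no-op activity":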

Collaborator Author

This does not affect efficiency; it uses the .NET ActivitySource.
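The low-overhead claim can be checked with the BCL alone. In .NET, `ActivitySource.StartActivity` returns `null` when no `ActivityListener` (for example, one registered by the OpenTelemetry SDK) is subscribed to the source, so a `using` block around it allocates no span and the `?.` tag calls do nothing. A minimal sketch, assuming a hypothetical source name `Demo.ModelDiagnostics` and a hypothetical `NoListenerDemo` class:

```csharp
#nullable enable
using System;
using System.Diagnostics;

public static class NoListenerDemo
{
    // Hypothetical source name used only for this sketch.
    public const string SourceName = "Demo.ModelDiagnostics";
    private static readonly ActivitySource Source = new(SourceName);

    // Returns true only if a real Activity was created for the operation.
    public static bool TryTrace()
    {
        // StartActivity yields null when nobody listens to this source.
        using var activity = Source.StartActivity("chat.completions");
        activity?.SetTag("demo.model", "demo-model");
        return activity is not null;
    }

    // Registers a listener that samples everything from this source,
    // which is roughly what enabling tracing does under the hood.
    public static ActivityListener Subscribe()
    {
        var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == SourceName,
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                ActivitySamplingResult.AllData,
        };
        ActivitySource.AddActivityListener(listener);
        return listener;
    }
}
```

Calling `NoListenerDemo.TryTrace()` before `Subscribe()` returns false; calling it while the returned listener is alive returns true. Any cost that remains when tracing is off comes from work done before `StartActivity`, such as building arguments, so expensive payload preparation is best guarded by a null check on the activity.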

RoleDialogModel responseMessage;

try
using (var activity = _telemetryService.StartCompletionActivity(null, _model, Provider, conversations, convService))
Member

@yileicn yileicn Nov 26, 2025

Don't hard-code this; I suggest implementing it with AOP.
Python already has the opentelemetry-instrumentation-openai package; an approach like the following would work better:

services.AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddRedisInstrumentation()
        .AddMongoDBInstrumentation()
        .AddMySqlDataInstrumentation()
        .AddOtlpExporter())
    .WithMetrics(metrics => metrics
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddRedisInstrumentation()
        .AddMongoDBInstrumentation()
        .AddMySqlDataInstrumentation()
        .AddOtlpExporter());

Collaborator Author

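This uses the .NET ActivitySource. Activity is the .NET class that represents a single operation (such as an HTTP request or a database call) and corresponds to the Span concept in OpenTelemetry, so it works differently from the Python approach. Semantic Kernel and Microsoft Agent Framework both implement it this way: the data flows from Activity to OpenTelemetry.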

Collaborator Author

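The OpenTelemetry (OTEL) Semantic Conventions are a standardized set of attribute naming rules for describing telemetry data (such as traces, metrics, and logs) in distributed systems. They ensure that telemetry generated by different tools has a consistent semantic structure, which improves the interoperability and analysis efficiency of observability systems.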
https://github.com/open-telemetry/semantic-conventions
https://github.com/open-telemetry/semantic-conventions/tree/main/docs/gen-ai

Collaborator Author

AOP is what handles OpenTelemetry itself; telemetryService implements the semantic conventions and is not hard-coding. For details, see the Semantic Kernel design: https://github.com/microsoft/semantic-kernel/blob/main/docs/decisions/0044-OTel-semantic-convention.md
