
Conversation

@avirajsingh7 avirajsingh7 commented Nov 27, 2025

Summary

Target issue is #438
This PR introduces Langfuse observability into the LLM provider execution flow by wrapping provider_instance.execute with a configurable decorator. This allows every LLM call to automatically generate:

  • A Langfuse trace
  • A generation event
  • Success/failure metadata
  • Token usage reporting
  • Optional session grouping via conversation_id

This enables unified tracing, debugging, and analytics across all LLM providers.
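As a rough sketch, the wrapping pattern looks like the following. The observe_llm_execution signature matches the diffs later in this thread; the body is illustrative rather than the exact implementation, and assumes the v2 Langfuse Python SDK:

# Minimal sketch of the wrapping pattern described above; illustrative only.
from functools import wraps

from langfuse import Langfuse


def observe_llm_execution(session_id: str | None = None, credentials: dict | None = None):
    def decorator(func):
        @wraps(func)
        def wrapper(completion_config, query, **kwargs):
            if not credentials:
                # Graceful bypass: without credentials the call runs untraced.
                return func(completion_config, query, **kwargs)
            langfuse = Langfuse(
                public_key=credentials["public_key"],
                secret_key=credentials["secret_key"],
                host=credentials["host"],
            )
            trace = langfuse.trace(
                name="unified-llm-call",
                input=query.input,
                session_id=session_id,  # groups related calls by conversation
            )
            generation = trace.generation(name=f"{completion_config.provider}-completion")
            try:
                response, error = func(completion_config, query, **kwargs)
                if response:
                    generation.end(
                        output=response.response.output.text,
                        usage_details={
                            "input": response.usage.input_tokens,
                            "output": response.usage.output_tokens,
                        },
                    )
                return response, error
            finally:
                langfuse.flush()  # telemetry is flushed on success and on error

        return wrapper

    return decorator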

Checklist

Before submitting a pull request, please ensure that you complete these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added new code, make sure it is covered by test cases.

Notes

Add any other information the reviewer may need here.

Summary by CodeRabbit

  • New Features

    • Added optional LLM execution observability that automatically traces interactions, model details, usage metrics, and outputs.
    • Session/conversation tracking to correlate related requests and attach session context to traces.
    • Observability automatically applied to LLM job execution when credentials are available.
  • Bug Fixes

    • Gracefully bypasses observability when credentials are missing or client initialization fails, and always flushes telemetry on success or error.


@avirajsingh7 avirajsingh7 self-assigned this Nov 27, 2025
@avirajsingh7 avirajsingh7 added the enhancement New feature or request label Nov 27, 2025
coderabbitai bot commented Nov 27, 2025

Walkthrough

Adds an observe_llm_execution decorator factory to emit Langfuse traces/generations around LLM provider execute calls when credentials are available, and integrates it into execute_job to wrap provider execution with optional session/conversation IDs.

Changes

  • Langfuse tracing helper — backend/app/core/langfuse/langfuse.py
    Adds observe_llm_execution(session_id: str | None = None, credentials: dict | None = None), a decorator factory that emits Langfuse traces and generations around LLM provider execute calls when credentials are available.
  • LLM job execution integration — backend/app/services/llm/jobs.py
    Imports get_provider_credential and observe_llm_execution; retrieves Langfuse credentials via get_provider_credential(org_id, project_id, provider="langfuse"), extracts conversation_id from request.query when present, decorates provider_instance.execute with observe_llm_execution, and calls the decorated execute function with the same parameters.
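For reference, a sketch of the wiring inside execute_job based on this summary (the decorated_execute name appears in the review comments below; the exact argument lists are assumptions):

# Fragment illustrating the integration point inside execute_job.
langfuse_credentials = get_provider_credential(
    org_id=org_id, project_id=project_id, provider="langfuse"
)
conversation_id = getattr(request.query, "conversation_id", None)  # assumed extraction

decorated_execute = observe_llm_execution(
    session_id=conversation_id,
    credentials=langfuse_credentials,
)(provider_instance.execute)

# Called with the same parameters as the undecorated execute
response, error = decorated_execute(
    completion_config, query, include_provider_raw_response=include_provider_raw_response
)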

Sequence Diagram(s)

sequenceDiagram
    participant Job as execute_job
    participant Decorator as observe_llm_execution (wrapper)
    participant Langfuse as Langfuse Client
    participant Provider as LLM Provider

    Job->>Decorator: call decorated_execute(completion_config, query, include_provider_raw_response)
    alt credentials available & client init OK
        Decorator->>Langfuse: init client (credentials)
        Decorator->>Langfuse: create trace (session_id or conversation_id)
        Decorator->>Langfuse: create generation
        Decorator->>Provider: execute(completion_config, query, ...)
        alt success
            Provider-->>Decorator: response + usage
            Decorator->>Langfuse: update generation (output, usage, model)
            Decorator->>Langfuse: update trace (session context) & flush
            Decorator-->>Job: return (response, None)
        else error
            Provider-->>Decorator: exception
            Decorator->>Langfuse: flush data
            Decorator-->>Job: propagate error
        end
    else credentials missing or client init failed
        Decorator->>Provider: execute(...) (bypass observability)
        Provider-->>Decorator: response
        Decorator-->>Job: return (response, None)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Focus review on:
    • backend/app/core/langfuse/langfuse.py — client initialization, trace/generation lifecycle, error handling, and type usage (CompletionConfig, QueryParams, LLMCallResponse).
    • backend/app/services/llm/jobs.py — credential retrieval call, conversation_id extraction, and ensuring decorated execute signature matches existing callers.

Poem

🐰
I nudge the logs with velvet paws at night,
Wrapping each call in a ribbon of light.
When credentials hum, I mark every thread,
Flush tidy footprints where the tokens tread.
Hop—observability tucked into flight.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 40.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title accurately and concisely summarizes the main change: adding Langfuse observability to the Unified API.


codecov bot commented Nov 27, 2025

Codecov Report

❌ Patch coverage is 86.04651% with 6 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Missing lines
backend/app/core/langfuse/langfuse.py | 85.71% | 5 ⚠️
backend/app/services/llm/jobs.py | 87.50% | 1 ⚠️


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
backend/app/core/langfuse/langfuse.py (2)

3-3: Optional: modernize typing imports and Dict usage to match Ruff hints.

Ruff is flagging Callable and Dict from typing; in Python 3.11+ you can simplify by importing Callable from collections.abc and using builtin dict[...] instead of Dict[...]. This is purely stylistic but will keep the module aligned with current best practices and avoid future deprecation noise.

Example (conceptual only):

-from typing import Any, Callable, Dict, Optional
+from collections.abc import Callable
+from typing import Any, Optional
...
-    input: Dict[str, Any],
-    metadata: Optional[Dict[str, Any]] = None,
+    input: dict[str, Any],
+    metadata: Optional[dict[str, Any]] = None,

Also applies to: 55-61, 73-78, 88-92


114-218: Tighten type hints on observe_llm_execution and its wrapper.

The decorator logic looks sound and preserves the original (response, error) contract, including graceful fallback when credentials are missing or client init fails. To better leverage type checking (and per project guidelines on type hints), consider adding explicit return types for the decorator and wrapper:

-def observe_llm_execution(
-    session_id: str | None = None,
-    credentials: dict | None = None,
-):
+def observe_llm_execution(
+    session_id: str | None = None,
+    credentials: dict | None = None,
+) -> Callable:
@@
-    def decorator(func: Callable) -> Callable:
+    def decorator(func: Callable) -> Callable:
@@
-        def wrapper(completion_config: CompletionConfig, query: QueryParams, **kwargs):
+        def wrapper(
+            completion_config: CompletionConfig,
+            query: QueryParams,
+            **kwargs,
+        ) -> tuple[LLMCallResponse | None, str | None]:

You can later narrow the Callable annotations if you want stronger guarantees, but even this minimal change makes the behavior clearer to tooling without affecting runtime.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1821b84 and 4c6a07b.

📒 Files selected for processing (2)
  • backend/app/core/langfuse/langfuse.py (2 hunks)
  • backend/app/services/llm/jobs.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use type hints in Python code (Python 3.11+ project)

Files:

  • backend/app/core/langfuse/langfuse.py
  • backend/app/services/llm/jobs.py
backend/app/core/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place core functionality (config, DB session, security, exceptions, middleware) in backend/app/core/

Files:

  • backend/app/core/langfuse/langfuse.py
backend/app/core/langfuse/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place Langfuse observability integration under backend/app/core/langfuse/

Files:

  • backend/app/core/langfuse/langfuse.py
backend/app/services/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Implement business logic services under backend/app/services/

Files:

  • backend/app/services/llm/jobs.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: ProjectTech4DevAI/ai-platform PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-10-08T12:05:01.317Z
Learning: Applies to backend/app/core/langfuse/**/*.py : Place Langfuse observability integration under backend/app/core/langfuse/
📚 Learning: 2025-10-08T12:05:01.317Z
Learnt from: CR
Repo: ProjectTech4DevAI/ai-platform PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-10-08T12:05:01.317Z
Learning: Applies to backend/app/core/langfuse/**/*.py : Place Langfuse observability integration under backend/app/core/langfuse/

Applied to files:

  • backend/app/core/langfuse/langfuse.py
  • backend/app/services/llm/jobs.py
🧬 Code graph analysis (2)
backend/app/core/langfuse/langfuse.py (3)
backend/app/models/llm/request.py (2)
  • CompletionConfig (49-58)
  • QueryParams (35-46)
backend/app/models/llm/response.py (1)
  • LLMCallResponse (42-52)
backend/app/tests/services/llm/providers/test_openai.py (2)
  • completion_config (32-37)
  • provider (27-29)
backend/app/services/llm/jobs.py (3)
backend/app/crud/credentials.py (1)
  • get_provider_credential (121-159)
backend/app/core/langfuse/langfuse.py (1)
  • observe_llm_execution (114-218)
backend/app/services/llm/providers/base.py (1)
  • execute (35-55)
🪛 Ruff (0.14.6)
backend/app/core/langfuse/langfuse.py

3-3: Import from collections.abc instead: Callable

Import from collections.abc

(UP035)


3-3: typing.Dict is deprecated, use dict instead

(UP035)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (2)
backend/app/services/llm/jobs.py (2)

187-193: Confirm get_provider_credential supports provider=\"langfuse\".

This call assumes the credentials CRUD/validation layer recognizes "langfuse" as a valid provider; otherwise validate_provider inside get_provider_credential will raise and short‑circuit the LLM job before the actual provider executes.

Please double‑check that:

  • "langfuse" is included wherever provider names are validated, and
  • Langfuse credentials are stored with the expected shape so that decrypt_credentials returns the public_key / secret_key / host fields used in observe_llm_execution.
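For illustration, a hypothetical credential payload that satisfies the guard quoted later in this thread (key names come from the PR; the values are placeholders):

# Hypothetical credential shape after decrypt_credentials; values are placeholders.
langfuse_credentials = {
    "public_key": "pk-lf-...",
    "secret_key": "sk-lf-...",
    "host": "https://cloud.langfuse.com",  # assumed host value
}

# The decorator only enables tracing when all three keys are present:
if langfuse_credentials and all(
    key in langfuse_credentials for key in ["public_key", "secret_key", "host"]
):
    ...  # observability enabled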

194-205: Verify provider/session lifetime and note clean fallback when Langfuse is absent.

decorated_execute is created and invoked after the with Session(engine) as session block has exited. That’s fine as long as:

  • get_llm_provider only uses the DB session during provider construction (e.g., to fetch credentials/config), and
  • provider_instance.execute does not depend on the original Session remaining open.

If any provider still uses the passed session during execute, it should instead manage its own short‑lived sessions internally, or decorated_execute should be moved back inside the with Session(...) block.

On the positive side, the decorator is correctly wired:

  • When langfuse_credentials is None or invalid, observe_llm_execution will call through to provider_instance.execute unchanged.
  • When credentials are valid, you get tracing without altering the external (response, error) behavior.
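If the session lifetime does turn out to matter, the suggested alternative would look roughly like this (a sketch only; get_llm_provider's signature and the surrounding variables are assumptions based on this comment):

# Sketch: build and invoke the decorated execute while the Session is open.
with Session(engine) as session:
    provider_instance = get_llm_provider(session, completion_config.provider)
    decorated_execute = observe_llm_execution(
        session_id=conversation_id,
        credentials=langfuse_credentials,
    )(provider_instance.execute)
    response, error = decorated_execute(completion_config, query)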

@avirajsingh7 avirajsingh7 linked an issue Nov 27, 2025 that may be closed by this pull request

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
backend/app/core/langfuse/langfuse.py (2)

173-175: Simplify variable declaration and assignment.

The separate type hint declarations on lines 173-174 followed by assignment on line 175 are unnecessary. Python's type inference combined with the function's return annotation provides sufficient typing.

Apply this diff:

-                # Execute the actual LLM call
-                response: LLMCallResponse | None
-                error: str | None
-                response, error = func(completion_config, query, **kwargs)
+                # Execute the actual LLM call
+                response, error = func(completion_config, query, **kwargs)

114-220: Consider leveraging the existing LangfuseTracer class to reduce duplication.

The decorator reimplements logic similar to LangfuseTracer (lines 14-111), including trace/generation creation, error handling, and flushing. Refactoring the decorator to use LangfuseTracer internally would improve maintainability and eliminate the duplicate error-handling blocks (lines 198-203 vs. 209-214).

Example refactor:

def observe_llm_execution(
    session_id: str | None = None,
    credentials: dict | None = None,
):
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(completion_config: CompletionConfig, query: QueryParams, **kwargs):
            if not credentials or not all(
                key in credentials for key in ["public_key", "secret_key", "host"]
            ):
                logger.info("[Langfuse] No credentials - skipping observability")
                return func(completion_config, query, **kwargs)

            tracer = LangfuseTracer(credentials=credentials, session_id=session_id)
            
            # Use tracer methods for trace/generation lifecycle
            tracer.start_trace(
                name="unified-llm-call",
                input={"query": query.input},
                metadata={"provider": completion_config.provider},
                tags=[completion_config.provider],
            )
            tracer.start_generation(
                name=f"{completion_config.provider}-completion",
                input={"query": query.input},
            )
            
            try:
                response, error = func(completion_config, query, **kwargs)
                if response:
                    tracer.end_generation(
                        output={"status": "success", "output": response.response.output.text},
                        usage={"input": response.usage.input_tokens, "output": response.usage.output_tokens},
                        model=response.response.model,
                    )
                    tracer.update_trace(
                        tags=[completion_config.provider],
                        output={"status": "success", "output": response.response.output.text},
                    )
                else:
                    tracer.log_error(error or "Unknown error")
                
                tracer.flush()
                return response, error
            except Exception as e:
                tracer.log_error(str(e))
                tracer.flush()
                raise
        
        return wrapper
    return decorator
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4c6a07b and 5748d69.

📒 Files selected for processing (1)
  • backend/app/core/langfuse/langfuse.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use type hints in Python code (Python 3.11+ project)

Files:

  • backend/app/core/langfuse/langfuse.py
backend/app/core/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place core functionality (config, DB session, security, exceptions, middleware) in backend/app/core/

Files:

  • backend/app/core/langfuse/langfuse.py
backend/app/core/langfuse/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place Langfuse observability integration under backend/app/core/langfuse/

Files:

  • backend/app/core/langfuse/langfuse.py
🧠 Learnings (1)
📚 Learning: 2025-10-08T12:05:01.317Z
Learnt from: CR
Repo: ProjectTech4DevAI/ai-platform PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-10-08T12:05:01.317Z
Learning: Applies to backend/app/core/langfuse/**/*.py : Place Langfuse observability integration under backend/app/core/langfuse/

Applied to files:

  • backend/app/core/langfuse/langfuse.py
🧬 Code graph analysis (1)
backend/app/core/langfuse/langfuse.py (2)
backend/app/models/llm/request.py (2)
  • CompletionConfig (49-58)
  • QueryParams (35-46)
backend/app/models/llm/response.py (1)
  • LLMCallResponse (42-52)
🪛 Ruff (0.14.6)
backend/app/core/langfuse/langfuse.py

3-3: Import from collections.abc instead: Callable

Import from collections.abc

(UP035)


3-3: typing.Dict is deprecated, use dict instead

(UP035)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (1)
backend/app/core/langfuse/langfuse.py (1)

183-186: The review comment is incorrect. usage_details is the correct and preferred parameter for Langfuse 2.60.3.

Based on verification:

  • Langfuse version 2.60.3 uses usage_details as the current v2/v3 standard parameter for generation.end()
  • The format {"input": ..., "output": ...} matches the expected generic-style structure
  • The usage parameter at line 95 is legacy/v1 style but remains backward-compatible
  • Both approaches work; usage_details is actually more modern and correct

No changes are needed. The code at lines 183-186 is properly implemented.
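For clarity, the call shape being defended here looks roughly like this (assuming generation is a Langfuse 2.x generation handle; field access follows the refactor example earlier in this thread):

# Sketch of v2-style usage reporting on a generation handle.
generation.end(
    output={"status": "success", "output": response.response.output.text},
    usage_details={
        "input": response.usage.input_tokens,
        "output": response.usage.output_tokens,
    },
    model=response.response.model,
)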

Comment on lines +135 to +137
if not credentials or not all(
key in credentials for key in ["public_key", "secret_key", "host"]
):
Collaborator


Do we need this extensive check? If credentials exist, then "public_key", "secret_key", and "host" should be present; otherwise the user couldn't have saved the credentials in the first place.

Collaborator Author


It's just a simple check; the backend should not assume the stored object is always well-formed.
Credentials may be affected by DB inconsistencies, migrations, manual edits, or older versions of the system.

Collaborator


I don't fully understand or agree with this. Ideally we should ensure the DB has correct credentials; otherwise, what's the point of saving credentials that are incorrect and raise an error whenever they're used? That way we wouldn't need these checks when fetching credentials.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
backend/app/core/langfuse/langfuse.py (2)

3-4: Use modern type hints for Python 3.11+.

Per the coding guidelines, import Callable from collections.abc and use the built-in dict instead of Dict. This was flagged in a previous review and by static analysis.

-from typing import Any, Callable, Dict, Optional
+from collections.abc import Callable
+from typing import Any, Optional
 from functools import wraps

Then replace all Dict occurrences (lines 58, 76, 90, 91, 97) with dict.


152-156: Pass session_id to trace creation for proper session grouping.

The session_id is only set via trace.update() later, which means traces are created without session grouping initially. Include session_id in trace creation for conversation-level analytics to work correctly from the start.

             trace = langfuse.trace(
                 name="unified-llm-call",
                 input=query.input,
                 tags=[completion_config.provider],
+                session_id=session_id,
             )
🧹 Nitpick comments (2)
backend/app/core/langfuse/langfuse.py (2)

152-156: Add request correlation metadata for debugging.

Unlike LangfuseTracer.start_trace which includes request_id from correlation_id.get() in metadata, this trace is created without metadata. For consistent observability and easier debugging, include the request correlation ID.

+            trace_metadata = {"request_id": correlation_id.get() or "N/A"}
+
             trace = langfuse.trace(
                 name="unified-llm-call",
                 input=query.input,
+                metadata=trace_metadata,
                 tags=[completion_config.provider],
+                session_id=session_id,
             )

131-133: Add return type hints for better type safety.

The wrapper function has typed parameters but lacks a return type annotation. Based on the code, it returns tuple[LLMCallResponse | None, str | None].

-    def decorator(func: Callable) -> Callable:
+    def decorator(func: Callable[..., tuple[LLMCallResponse | None, str | None]]) -> Callable[..., tuple[LLMCallResponse | None, str | None]]:
         @wraps(func)
-        def wrapper(completion_config: CompletionConfig, query: QueryParams, **kwargs):
+        def wrapper(completion_config: CompletionConfig, query: QueryParams, **kwargs) -> tuple[LLMCallResponse | None, str | None]:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5748d69 and 5631985.

📒 Files selected for processing (1)
  • backend/app/core/langfuse/langfuse.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use type hints in Python code (Python 3.11+ project)

Files:

  • backend/app/core/langfuse/langfuse.py
backend/app/core/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place core functionality (config, DB session, security, exceptions, middleware) in backend/app/core/

Files:

  • backend/app/core/langfuse/langfuse.py
backend/app/core/langfuse/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place Langfuse observability integration under backend/app/core/langfuse/

Files:

  • backend/app/core/langfuse/langfuse.py
🧬 Code graph analysis (1)
backend/app/core/langfuse/langfuse.py (3)
backend/app/models/llm/request.py (2)
  • CompletionConfig (49-58)
  • QueryParams (35-46)
backend/app/models/llm/response.py (1)
  • LLMCallResponse (42-52)
backend/app/tests/services/llm/providers/test_openai.py (2)
  • completion_config (32-37)
  • provider (27-29)
🪛 GitHub Actions: AI Platform CI
backend/app/core/langfuse/langfuse.py

[error] 1-1: Black formatting failed. Reformatted 1 file during pre-commit. Exit code 1.

🪛 Ruff (0.14.8)
backend/app/core/langfuse/langfuse.py

3-3: Import from collections.abc instead: Callable

Import from collections.abc

(UP035)


3-3: typing.Dict is deprecated, use dict instead

(UP035)

🔇 Additional comments (1)
backend/app/core/langfuse/langfuse.py (1)

164-209: LGTM: Core execution logic is well-structured.

The decorator properly handles:

  • Success path with generation output and usage tracking
  • Error path when response is None
  • Exception path with proper cleanup and re-raise
  • Langfuse flush in all exit paths

Comment on lines 149 to 151
return func(completion_config, query, **kwargs)


trace = langfuse.trace(


⚠️ Potential issue | 🟡 Minor

Fix Black formatting: remove extra blank line.

The pipeline failure indicates Black formatting failed. There's an extra blank line between lines 149 and 152. Remove one blank line to fix the formatting.

             except Exception as e:
                 logger.warning(f"[Langfuse] Failed to initialize client: {e}")
                 return func(completion_config, query, **kwargs)
-
 
             trace = langfuse.trace(
🤖 Prompt for AI Agents
In backend/app/core/langfuse/langfuse.py around lines 149 to 152, remove the
extra blank line between the return statement and the subsequent trace
invocation so the two lines are adjacent; reformat the file (or run Black) to
ensure no additional formatting issues remain.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
backend/app/core/langfuse/langfuse.py (2)

3-4: Use modern type hints for Python 3.11+.

This issue was already flagged in a previous review. Import Callable from collections.abc instead of typing, and replace Dict with the built-in dict type throughout the file.


151-155: Pass session_id to trace creation for proper session grouping.

This issue was already flagged in a previous review. The session_id parameter should be included when creating the trace so that session grouping works from the start, rather than being set later via trace.update().

🧹 Nitpick comments (4)
backend/app/core/langfuse/langfuse.py (4)

114-117: Add return type annotation to the decorator factory.

The function signature is missing a return type annotation. Since this is a decorator factory that returns a decorator, add -> Callable to improve type safety.

Apply this diff:

 def observe_llm_execution(
     session_id: str | None = None,
     credentials: dict | None = None,
-):
+) -> Callable:

131-133: Add return type annotations to nested functions.

Both the decorator and wrapper functions are missing return type annotations. Add them for better type safety and IDE support.

Apply this diff:

-    def decorator(func: Callable) -> Callable:
+    def decorator(func: Callable) -> Callable:
         @wraps(func)
-        def wrapper(completion_config: CompletionConfig, query: QueryParams, **kwargs):
+        def wrapper(completion_config: CompletionConfig, query: QueryParams, **kwargs) -> tuple[LLMCallResponse | None, str | None]:

141-146: Use direct key access after validation.

Since you've already validated that public_key, secret_key, and host exist in credentials (lines 135-137), use direct dictionary access (credentials["key"]) instead of .get() to avoid unnecessary Optional types and make the code clearer.

Apply this diff:

             try:
                 langfuse = Langfuse(
-                    public_key=credentials.get("public_key"),
-                    secret_key=credentials.get("secret_key"),
-                    host=credentials.get("host"),
+                    public_key=credentials["public_key"],
+                    secret_key=credentials["secret_key"],
+                    host=credentials["host"],
                 )

114-212: Consider refactoring to use the existing LangfuseTracer class.

The observe_llm_execution decorator reimplements trace and generation management logic that already exists in the LangfuseTracer class above (lines 14-111). This creates code duplication and maintenance overhead. Consider refactoring the decorator to instantiate and use LangfuseTracer instead, which would provide a consistent pattern across the codebase and reduce duplication.

For example:

def decorator(func: Callable) -> Callable:
    @wraps(func)
    def wrapper(completion_config: CompletionConfig, query: QueryParams, **kwargs):
        tracer = LangfuseTracer(credentials=credentials, session_id=session_id)
        
        tracer.start_trace(
            name="unified-llm-call",
            input=query.input,
            tags=[completion_config.provider]
        )
        
        tracer.start_generation(
            name=f"{completion_config.provider}-completion",
            input=query.input,
            metadata={"model": completion_config.params.get("model")}
        )
        
        try:
            response, error = func(completion_config, query, **kwargs)
            
            if response:
                tracer.end_generation(
                    output={"status": "success", "output": response.response.output.text},
                    usage={...},
                    model=response.response.model
                )
                tracer.update_trace(
                    tags=[],
                    output={"status": "success", "output": response.response.output.text}
                )
            else:
                tracer.log_error(error or "Unknown error")
                
            tracer.flush()
            return response, error
            
        except Exception as e:
            tracer.log_error(str(e))
            tracer.flush()
            raise
    
    return wrapper
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5631985 and 8318707.

📒 Files selected for processing (2)
  • backend/app/core/langfuse/langfuse.py (2 hunks)
  • backend/app/services/llm/jobs.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/app/services/llm/jobs.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use type hints in Python code (Python 3.11+ project)

Files:

  • backend/app/core/langfuse/langfuse.py
backend/app/core/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place core functionality (config, DB session, security, exceptions, middleware) in backend/app/core/

Files:

  • backend/app/core/langfuse/langfuse.py
backend/app/core/langfuse/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place Langfuse observability integration under backend/app/core/langfuse/

Files:

  • backend/app/core/langfuse/langfuse.py
🧬 Code graph analysis (1)
backend/app/core/langfuse/langfuse.py (3)
backend/app/models/llm/request.py (1)
  • CompletionConfig (49-58)
backend/app/models/llm/response.py (1)
  • LLMCallResponse (42-52)
backend/app/tests/services/llm/providers/test_openai.py (2)
  • completion_config (32-37)
  • provider (27-29)
🪛 Ruff (0.14.8)
backend/app/core/langfuse/langfuse.py

3-3: Import from collections.abc instead: Callable

Import from collections.abc

(UP035)


3-3: typing.Dict is deprecated, use dict instead

(UP035)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (3)
backend/app/core/langfuse/langfuse.py (3)

157-161: LGTM!

The generation is correctly created as a child of the trace, with appropriate input and model information extracted from the configuration.
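For illustration, a child generation in the v2 SDK is created from the trace handle roughly as follows (model extraction via completion_config.params.get("model") follows the refactor sketch earlier in this thread; this is a sketch, not the PR's exact code):

# Child generation created from the trace handle (v2 SDK).
generation = trace.generation(
    name=f"{completion_config.provider}-completion",
    input=query.input,
    model=completion_config.params.get("model"),
)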


200-208: LGTM!

The exception handling correctly logs the error to Langfuse, flushes the data, and re-raises the exception to preserve the original error flow. This ensures observability without hiding failures.


169-180: No changes needed. The usage_details parameter is correct according to the Langfuse SDK documentation. The code at line 175 properly uses usage_details with the correct structure ({"input": ..., "output": ...}).

Likely an incorrect or invalid review comment.


Labels

enhancement (New feature or request), ready-for-review

Projects

None yet

Development

Successfully merging this pull request may close these issues: Integrate Observability with Unified API

4 participants