28 changes: 20 additions & 8 deletions docs/openapi.json
@@ -3664,7 +3664,7 @@
"rlsapi-v1"
],
"summary": "Infer Endpoint",
"description": "Handle rlsapi v1 /infer requests for stateless inference.\n\nThis endpoint serves requests from the RHEL Lightspeed Command Line Assistant (CLA).\n\nAccepts a question with optional context (stdin, attachments, terminal output,\nsystem info) and returns an LLM-generated response.\n\nArgs:\n infer_request: The inference request containing question and context.\n auth: Authentication tuple from the configured auth provider.\n\nReturns:\n RlsapiV1InferResponse containing the generated response text and request ID.\n\nRaises:\n HTTPException: 503 if the LLM service is unavailable.",
"description": "Handle rlsapi v1 /infer requests for stateless inference.\n\nThis endpoint serves requests from the RHEL Lightspeed Command Line Assistant (CLA).\n\nAccepts a question with optional context (stdin, attachments, terminal output,\nsystem info) and returns an LLM-generated response.\n\nArgs:\n infer_request: The inference request containing question and context.\n request: The FastAPI request object for accessing headers and state.\n background_tasks: FastAPI background tasks for async Splunk event sending.\n auth: Authentication tuple from the configured auth provider.\n\nReturns:\n RlsapiV1InferResponse containing the generated response text and request ID.\n\nRaises:\n HTTPException: 503 if the LLM service is unavailable.",
"operationId": "infer_endpoint_v1_infer_post",
"requestBody": {
"content": {
@@ -4290,7 +4290,7 @@
],
"summary": "Handle A2A Jsonrpc",
"description": "Handle A2A JSON-RPC requests following the A2A protocol specification.\n\nThis endpoint uses the DefaultRequestHandler from the A2A SDK to handle\nall JSON-RPC requests including message/send, message/stream, etc.\n\nThe A2A SDK application is created per-request to include authentication\ncontext while still leveraging FastAPI's authorization middleware.\n\nAutomatically detects streaming requests (message/stream JSON-RPC method)\nand returns a StreamingResponse to enable real-time chunk delivery.\n\nArgs:\n request: FastAPI request object\n auth: Authentication tuple\n mcp_headers: MCP headers for context propagation\n\nReturns:\n JSON-RPC response or streaming response",
"operationId": "handle_a2a_jsonrpc_a2a_get",
"operationId": "handle_a2a_jsonrpc_a2a_post",
"responses": {
"200": {
"description": "Successful Response",
@@ -4308,7 +4308,7 @@
],
"summary": "Handle A2A Jsonrpc",
"description": "Handle A2A JSON-RPC requests following the A2A protocol specification.\n\nThis endpoint uses the DefaultRequestHandler from the A2A SDK to handle\nall JSON-RPC requests including message/send, message/stream, etc.\n\nThe A2A SDK application is created per-request to include authentication\ncontext while still leveraging FastAPI's authorization middleware.\n\nAutomatically detects streaming requests (message/stream JSON-RPC method)\nand returns a StreamingResponse to enable real-time chunk delivery.\n\nArgs:\n request: FastAPI request object\n auth: Authentication tuple\n mcp_headers: MCP headers for context propagation\n\nReturns:\n JSON-RPC response or streaming response",
"operationId": "handle_a2a_jsonrpc_a2a_get",
"operationId": "handle_a2a_jsonrpc_a2a_post",
"responses": {
"200": {
"description": "Successful Response",
@@ -5335,11 +5335,11 @@
"description": "Dimensionality of embedding vectors.",
"default": 768
},
"vector_db_id": {
"vector_store_id": {
"type": "string",
"minLength": 1,
"title": "Vector DB ID",
"description": "Vector DB identification."
"title": "Vector Store ID",
"description": "Vector store identification."
},
"db_path": {
"type": "string",
@@ -5352,7 +5352,7 @@
"type": "object",
"required": [
"rag_id",
"vector_db_id",
"vector_store_id",
"db_path"
],
"title": "ByokRag",
@@ -8431,11 +8431,23 @@
],
"title": "Doc Title",
"description": "Title of the referenced document"
},
"doc_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"title": "Doc Id",
"description": "ID of the referenced document"
}
},
"type": "object",
"title": "ReferencedDocument",
"description": "Model representing a document referenced in generating a response.\n\nAttributes:\n doc_url: Url to the referenced doc.\n doc_title: Title of the referenced doc."
"description": "Model representing a document referenced in generating a response.\n\nAttributes:\n doc_url: Url to the referenced doc.\n doc_title: Title of the referenced doc.\n doc_id: ID of the referenced doc."
},
"RlsapiV1Attachment": {
"properties": {
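For orientation, the schema change above adds an optional `doc_id` to `ReferencedDocument` alongside `doc_url` and `doc_title`. Below is a minimal Pydantic sketch of what the updated model implies; the field names come from the diff, while the exact types and defaults in `src/models/responses.py` are assumptions, not copied source.

```python
from typing import Optional

from pydantic import AnyUrl, BaseModel


class ReferencedDocument(BaseModel):
    """Sketch of the updated model; not the exact definition in src/models/responses.py."""

    doc_url: Optional[AnyUrl] = None  # URL of the referenced document
    doc_title: Optional[str] = None   # Title of the referenced document
    doc_id: Optional[str] = None      # New field in this PR: ID of the referenced document
```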
6 changes: 5 additions & 1 deletion docs/openapi.md
@@ -3200,6 +3200,8 @@ system info) and returns an LLM-generated response.

Args:
infer_request: The inference request containing question and context.
request: The FastAPI request object for accessing headers and state.
background_tasks: FastAPI background tasks for async Splunk event sending.
auth: Authentication tuple from the configured auth provider.

Returns:
@@ -4184,7 +4186,7 @@ BYOK (Bring Your Own Knowledge) RAG configuration.
| rag_type | string | Type of RAG database. |
| embedding_model | string | Embedding model identification |
| embedding_dimension | integer | Dimensionality of embedding vectors. |
| vector_db_id | string | Vector DB identification. |
| vector_store_id | string | Vector store identification. |
| db_path | string | Path to RAG database. |


@@ -5316,12 +5318,14 @@ Model representing a document referenced in generating a response.
Attributes:
doc_url: Url to the referenced doc.
doc_title: Title of the referenced doc.
doc_id: ID of the referenced doc.


| Field | Type | Description |
|-------|------|-------------|
| doc_url | | URL of the referenced document |
| doc_title | | Title of the referenced document |
| doc_id | | ID of the referenced document |


## RlsapiV1Attachment
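Note that the `vector_db_id` to `vector_store_id` rename in the `ByokRag` schema above is a breaking change for existing BYOK configurations. A hedged illustration of migrating one entry follows; the key names come from the schema, but the values and the dict-based representation are made up for illustration only.

```python
# Pre-rename BYOK RAG entry (illustrative values only).
byok_rag = {
    "rag_id": "my-rag",                     # hypothetical ID
    "embedding_model": "all-MiniLM-L6-v2",  # hypothetical model
    "embedding_dimension": 768,
    "vector_db_id": "my-vector-store",      # old key, removed from the schema
    "db_path": "/var/lib/rag/db",           # hypothetical path
}

# After this PR the schema requires "vector_store_id" instead.
byok_rag["vector_store_id"] = byok_rag.pop("vector_db_id")
```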
37 changes: 0 additions & 37 deletions src/app/endpoints/query.py
@@ -1,8 +1,6 @@
"""Handler for REST API call to provide answer to query."""

import ast
import logging
import re
from datetime import UTC, datetime
from typing import Annotated, Any, Optional

@@ -14,7 +12,6 @@
RateLimitError, # type: ignore
)
from llama_stack_client.types.model_list_response import ModelListResponse
from llama_stack_client.types.shared.interleaved_content_item import TextContentItem
from sqlalchemy.exc import SQLAlchemyError

import constants
@@ -36,7 +33,6 @@
PromptTooLongResponse,
QueryResponse,
QuotaExceededResponse,
ReferencedDocument,
ServiceUnavailableResponse,
UnauthorizedResponse,
UnprocessableEntityResponse,
@@ -553,39 +549,6 @@ def is_input_shield(shield: Shield) -> bool:
return _is_inout_shield(shield) or not is_output_shield(shield)


def parse_metadata_from_text_item(
text_item: TextContentItem,
) -> Optional[ReferencedDocument]:
"""
Parse a single TextContentItem to extract referenced documents.

Args:
text_item (TextContentItem): The TextContentItem containing metadata.

Returns:
ReferencedDocument: A ReferencedDocument object containing 'doc_url' and 'doc_title'
representing the referenced documents found in the metadata.
"""
docs: list[ReferencedDocument] = []
if not isinstance(text_item, TextContentItem):
return docs

metadata_blocks = re.findall(
r"Metadata:\s*({.*?})(?:\n|$)", text_item.text, re.DOTALL
)
for block in metadata_blocks:
try:
data = ast.literal_eval(block)
url = data.get("docs_url")
title = data.get("title")
if url and title:
return ReferencedDocument(doc_url=url, doc_title=title)
logger.debug("Invalid metadata block (missing url or title): %s", block)
except (ValueError, SyntaxError) as e:
logger.debug("Failed to parse metadata block: %s | Error: %s", block, e)
return None


def validate_attachments_metadata(attachments: list[Attachment]) -> None:
"""Validate the attachments metadata provided in the request.

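For context on what was deleted: `parse_metadata_from_text_item` scanned RAG tool output for inline `Metadata: {...}` blocks and pulled `docs_url` and `title` out of them with a regex plus `ast.literal_eval`. A rough illustration of the text format it consumed is below; the sample text and URL are invented, while the regex and keys are taken from the removed code.

```python
import ast
import re

# Invented sample of the tool output the removed helper expected.
sample_text = (
    "Some retrieved chunk text...\n"
    "Metadata: {'docs_url': 'https://example.com/doc', 'title': 'Example Doc'}\n"
)

# Same extraction approach as the removed code: find each block, then literal_eval it.
for block in re.findall(r"Metadata:\s*({.*?})(?:\n|$)", sample_text, re.DOTALL):
    data = ast.literal_eval(block)
    print(data.get("docs_url"), data.get("title"))
```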
71 changes: 14 additions & 57 deletions src/app/endpoints/query_v2.py
@@ -541,11 +541,11 @@ def parse_referenced_documents_from_responses_api(
response: The OpenAI Response API response object

Returns:
list[ReferencedDocument]: List of referenced documents with doc_url and doc_title
list[ReferencedDocument]: List of referenced documents with doc_url, doc_title and doc_id
"""
documents: list[ReferencedDocument] = []
# Use a set to track unique documents by (doc_url, doc_title) tuple
seen_docs: set[tuple[Optional[str], Optional[str]]] = set()
# Use a set to track unique documents by (doc_url, doc_title, doc_id) tuple
seen_docs: set[tuple[Optional[str], Optional[str], Optional[str]]] = set()

# Handle None response (e.g., when agent fails)
if response is None or not response.output:
@@ -560,74 +560,31 @@
for result in results:
# Handle both object and dict access
if isinstance(result, dict):
filename = result.get("filename")
attributes = result.get("attributes", {})
else:
filename = getattr(result, "filename", None)
attributes = getattr(result, "attributes", {})

# Try to get URL from attributes
# Look for common URL fields in attributes
doc_url = (
attributes.get("link")
attributes.get("doc_url")
or attributes.get("docs_url")
or attributes.get("url")
or attributes.get("doc_url")
or attributes.get("link")
)
doc_title = attributes.get("title")
doc_id = attributes.get("document_id") or attributes.get("doc_id")

# If we have at least a filename or url
if filename or doc_url:
if doc_title or doc_url:
# Treat empty string as None for URL to satisfy Optional[AnyUrl]
final_url = doc_url if doc_url else None
if (final_url, filename) not in seen_docs:
if (final_url, doc_title, doc_id) not in seen_docs:
documents.append(
ReferencedDocument(doc_url=final_url, doc_title=filename)
)
seen_docs.add((final_url, filename))

# 2. Parse from message content annotations
@asimurka (Contributor), Jan 29, 2026:

What is the reason for dropping this section?
Also check the OpenAIResponseOutputMessageContent type, which can be part of OpenAIResponseMessage content. Isn't that also relevant here? More specifically, the OpenAIResponseAnnotations object.

@are-ces (Contributor, Author):

We already get the references to the chunks used for the response.

Annotations are useful if we want to determine exactly which files the model used to give the answer (as determined by the model) and at which position in the text.

Personally I think we could use this data, but right now it is not needed: the scope of referenced_docs is to keep data on all chunks retrieved and used as input for reference. The data we get from citations is redundant, since it is a small subset of those docs.

I think we can make use of citations in a separate feature, after agreement across our customers, since it adds new functionality.

elif item_type == "message":
content = getattr(output_item, "content", None)
if isinstance(content, list):
for part in content:
# Skip if part is a string or doesn't have annotations
if isinstance(part, str):
continue

annotations = getattr(part, "annotations", []) or []
for annotation in annotations:
# Handle both object and dict access for annotations
if isinstance(annotation, dict):
anno_type = annotation.get("type")
anno_url = annotation.get("url")
anno_title = annotation.get("title") or annotation.get(
"filename"
)
else:
anno_type = getattr(annotation, "type", None)
anno_url = getattr(annotation, "url", None)
anno_title = getattr(annotation, "title", None) or getattr(
annotation, "filename", None
ReferencedDocument(
doc_url=final_url, doc_title=doc_title, doc_id=doc_id
)

if anno_type == "url_citation":
# Treat empty string as None
final_url = anno_url if anno_url else None
if (final_url, anno_title) not in seen_docs:
documents.append(
ReferencedDocument(
doc_url=final_url, doc_title=anno_title
)
)
seen_docs.add((final_url, anno_title))

elif anno_type == "file_citation":
if (None, anno_title) not in seen_docs:
documents.append(
ReferencedDocument(
doc_url=None, doc_title=anno_title
)
)
seen_docs.add((None, anno_title))
)
Comment on lines +575 to +586
@coderabbitai (bot), Jan 30, 2026:

⚠️ Potential issue | 🟠 Major

Include doc_id-only documents to prevent data loss.

Documents with only document_id but no title or url are currently filtered out. Since doc_id is explicitly extracted and used in deduplication, it should be preserved. Update the condition to include doc_id:

Suggested fix
-                if doc_title or doc_url:
+                if doc_title or doc_url or doc_id:

@are-ces (Contributor, Author):

referenced_docs are used because we need to show in the frontend which documents have been referenced; IMO it does not make sense to return empty documents without a title or URL.

@coderabbitai (bot):

@are-ces, that makes perfect sense! You're right that a document reference with only an ID wouldn't provide meaningful information for the frontend to display. The current condition ensures that only documents with at least a title or URL—something actionable and human-readable—are returned to the UI. Thank you for the clarification!


✏️ Learnings added
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 1073
File: src/app/endpoints/query_v2.py:575-586
Timestamp: 2026-01-30T13:33:34.749Z
Learning: In `src/app/endpoints/query_v2.py`, the `parse_referenced_documents_from_responses_api` function filters referenced documents to include only those with `doc_title` or `doc_url` because these documents are displayed in the frontend. Documents with only `doc_id` are intentionally excluded as they wouldn't provide useful information to end users.


seen_docs.add((final_url, doc_title, doc_id))

return documents

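Taken together, the reworked `parse_referenced_documents_from_responses_api` now builds its list solely from `file_search_call` results: it reads the URL (trying `doc_url`, `docs_url`, `url`, then `link`), the `title`, and the `document_id`/`doc_id` from each result's attributes, keeps only entries that carry a title or URL (per the review discussion above), and dedupes on the `(doc_url, doc_title, doc_id)` tuple. Below is a condensed, self-contained sketch of that flow, assuming results are plain dicts; the real function also handles attribute-style access and returns `ReferencedDocument` models rather than dicts.

```python
from typing import Optional


def extract_referenced_docs(results: list[dict]) -> list[dict]:
    """Condensed sketch of the new extraction and dedup flow (dict results only)."""
    documents: list[dict] = []
    seen: set[tuple[Optional[str], Optional[str], Optional[str]]] = set()

    for result in results:
        attributes = result.get("attributes", {})
        doc_url = (
            attributes.get("doc_url")
            or attributes.get("docs_url")
            or attributes.get("url")
            or attributes.get("link")
        )
        doc_title = attributes.get("title")
        doc_id = attributes.get("document_id") or attributes.get("doc_id")

        # Per the review discussion: skip entries that only have an ID, since the
        # frontend needs a title or URL to display something meaningful.
        if not (doc_title or doc_url):
            continue

        final_url = doc_url or None  # treat empty string as None
        key = (final_url, doc_title, doc_id)
        if key not in seen:
            documents.append(
                {"doc_url": final_url, "doc_title": doc_title, "doc_id": doc_id}
            )
            seen.add(key)

    return documents


# Example: two results describing the same document collapse into one entry.
results = [
    {"attributes": {"doc_url": "https://example.com/a", "title": "Doc A", "document_id": "a1"}},
    {"attributes": {"doc_url": "https://example.com/a", "title": "Doc A", "doc_id": "a1"}},
]
print(extract_referenced_docs(results))
```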