Conversation

@dpage dpage commented Dec 17, 2025

This is a series of 4 commits that add:

  • Core LLM integration infrastructure
  • AI-generated reports for performance, security, and design on servers, databases, and schemas where appropriate.
  • An AI Assistant in the Query Tool to help with query generation.
  • An AI Insights panel on the Query Tool's EXPLAIN UI to provide analysis and recommendations from query plans.

Support is included for use with Anthropic and OpenAI in the cloud (for best results), or with Ollama or Docker Model Runner on local infrastructure (including the same machine), with models such as qwen3-coder or gemma.

Summary by CodeRabbit

  • New Features

    • Added AI Reports feature for security, performance, and design analysis across servers, databases, and schemas.
    • Added AI Assistant to SQL Editor for natural language query generation.
    • Added AI-powered analysis of EXPLAIN plans with bottleneck detection and recommendations.
    • Added support for multiple LLM providers: Anthropic, OpenAI, Ollama, and Docker.
    • Added AI preferences configuration for managing providers and models.
  • Documentation

    • Added comprehensive AI tools, preferences, and query tool documentation.


coderabbitai bot commented Dec 17, 2025

Walkthrough

This PR introduces a complete AI/LLM integration framework to pgAdmin, enabling AI-assisted database analysis and query generation. It includes configuration management for multiple LLM providers (Anthropic, OpenAI, Ollama, Docker), a backend report generation pipeline for security/performance/design analysis, frontend UI components for streaming reports and natural language queries, database tool integrations, and comprehensive test coverage.

Changes

  • Configuration & System Setup
    web/config.py, web/pgadmin/submodules.py, web/pgadmin/tools/user_management/PgAdminPermissions.py, web/pgadmin/utils/constants.py, web/pgadmin/browser/static/js/constants.js
    Added LLM configuration variables (provider keys, models, URLs, iterations limit), registered LLMModule as a submodule, introduced the tools_ai permission constant, added the AI preference label, and exposed new browser panel/permission constants.
  • Database Migration
    web/migrations/versions/add_tools_ai_permission_.py
    Added an Alembic migration to inject the tools_ai permission into existing database roles.
  • Dependencies
    web/package.json, web/jest.config.js
    Added the marked package dependency and configured Jest to transform it via imports-loader.
  • LLM Core Infrastructure
    web/pgadmin/llm/models.py, web/pgadmin/llm/client.py, web/pgadmin/llm/utils.py
    Defined standardized data models (Role, Message, ToolCall, LLMResponse, etc.), implemented the abstract LLMClient interface with a factory pattern, and centralized configuration/preference access for all LLM providers.
  • LLM Providers
    web/pgadmin/llm/providers/{anthropic,openai,ollama,docker}.py, web/pgadmin/llm/providers/__init__.py
    Implemented provider-specific clients (AnthropicClient, OpenAIClient, OllamaClient, DockerClient) with message/tool conversion, HTTP handling, and error mapping.
  • LLM Chat & Prompts
    web/pgadmin/llm/chat.py, web/pgadmin/llm/prompts/{__init__,nlq,explain}.py
    Added a chat workflow with tool iteration support, and defined system prompts for NLQ translation and EXPLAIN analysis.
  • Report Generation Pipeline
    web/pgadmin/llm/reports/{models,pipeline,prompts,queries,sections,generator}.py, web/pgadmin/llm/reports/__init__.py
    Implemented multi-stage report generation (planning, gathering, analysis, synthesis) with retry logic, section definitions for three report types, a query registry, and streaming/sync generators.
  • Database Tools
    web/pgadmin/llm/tools/{__init__,database}.py
    Created read-only database introspection tools for LLMs (schema inspection, table info, column metadata) with connection management and version-aware SQL templates.
  • Backend API Endpoints
    web/pgadmin/llm/__init__.py, web/pgadmin/tools/sqleditor/__init__.py
    Registered the LLM blueprint with endpoints for model discovery, status checks, report generation (streaming/sync), and NLQ/EXPLAIN streaming endpoints in the SQL editor.
  • Frontend React Components
    web/pgadmin/llm/static/js/{AIReport,ai_tools}.jsx, web/pgadmin/static/js/Explain/{AIInsights,index}.jsx, web/pgadmin/tools/sqleditor/static/js/components/sections/{NLQChatPanel,Query,ResultSet}.jsx
    Added an AI report viewer with streaming support, security/performance/design report UIs, an NLQ chat panel with SQL preview, an EXPLAIN analysis component, and corresponding event handlers/state management.
  • Preference & Component Helpers
    web/pgadmin/static/js/components/{PreferencesHelper,SelectRefresh,FormComponents}.jsx, web/pgadmin/tools/sqleditor/static/js/components/{QueryToolConstants,QueryToolComponent}.jsx
    Enhanced the preference system to support dynamic model loading and refresh actions, added an AI assistant panel identifier, and wired NLQChatPanel into the query tool layout.
  • Webpack Configuration
    web/webpack.config.js, web/webpack.shim.js
    Added the marked package to imports-loader and registered the ai_tools module alias.
  • Python Tests
    web/pgadmin/llm/tests/{__init__,README,test_llm_status,test_report_endpoints}.py, web/pgadmin/tools/sqleditor/tests/{test_nlq_chat,test_explain_analyze_ai}.py
    Added unit tests for the LLM status endpoint, report generation, NLQ chat, and EXPLAIN analysis with mocked external API calls.
  • JavaScript Tests
    web/regression/javascript/{Explain/AIInsights.spec.js,llm/AIReport.spec.js,sqleditor/NLQChatPanel.spec.js}
    Added comprehensive Jest test suites for AI components covering rendering, streaming, state management, and error handling.
  • Documentation
    docs/en_US/{ai_tools,developer_tools,menu_bar,preferences,query_tool}.rst, web/pgadmin/llm/README.md
    Added user-facing documentation for AI features (configuration, usage, troubleshooting) and a developer guide.
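The provider architecture summarized above (an abstract LLMClient behind a factory, with per-provider subclasses) can be sketched roughly as follows. All class and function names here are illustrative assumptions for the pattern, not the PR's actual API.

```python
from abc import ABC, abstractmethod

class LLMClient(ABC):
    """Abstract client interface; concrete providers implement chat()."""

    @abstractmethod
    def chat(self, messages: list[dict]) -> str:
        """Send a conversation and return the assistant's reply text."""

# Registry mapping provider names to client classes (hypothetical helper).
_REGISTRY: dict[str, type[LLMClient]] = {}

def register(name: str):
    """Class decorator that records a provider class under its name."""
    def wrap(cls):
        _REGISTRY[name] = cls
        return cls
    return wrap

@register('ollama')
class OllamaClient(LLMClient):
    def chat(self, messages):
        # A real implementation would POST to the Ollama HTTP API here;
        # this stub only demonstrates the dispatch shape.
        return f"(ollama) {len(messages)} message(s) received"

def create_client(provider: str) -> LLMClient:
    """Factory: look up the provider name and instantiate its client."""
    try:
        return _REGISTRY[provider]()
    except KeyError:
        raise ValueError(f"Unknown LLM provider: {provider}") from None
```

With this shape, adding a provider is just another decorated subclass, and callers only ever depend on `create_client` and the abstract interface.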

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring extra attention:

  • web/pgadmin/llm/__init__.py — Largest file with extensive endpoint definitions, model fetching logic, and report generation orchestration; complex error handling and provider-specific workflows.
  • web/pgadmin/llm/reports/pipeline.py — Multi-stage report generation with retry mechanisms, LLM call orchestration, and progress event emission; dense logic with multiple feedback loops.
  • web/pgadmin/llm/providers/{anthropic,openai,docker}.py — Provider integrations with message/tool conversion, HTTP error mapping, and response parsing; verify correctness of tool_call handling and error propagation.
  • web/pgadmin/llm/tools/database.py — Database connection lifecycle, read-only enforcement, and schema introspection logic; ensure connection cleanup and SQL injection prevention via parameterization.
  • web/pgadmin/llm/static/js/AIReport.jsx & NLQChatPanel.jsx — Complex React state management, SSE streaming, event handling; verify proper cleanup on unmount and AbortController usage.
  • web/pgadmin/static/js/components/SelectRefresh.jsx — Enhanced to support API-driven refresh with dependent fields; verify SchemaStateContext integration and error fallback behavior.
  • web/pgadmin/tools/sqleditor/__init__.py — New NLQ and EXPLAIN streaming endpoints; verify session validation, transaction state handling, and SSE response formatting.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 76.88%, which is below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit’s high-level summary is enabled.
  • Title check — ✅ Passed: the title 'Add AI functionality to pgAdmin' is a clear, concise summary of the main change, introducing comprehensive LLM-based AI features across the application.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

Note

Due to the large number of review comments, those of Critical and Major severity were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
web/pgadmin/static/js/components/FormComponents.jsx (1)

932-964: Add callbacks to the dependency array or stabilize them with useCallback.

The effect uses onError, onChange, and optionsLoaded callbacks that are not included in the dependency array [optionsReloadBasis, mountId]. Since mountId never changes after mount, the effect only runs when optionsReloadBasis changes. If parent components pass new callback instances on re-renders (common with inline arrow functions like onChange={(val) => { ... }}), the effect will capture stale closures. When the async promise resolves and calls these callbacks, it will invoke the old function references rather than the current ones.

Either add these callbacks to the dependency array or ensure parent components wrap them in useCallback to maintain stable references.

🟡 Minor comments (16)
web/pgadmin/tools/sqleditor/tests/test_explain_analyze_ai.py-71-72 (1)

71-72: Missing super().setUp() call in test class.

The BaseTestGenerator.setUp performs important setup including server connection and mock data processing. The empty setUp override prevents this.

Apply this diff:

 def setUp(self):
-    pass
+    super().setUp()
web/pgadmin/tools/sqleditor/tests/test_explain_analyze_ai.py-99-107 (1)

99-107: Fix line length violations flagged by CI pipeline.

Lines 101 and 106 exceed the 79-character limit required by pycodestyle.

Apply this diff to fix the formatting:

-            mock_check_trans = patch(
-                'pgadmin.tools.sqleditor.check_transaction_status',
-                return_value=(True, None, mock_conn, mock_trans_obj, mock_session)
-            )
+            mock_check_trans = patch(
+                'pgadmin.tools.sqleditor.check_transaction_status',
+                return_value=(
+                    True, None, mock_conn, mock_trans_obj, mock_session
+                )
+            )
         else:
-            mock_check_trans = patch(
-                'pgadmin.tools.sqleditor.check_transaction_status',
-                return_value=(False, 'Transaction ID not found', None, None, None)
-            )
+            mock_check_trans = patch(
+                'pgadmin.tools.sqleditor.check_transaction_status',
+                return_value=(
+                    False, 'Transaction ID not found', None, None, None
+                )
+            )
web/pgadmin/llm/static/js/AIReport.jsx-555-563 (1)

555-563: Missing dependencies in useEffect may cause stale closures.

The effect references generateReport and closeEventSource but doesn't list them in the dependency array. This could lead to stale function references.

Add the missing dependencies:

 useEffect(() => {
   // Generate report on mount
   generateReport();

   // Cleanup on unmount
   return () => {
     closeEventSource();
   };
-}, [sid, did, scid, reportCategory, reportType]);
+}, [sid, did, scid, reportCategory, reportType, generateReport, closeEventSource]);

Both functions are wrapped in useCallback so this should be safe.

web/pgadmin/llm/tools/database.py-260-264 (1)

260-264: Query manipulation could break certain SQL patterns.

The LIMIT injection logic wraps the original query as a subquery, but this approach can fail for queries containing ORDER BY without parentheses, CTEs, or UNION clauses. The 'LIMIT' not in query_upper check is also fragile—it could match LIMIT in a string literal or column name.

Consider a more robust approach or document the limitations:

         # Add LIMIT if not already present and query looks like SELECT
         query_upper = query.strip().upper()
-        if query_upper.startswith('SELECT') and 'LIMIT' not in query_upper:
-            query = f"({query}) AS llm_subquery LIMIT {max_rows + 1}"
-            query = f"SELECT * FROM {query}"
+        # Note: This simple wrapping may not work for all query patterns
+        # (CTEs, UNIONs, etc.). The LLM should ideally include LIMIT itself.
+        if (query_upper.startswith('SELECT') and 
+            'LIMIT' not in query_upper and
+            'UNION' not in query_upper and
+            'WITH' not in query_upper):
+            query = f"SELECT * FROM ({query}) AS llm_subquery LIMIT {max_rows + 1}"
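The guarded wrapping suggested above can be exercised in isolation. This standalone sketch (the function name is an assumption, not code from the PR) shows which query shapes get wrapped and which pass through unchanged:

```python
def wrap_with_limit(query: str, max_rows: int) -> str:
    """Wrap a plain SELECT in a limiting subquery; pass other shapes through.

    Mirrors the review suggestion: skip queries that already contain LIMIT,
    start with WITH, or contain UNION, since naive wrapping can break them.
    Substring checks remain fragile (e.g. a column named "width" contains
    "WITH"), which is why the review recommends documenting the limitation.
    """
    query_upper = query.strip().upper()
    if (query_upper.startswith('SELECT')
            and 'LIMIT' not in query_upper
            and 'UNION' not in query_upper
            and 'WITH' not in query_upper):
        return f"SELECT * FROM ({query}) AS llm_subquery LIMIT {max_rows + 1}"
    return query
```

A CTE or UNION query is returned untouched, so the LLM-issued query itself must include its own LIMIT in those cases.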
web/pgadmin/llm/prompts/explain.py-12-12 (1)

12-12: Fix line length to comply with PEP 8.

Line 12 exceeds the 79-character limit (95 characters). This causes the pycodestyle E501 error flagged by the pipeline.

Apply this diff to fix the line length:

-EXPLAIN_ANALYSIS_PROMPT = """You are a PostgreSQL performance expert integrated into pgAdmin 4.
+EXPLAIN_ANALYSIS_PROMPT = """
+You are a PostgreSQL performance expert integrated into pgAdmin 4.
 Your task is to analyze EXPLAIN plan output and provide actionable optimization recommendations.

Based on pipeline failure logs.

web/migrations/versions/add_tools_ai_permission_.py-40-53 (1)

40-53: Potential issue with empty string permissions.

The code filters for permissions.isnot(None) but then checks if permissions:. A role whose permissions column is an empty string passes the SQL filter yet fails the Python truthiness check, so the update is silently skipped for that role.

Consider this more defensive approach:

     for row in result:
         role_id = row[0]
         permissions = row[1]
-        if permissions:
+        if permissions and permissions.strip():
             perms_list = permissions.split(',')
             if 'tools_ai' not in perms_list:
                 perms_list.append('tools_ai')

Alternatively, you could add a filter to the SQL query:

     result = conn.execute(
         sa.select(role_table.c.id, role_table.c.permissions)
-        .where(role_table.c.permissions.isnot(None))
+        .where(sa.and_(
+            role_table.c.permissions.isnot(None),
+            role_table.c.permissions != ''
+        ))
     )
web/config.py-1018-1023 (1)

1018-1023: Fix line length to pass linting.

The pipeline failed due to line 1019 exceeding 79 characters. Split the comment across multiple lines:

 # Docker Model Runner Configuration
-# Docker Desktop 4.40+ includes a built-in model runner with an OpenAI-compatible
-# API. No API key is required.
+# Docker Desktop 4.40+ includes a built-in model runner with an
+# OpenAI-compatible API. No API key is required.
 # URL for the Docker Model Runner API endpoint. Leave empty to disable.
 # Default value: http://localhost:12434
 DOCKER_API_URL = ''
web/pgadmin/tools/sqleditor/__init__.py-3107-3109 (1)

3107-3109: Remove trailing blank line at end of file.

The pipeline fails due to a blank line at the end of the file (W391).

web/pgadmin/llm/static/js/ai_tools.js-399-403 (1)

399-403: Potential null reference if getDockerHandler returns undefined.

If getDockerHandler is not available or returns undefined, calling handler.focus() will throw an error.

       let handler = pgBrowser.getDockerHandler?.(
         BROWSER_PANELS.AI_REPORT_PREFIX,
         pgBrowser.docker.default_workspace
       );
+      if (!handler) {
+        pgBrowser.report_error(
+          gettext('Report'),
+          gettext('Cannot open report panel.')
+        );
+        return;
+      }
       handler.focus();
web/pgadmin/tools/sqleditor/__init__.py-3033-3044 (1)

3033-3044: Break long line to comply with PEP 8 line length limit.

Line 3044 exceeds 79 characters, causing the CI pipeline to fail.

-Provide your analysis identifying performance bottlenecks and optimization recommendations."""
+Provide your analysis identifying performance bottlenecks and \
+optimization recommendations."""
web/pgadmin/llm/providers/ollama.py-234-236 (1)

234-236: Remove redundant re import inside function.

The re module is already imported at the top of the file (line 13). This duplicate import is flagged by static analysis.

     def _parse_response(self, data: dict) -> LLMResponse:
         """Parse the Ollama API response into an LLMResponse."""
-        import re
-
         message = data.get('message', {})
web/pgadmin/tools/sqleditor/__init__.py-2742-2752 (1)

2742-2752: Fix pycodestyle violations to pass the CI pipeline.

The pipeline is failing due to style issues. Line 2747 requires 2 blank lines before the comment section, and the code block starting at line 2857 exceeds the 79-character limit.

Apply this diff to fix the blank line issue:

 return get_user_macros()
+

 # =============================================================================
 # Natural Language Query (NLQ) to SQL

Also, break the long regex pattern on line 2857 into multiple lines.

Committable suggestion skipped: line range outside the PR's diff.

web/pgadmin/llm/utils.py-225-257 (1)

225-257: Docstring mentions three providers but implementation supports four.

The docstring states Returns: The provider name ('anthropic', 'openai', 'ollama') or None but the code validates against {'anthropic', 'openai', 'ollama', 'docker'}. Update the docstring to include 'docker'.

     Returns:
-        The provider name ('anthropic', 'openai', 'ollama') or None if disabled.
+        The provider name ('anthropic', 'openai', 'ollama', 'docker') or None if disabled.
     """
web/pgadmin/llm/models.py-100-108 (1)

100-108: Use explicit Optional type annotation.

PEP 484 prohibits implicit Optional. The parameter should explicitly declare Optional[list[ToolCall]].

     @classmethod
-    def assistant(cls, content: str,
-                  tool_calls: list[ToolCall] = None) -> 'Message':
+    def assistant(cls, content: str,
+                  tool_calls: Optional[list[ToolCall]] = None) -> 'Message':
         """Create an assistant message."""
web/pgadmin/llm/__init__.py-205-208 (1)

205-208: Fix line too long (pipeline failure).

Line 207 exceeds 79 characters, causing the pipeline to fail.

-            help_str=gettext(
-                'The Ollama model to use. Models are loaded dynamically '
-                'from your Ollama server. You can also type a custom model name.'
-            ),
+            help_str=gettext(
+                'The Ollama model to use. Models are loaded dynamically '
+                'from your Ollama server. You can also type a custom '
+                'model name.'
+            ),
web/pgadmin/llm/__init__.py-1031-1034 (1)

1031-1034: Remove unused manager parameter.

The manager parameter is not used in the function body, and the server version is already available in security_info. Note: This function has no callers in the codebase, so no caller updates are needed.

-def _generate_security_report_llm(client, security_info, manager):
+def _generate_security_report_llm(client, security_info):
     """
     Use the LLM to analyze the security configuration and generate a report.
     """
🧹 Nitpick comments (47)
web/pgadmin/static/js/components/FormComponents.jsx (1)

921-922: Consider a more idiomatic pattern for tracking remounts (optional).

The Math.random() approach works correctly and achieves the goal of reloading options on remount. However, a more conventional React pattern would be to use a ref or callback ref to track mount state, or simply rely on the cleanup function behavior.

Alternative approach:

-  // Force options to reload on component remount (each mount gets a new ID)
-  const [mountId] = useState(() => Math.random());

Then in the useEffect cleanup, you could increment a ref counter if you need to track remounts explicitly, or simply rely on the existing optionsReloadBasis prop for controlled reloading.

That said, the current implementation is functional and the collision risk with Math.random() is negligible in practice.

web/pgadmin/llm/README.md (1)

25-25: Format the URL as code.

The bare URL should be enclosed in backticks for proper Markdown formatting.

Apply this diff:

-- `OLLAMA_API_URL`: URL for Ollama server (e.g., 'http://localhost:11434')
+- `OLLAMA_API_URL`: URL for Ollama server (e.g., `http://localhost:11434`)
web/pgadmin/tools/sqleditor/tests/test_explain_analyze_ai.py (1)

170-172: Empty setUp/tearDown methods should call super() or be removed.

These empty methods override parent behavior without providing value. Either call super() or remove them to inherit the default behavior.

 def tearDown(self):
-    pass
+    super().tearDown()

Or remove them entirely if no custom logic is needed.

Also applies to: 181-183, 198-199

web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx (1)

206-210: Handle clipboard API errors gracefully.

navigator.clipboard.writeText can throw if permissions are denied or in insecure contexts. Consider adding error handling.

-                  onClick={() => navigator.clipboard.writeText(message.sql)}
+                  onClick={() => {
+                    navigator.clipboard.writeText(message.sql).catch(() => {
+                      // Fallback or silent fail - clipboard may be unavailable
+                    });
+                  }}
web/pgadmin/llm/static/js/SecurityReport.jsx (2)

211-215: Unused onClose prop.

The onClose prop is destructured but aliased to _onClose and never used. Either implement or remove it.

 export default function SecurityReport({
   sid, did, scid, reportType = 'server',
   serverName, databaseName, schemaName,
-  onClose: _onClose
+  onClose
 }) {

Or remove from propTypes if not needed.


262-265: Missing dependencies in useEffect.

The useEffect references generateReport, which is defined inside the component but not listed in the dependency array. Because generateReport is not wrapped in useCallback, it is recreated on every render, so the effect can capture a stale closure when props or state change.

Either wrap generateReport in useCallback and add it to dependencies, or use an inline function:

 useEffect(() => {
-  // Generate report on mount
-  generateReport();
+  // Generate report on mount/prop change
+  const doGenerate = () => {
+    // Move generateReport logic here or call it
+  };
+  doGenerate();
 }, [sid, did, scid, reportType]);
web/pgadmin/static/js/components/SelectRefresh.jsx (1)

46-51: Button label is context-specific.

The button title is hardcoded to 'Refresh models' which is specific to the LLM configuration use case. Since SelectRefresh is a generic component, consider making the tooltip configurable via props.

-function ChildContent({ cid, helpid, onRefreshClick, isRefreshing, ...props }) {
+function ChildContent({ cid, helpid, onRefreshClick, isRefreshing, refreshTitle, ...props }) {
   return (
     <StyledBox>
       ...
       <Box className="SelectRefresh-buttonContainer">
         <PgIconButton
           onClick={onRefreshClick}
           icon={<RefreshIcon />}
-          title={gettext('Refresh models')}
+          title={refreshTitle || gettext('Refresh')}
           disabled={isRefreshing}
         />
       </Box>
     </StyledBox>
   );
 }

Then pass refreshTitle from controlProps or as a default.

web/pgadmin/llm/static/js/AIReport.jsx (2)

370-375: Polling for theme changes every second is inefficient.

A 1-second interval polling getComputedStyle is wasteful. Consider using a MutationObserver on the body's class/style attributes, or simply run once on mount if theme changes are rare.

If theme changes at runtime are unlikely, remove the interval:

   useEffect(() => {
     const updateColors = () => {
       // ...
     };

     updateColors();
-
-    // Check periodically in case theme changes
-    const interval = setInterval(updateColors, 1000);
-    return () => clearInterval(interval);
   }, []);

Or use MutationObserver for efficient detection.


329-333: Unused onClose prop (same as SecurityReport).

The onClose prop is destructured but aliased to _onClose and never used.

web/pgadmin/llm/tools/database.py (4)

147-151: Remove unused variable readonly_wrapper.

The readonly_wrapper template string is defined but never used. The actual implementation below uses separate BEGIN TRANSACTION READ ONLY and ROLLBACK statements.

-    # Wrap the query in a read-only transaction
-    # This ensures even if the query tries to modify data, it will fail
-    readonly_wrapper = """
-    BEGIN TRANSACTION READ ONLY;
-    {query}
-    ROLLBACK;
-    """
-
     # For SELECT queries, we need to handle them differently

87-126: Unused manager parameter in _connect_readonly.

The manager parameter is never used within the function body. Either remove it from the signature or document why it's reserved for future use.

-def _connect_readonly(manager, conn, conn_id: str) -> tuple[bool, str]:
+def _connect_readonly(conn, conn_id: str) -> tuple[bool, str]:

If removed, update all call sites accordingly (lines 253, 306, 431, 539).


80-84: Chain exceptions using raise ... from e for better traceability.

When re-raising as a custom exception, preserve the original exception chain for debugging.

     except Exception as e:
         raise DatabaseToolError(
             f"Failed to get connection: {str(e)}",
             code="CONNECTION_ERROR"
-        )
+        ) from e

Apply the same pattern at lines 204-207.


200-207: Silent exception swallowing during rollback cleanup.

The bare except Exception: pass block hides rollback failures. While cleanup should not propagate errors, consider logging for observability.

         try:
             conn.execute_void("ROLLBACK")
-        except Exception:
-            pass
+        except Exception as rollback_err:
+            # Log but don't propagate - we're already handling an error
+            import logging
+            logging.getLogger(__name__).debug(
+                "Rollback failed during error handling: %s", rollback_err
+            )

Similar patterns exist at lines 282-285, 400-403, 507-510, 660-663.

web/pgadmin/llm/providers/__init__.py (1)

16-16: Consider sorting __all__ alphabetically.

Static analysis detected that the __all__ list is not sorted alphabetically. While this is a minor style issue, maintaining alphabetical order improves consistency and maintainability.

Apply this diff to sort the exports:

-__all__ = ['AnthropicClient', 'OpenAIClient', 'OllamaClient']
+__all__ = ['AnthropicClient', 'OllamaClient', 'OpenAIClient']

Based on static analysis hints (Ruff RUF022).

web/pgadmin/llm/prompts/__init__.py (1)

15-15: Consider sorting __all__ alphabetically.

The static analysis tool suggests alphabetical sorting of __all__ entries. While the current order is acceptable, sorting would improve consistency.

Apply this diff:

-__all__ = ['NLQ_SYSTEM_PROMPT', 'EXPLAIN_ANALYSIS_PROMPT']
+__all__ = ['EXPLAIN_ANALYSIS_PROMPT', 'NLQ_SYSTEM_PROMPT']
web/pgadmin/llm/tests/test_llm_status.py (1)

19-43: Consider annotating scenarios with ClassVar.

The static analysis tool suggests using typing.ClassVar for the mutable class attribute scenarios. This would improve type safety.

+from typing import ClassVar
+
 class LLMStatusTestCase(BaseTestGenerator):
     """Test cases for LLM status endpoint"""
 
-    scenarios = [
+    scenarios: ClassVar = [
         ('LLM Status - Disabled', dict(
web/pgadmin/llm/tests/test_report_endpoints.py (2)

154-189: Consider testing actual streaming content.

The streaming test verifies the response type but uses an empty generator. While this validates the SSE format, it doesn't test actual content streaming. Consider adding a test case with mock content:

mock_streaming.return_value = iter(["data: chunk1\n\n", "data: chunk2\n\n"])

Then verify the response contains the expected chunks.


192-233: The simulate_error flag is always True - consider simplifying.

The simulate_error parameter in scenarios is always True, making the conditional on line 223 unnecessary. Either remove the conditional or add a scenario where simulate_error=False to test the happy path within this test class.

-    scenarios = [
-        ('Report with API Error', dict(
-            simulate_error=True
-        )),
-    ]
+    scenarios = [
+        ('Report with API Error', dict()),
+    ]

And simplify the test:

-            if self.simulate_error:
-                mock_generate.side_effect = Exception("API connection failed")
+            mock_generate.side_effect = Exception("API connection failed")
web/pgadmin/preferences/static/js/components/PreferencesHelper.jsx (1)

105-143: Consider extracting the options loader to reduce duplication.

The options loading logic between optionsRefreshUrl (lines 122-143) and optionsUrl (lines 149-170) branches is nearly identical. Consider extracting a helper function:

const createOptionsLoader = (optionsEndpoint, staticOptions) => {
  return () => {
    return new Promise((resolve) => {
      const api = getApiInstance();
      const optionsUrl = url_for(optionsEndpoint);
      api.get(optionsUrl)
        .then((res) => {
          if (res.data?.data?.models) {
            resolve([...res.data.data.models, ...staticOptions]);
          } else {
            resolve(staticOptions);
          }
        })
        .catch(() => {
          resolve(staticOptions);
        });
    });
  };
};

Then use it in both branches:

element.options = createOptionsLoader(optionsEndpoint, staticOptions);
web/pgadmin/llm/reports/queries.py (1)

851-856: Unused context parameter - consider documenting intent.

The context parameter is declared but not used. If it's reserved for future scope filtering, consider prefixing with underscore to signal intentional non-use, or add a TODO comment:

 def execute_query(
     conn,
     query_id: str,
-    context: dict,
+    context: dict,  # Reserved for future scope-based filtering
     params: Optional[list] = None
 ) -> dict[str, Any]:

Or use the underscore prefix:

-    context: dict,
+    _context: dict,
web/pgadmin/llm/tools/__init__.py (1)

22-30: Consider sorting __all__ for consistency.

Static analysis suggests alphabetically sorting the __all__ list for better maintainability.

Apply this diff to sort the list:

 __all__ = [
+    'DATABASE_TOOLS',
+    'DatabaseToolError',
     'execute_readonly_query',
+    'execute_tool',
     'get_database_schema',
     'get_table_columns',
-    'get_table_info',
-    'execute_tool',
-    'DatabaseToolError',
-    'DATABASE_TOOLS'
+    'get_table_info'
 ]
web/pgadmin/llm/reports/__init__.py (1)

25-37: Consider sorting __all__ for consistency.

Static analysis suggests alphabetically sorting the __all__ list for better maintainability.

Apply this diff to sort the list:

 __all__ = [
-    'ReportPipeline',
-    'Section',
-    'SectionResult',
-    'Severity',
+    'DESIGN_SECTIONS',
+    'PERFORMANCE_SECTIONS',
     'SECURITY_SECTIONS',
-    'PERFORMANCE_SECTIONS',
-    'DESIGN_SECTIONS',
-    'get_sections_for_report',
+    'ReportPipeline',
+    'Section',
+    'SectionResult',
+    'Severity',
+    'execute_query',
+    'get_query',
     'get_sections_for_scope',
-    'get_query',
-    'execute_query',
+    'get_sections_for_report'
 ]
web/pgadmin/llm/providers/anthropic.py (2)

114-123: Preserve the exception chain for better debugging.

The generic exception handler should use raise ... from e to maintain the exception traceback for debugging purposes.

Apply this diff:

         except LLMClientError:
             raise
         except Exception as e:
             raise LLMClientError(LLMError(
                 message=f"Request failed: {str(e)}",
                 provider=self.provider_name
-            ))
+            )) from e

207-226: Preserve exception chains in error handlers.

Both exception handlers should use raise ... from e to maintain the exception traceback.

Apply this diff:

             raise LLMClientError(LLMError(
                 message=error_msg,
                 code=str(e.code),
                 provider=self.provider_name,
                 retryable=e.code in (429, 500, 502, 503, 504)
-            ))
+            )) from e
         except urllib.error.URLError as e:
             raise LLMClientError(LLMError(
                 message=f"Connection error: {e.reason}",
                 provider=self.provider_name,
                 retryable=True
-            ))
+            )) from e
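As a standalone illustration of why this matters (plain Python; the LLMClientError name here is a hypothetical stand-in, not pgAdmin's actual class), `raise ... from e` records the root failure on `__cause__`, so tracebacks show the original error alongside the wrapper:

```python
class LLMClientError(Exception):
    """Hypothetical stand-in for the provider wrapper error."""


def make_request():
    try:
        raise ConnectionError("socket closed")
    except ConnectionError as e:
        # 'from e' links the wrapper to the root cause via __cause__
        raise LLMClientError("Request failed: socket closed") from e


def capture_cause():
    try:
        make_request()
    except LLMClientError as err:
        return err.__cause__


# The original ConnectionError is preserved for debugging
assert isinstance(capture_cause(), ConnectionError)
```

Without `from e`, Python still sets `__context__` implicitly, but `from e` marks the chain as deliberate and avoids the ambiguous "During handling of the above exception" wording in tracebacks.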
web/pgadmin/tools/sqleditor/__init__.py (5)

2793-2800: Prefix unused unpacked variable with underscore.

The static analysis correctly identifies that session_obj is unpacked but never used.

-    status, error_msg, conn, trans_obj, session_obj = \
+    status, error_msg, conn, trans_obj, _session_obj = \
         check_transaction_status(trans_id)

2913-2918: Use logging.exception for proper exception logging with traceback.

When logging exceptions in error handlers, use exception() instead of error() to automatically include the stack trace.

         except Exception as e:
-            current_app.logger.error(f'NLQ chat error: {str(e)}')
+            current_app.logger.exception('NLQ chat error: %s', e)
             yield _nlq_sse_event({
                 'type': 'error',
                 'message': str(e)
             })
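A minimal sketch of the difference (standard library only): logger.exception() emits the message plus the active traceback, while logger.error() emits the message alone unless exc_info is passed.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("nlq-demo")
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.DEBUG)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError as e:
    logger.error('error(): %s', e)          # message only, no traceback
    logger.exception('exception(): %s', e)  # message plus traceback

output = stream.getvalue()
# Only the exception() call produced a traceback
assert output.count('Traceback (most recent call last):') == 1
```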

2998-3005: Prefix unused unpacked variables with underscores.

The static analysis correctly identifies that conn, trans_obj, and session_obj are unpacked but never used in explain_analyze_stream.

-    status, error_msg, conn, trans_obj, session_obj = \
+    status, error_msg, _conn, _trans_obj, _session_obj = \
         check_transaction_status(trans_id)

3088-3093: Use logging.exception for proper exception logging with traceback.

Same issue as the NLQ chat error handler.

         except Exception as e:
-            current_app.logger.error(f'Explain analysis error: {str(e)}')
+            current_app.logger.exception('Explain analysis error: %s', e)
             yield _nlq_sse_event({
                 'type': 'error',
                 'message': str(e)
             })

2819-2836: Consider explicitly handling RuntimeError from chat_with_database for better error messaging.

The chat_with_database function raises RuntimeError when the LLM is not configured or if iterations exceed limits. While the broad Exception catch handles these cases, explicit RuntimeError handling would allow you to provide more specific error messages to users distinguishing configuration issues from iteration limit scenarios.
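A sketch of what that could look like (handle_chat is a hypothetical wrapper; chat_fn stands in for chat_with_database):

```python
def handle_chat(chat_fn):
    """Map RuntimeError (configuration / iteration-limit conditions)
    to an actionable message, distinct from unexpected failures."""
    try:
        result = chat_fn()
    except RuntimeError as e:
        # Known, user-fixable conditions raised by the chat layer
        return {'type': 'error', 'message': f'AI assistant unavailable: {e}'}
    except Exception as e:
        # Anything else: surface as a generic error
        return {'type': 'error', 'message': str(e)}
    return {'type': 'result', 'data': result}


def not_configured():
    raise RuntimeError('LLM is not configured')


assert handle_chat(lambda: 'SELECT 1;')['type'] == 'result'
assert 'unavailable' in handle_chat(not_configured)['message']
```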

web/pgadmin/llm/client.py (1)

87-97: Remove unused variable and consider moving return to else block.

The response variable is assigned but never used. Additionally, the return statement could be cleaner.

     def validate_connection(self) -> tuple[bool, Optional[str]]:
         """
         Validate the connection to the LLM provider.

         Returns:
             Tuple of (success, error_message).
             If success is True, error_message is None.
         """
         try:
             # Try a minimal request to validate the connection
-            response = self.chat(
+            self.chat(
                 messages=[Message.user("Hello")],
                 max_tokens=10
             )
-            return True, None
         except LLMError as e:
             return False, str(e)
         except Exception as e:
-            return False, f"Connection failed: {str(e)}"
+            return False, f"Connection failed: {e!s}"
+        else:
+            return True, None
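A compact sketch of the try/except/else shape suggested above (ping stands in for the minimal chat request; ValueError stands in for LLMError):

```python
def validate(ping):
    """Return (success, error_message); the success path lives in
    'else' so only the probe's own exceptions are caught."""
    try:
        ping()
    except ValueError as e:
        return False, str(e)
    except Exception as e:
        return False, f"Connection failed: {e!s}"
    else:
        return True, None


def bad_ping():
    raise ValueError('invalid API key')


assert validate(lambda: None) == (True, None)
assert validate(bad_ping) == (False, 'invalid API key')
```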
web/pgadmin/llm/providers/openai.py (3)

119-128: Preserve exception chain with raise ... from.

When re-raising exceptions, use raise ... from to preserve the exception chain for better debugging.

         try:
             response_data = self._make_request(payload)
             return self._parse_response(response_data)
         except LLMClientError:
             raise
         except Exception as e:
             raise LLMClientError(LLMError(
-                message=f"Request failed: {str(e)}",
+                message=f"Request failed: {e!s}",
                 provider=self.provider_name
-            ))
+            )) from e

213-240: Preserve exception chains in HTTP error handling.

All the exception handlers should use raise ... from to maintain the exception chain.

         except urllib.error.HTTPError as e:
             # ... error handling code ...
             raise LLMClientError(LLMError(
                 message=error_msg,
                 code=str(e.code),
                 provider=self.provider_name,
                 retryable=e.code in (429, 500, 502, 503, 504)
-            ))
+            )) from e
         except urllib.error.URLError as e:
             raise LLMClientError(LLMError(
                 message=f"Connection error: {e.reason}",
                 provider=self.provider_name,
                 retryable=True
-            ))
+            )) from e
         except socket.timeout:
             raise LLMClientError(LLMError(
                 message="Request timed out. The request may be too large "
                         "or the server is slow to respond.",
                 code='timeout',
                 provider=self.provider_name,
                 retryable=True
-            ))
+            )) from None

78-78: Consider documenting that **kwargs is reserved for future use.

The kwargs parameter is currently unused but may be intentionally reserved for provider-specific parameters. Consider adding a note in the docstring.

web/pgadmin/llm/static/js/ai_tools.js (1)

166-180: LLM status check uses fire-and-forget pattern without error feedback.

The checkLLMStatus function catches errors silently without logging. While the flags are correctly set to false, consider logging the error for debugging purposes.

         .catch(() => {
+          console.warn('Failed to check LLM status');
           this.llmEnabled = false;
           this.llmSystemEnabled = false;
           this.llmStatusChecked = true;
         });
web/pgadmin/llm/providers/ollama.py (2)

127-136: Preserve exception chain with raise ... from.

Same issue as the OpenAI client - use raise ... from to preserve the exception chain.

         try:
             response_data = self._make_request(payload)
             return self._parse_response(response_data)
         except LLMClientError:
             raise
         except Exception as e:
             raise LLMClientError(LLMError(
-                message=f"Request failed: {str(e)}",
+                message=f"Request failed: {e!s}",
                 provider=self.provider_name
-            ))
+            )) from e

221-232: Preserve exception chains in HTTP error handling.

Same pattern as OpenAI - add from e or from None to preserve exception chains.

             raise LLMClientError(LLMError(
                 message=error_msg,
                 code=str(e.code),
                 provider=self.provider_name,
                 retryable=e.code in (500, 502, 503, 504)
-            ))
+            )) from e
         except urllib.error.URLError as e:
             raise LLMClientError(LLMError(
                 message=f"Cannot connect to Ollama: {e.reason}",
                 provider=self.provider_name,
                 retryable=True
-            ))
+            )) from e
web/pgadmin/llm/utils.py (1)

64-74: Consider logging exceptions for debugging purposes.

The broad except Exception: pass pattern silently swallows all errors. While returning None is appropriate for missing preferences, logging would help diagnose configuration issues.

+import logging
+
+logger = logging.getLogger(__name__)
+
 def _get_preference_value(name):
     ...
     try:
         pref_module = Preferences.module('ai')
         if pref_module:
             pref = pref_module.preference(name)
             if pref:
                 value = pref.get()
                 if value and str(value).strip():
                     return str(value).strip()
-    except Exception:
-        pass
+    except Exception as e:
+        logger.debug("Failed to read preference '%s': %s", name, e)
     return None
web/pgadmin/llm/providers/docker.py (3)

70-72: Availability check may be too permissive.

is_available() only checks if _api_url is set, but doesn't verify the model runner is actually reachable. Consider whether a connectivity check would be more reliable, or document that this only checks configuration.


125-131: Use exception chaining for better debugging.

When re-raising as LLMClientError, preserve the original exception context using raise ... from e.

         except LLMClientError:
             raise
         except Exception as e:
-            raise LLMClientError(LLMError(
+            raise LLMClientError(LLMError(
                 message=f"Request failed: {str(e)}",
                 provider=self.provider_name
-            ))
+            )) from e

197-247: Add exception chaining and consider URL scheme validation.

1. Exception chaining (from e / from None) would preserve debug context.
2. The URL is user-configurable; validating that it uses http:// or https:// would prevent unexpected behavior from file:// or other schemes.
+from urllib.parse import urlparse
+
 def _make_request(self, payload: dict) -> dict:
     """Make an HTTP request to the Docker Model Runner API."""
+    # Validate URL scheme
+    parsed = urlparse(self._api_url)
+    if parsed.scheme not in ('http', 'https'):
+        raise LLMClientError(LLMError(
+            message=f"Invalid URL scheme: {parsed.scheme}. Only http/https supported.",
+            provider=self.provider_name,
+            retryable=False
+        ))
+
     headers = {
         'Content-Type': 'application/json'
     }
     ...
         except urllib.error.HTTPError as e:
             ...
             raise LLMClientError(LLMError(
                 ...
-            ))
+            )) from None
         except urllib.error.URLError as e:
             raise LLMClientError(LLMError(
                 ...
-            ))
+            )) from e
         except socket.timeout:
             raise LLMClientError(LLMError(
                 ...
-            ))
+            )) from None
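The scheme guard can be verified in isolation (check_scheme is a hypothetical helper name, and the endpoint paths below are illustrative):

```python
from urllib.parse import urlparse


def check_scheme(api_url: str) -> bool:
    """Accept only http/https endpoints for the model runner URL."""
    return urlparse(api_url).scheme in ('http', 'https')


assert check_scheme('http://localhost:12434/engines/v1')
assert check_scheme('https://models.internal/v1')
assert not check_scheme('file:///etc/passwd')
assert not check_scheme('ftp://host/models')
```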
web/pgadmin/static/js/Explain/AIInsights.jsx (5)

54-60: Minor inconsistency: hardcoded pixel values vs theme spacing.

ContentArea and LoadingContainer use hardcoded pixel values (16px) while other styled components use theme.spacing(). Consider using theme.spacing(2) for consistency, though this is optional.

Also applies to: 125-132


366-381: Text color detection may not update on theme change.

The useEffect that extracts text colors from document.body runs only on mount (empty dependency array). If the user changes the theme dynamically, the colors won't update until the component remounts.

Consider adding a MutationObserver or listening to theme change events if dynamic theme switching is supported.


400-401: Consider awaiting fetchLlmInfo() before analysis.

fetchLlmInfo() is called without await, meaning the LLM info might not be updated before the analysis starts. If displaying accurate provider/model info during loading is important, consider awaiting it.

     // Fetch latest LLM provider/model info before running analysis
-    fetchLlmInfo();
+    await fetchLlmInfo();

557-559: Missing error handling for clipboard API.

navigator.clipboard.writeText() can fail (e.g., if the page isn't focused or lacks clipboard permissions). Consider adding error handling.

 const handleCopySQL = (sqlText) => {
-  navigator.clipboard.writeText(sqlText);
+  navigator.clipboard.writeText(sqlText).catch((err) => {
+    console.warn('Failed to copy to clipboard:', err);
+  });
 };

997-1003: Consider using stable keys instead of array index.

Using array index as React keys (key={idx}) can cause issues if items are reordered or removed. If bottlenecks/recommendations have unique identifiers from the API, use those instead.

Also applies to: 1017-1025

web/pgadmin/llm/__init__.py (2)

604-607: Consider exception chaining for better debugging.

Using raise ... from e preserves the original exception context, which helps with debugging.

     except urllib.error.HTTPError as e:
         if e.code == 401:
-            raise Exception('Invalid API key')
-        raise Exception(f'API error: {e.code}')
+            raise Exception('Invalid API key') from e
+        raise Exception(f'API error: {e.code}') from e

266-300: Consider DRY refactor for exposed URL endpoints.

The get_exposed_url_endpoints method has a long list that mirrors the route definitions. If routes are added/removed, this list must be manually kept in sync. Consider generating this list programmatically or using a decorator pattern.

Comment on lines +25 to +39
# Default system prompt for database assistant
DEFAULT_SYSTEM_PROMPT = """You are a PostgreSQL database assistant integrated into pgAdmin 4.
You have access to tools that allow you to query the database and inspect its schema.

When helping users:
1. First understand the database structure using get_database_schema or get_table_info
2. Write efficient SQL queries to answer questions about the data
3. Explain your findings clearly and concisely
4. If a query might return many rows, consider using LIMIT or aggregations

Important:
- All queries run in READ ONLY mode - you cannot modify data
- Results are limited to 1000 rows
- Always validate your understanding of the schema before writing complex queries
"""

⚠️ Potential issue | 🔴 Critical

Fix line length to pass pipeline checks.

Line 26 exceeds the 79-character limit (93 characters).

Apply this diff to fix the line length:

 # Default system prompt for database assistant
-DEFAULT_SYSTEM_PROMPT = """You are a PostgreSQL database assistant integrated into pgAdmin 4.
+DEFAULT_SYSTEM_PROMPT = """You are a PostgreSQL database assistant \
+integrated into pgAdmin 4.
 You have access to tools that allow you to query the database and inspect its schema.
🧰 Tools
🪛 GitHub Actions: Check Python style

[error] 26-26: pycodestyle: E501 line too long (93 > 79 characters)

🤖 Prompt for AI Agents
In web/pgadmin/llm/chat.py around lines 25 to 39, the DEFAULT_SYSTEM_PROMPT
triple-quoted string contains a line (line 26) that exceeds the 79-character
limit; break that long sentence into shorter lines so each source line is <=79
chars (e.g., insert a newline in the prompt text or split the sentence across
multiple lines within the triple-quoted string) while preserving the exact
prompt content and spacing/line breaks as intended.

Comment on lines +50 to +63
        required_ext = query_def.get('requires_extension')
        if required_ext:
            check_sql = f"""
                SELECT EXISTS (
                    SELECT 1 FROM pg_extension WHERE extname = '{required_ext}'
                ) as available
            """
            status, result = conn.execute_dict(check_sql)
            if not (status and result and
                    result.get('rows', [{}])[0].get('available', False)):
                return {
                    'note': f"Extension '{required_ext}' not installed",
                    'rows': []
                }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the current code and QUERIES registry
cd /tmp/test_repo 2>/dev/null || cd . || true

# First, look for the file
find . -name "generator.py" -path "*llm/reports*" 2>/dev/null | head -5

Repository: pgadmin-org/pgadmin4

Length of output: 102


🏁 Script executed:

#!/bin/bash
# Read the generator.py file, focusing on lines 40-70 to see the vulnerable code
sed -n '40,70p' ./web/pgadmin/llm/reports/generator.py | cat -n

Repository: pgadmin-org/pgadmin4

Length of output: 1418


🏁 Script executed:

#!/bin/bash
# Search for QUERIES registry and requires_extension definitions
grep -n "requires_extension\|QUERIES\s*=" ./web/pgadmin/llm/reports/generator.py | head -20

Repository: pgadmin-org/pgadmin4

Length of output: 125


🏁 Script executed:

#!/bin/bash
# Check the complete structure around the vulnerable code
head -100 ./web/pgadmin/llm/reports/generator.py | tail -60 | cat -n

Repository: pgadmin-org/pgadmin4

Length of output: 2416


🏁 Script executed:

#!/bin/bash
# Find QUERIES definition and requires_extension usages
rg -n "^QUERIES\s*=" ./web/pgadmin/llm/reports/generator.py
rg -n "requires_extension" ./web/pgadmin/llm/reports/

Repository: pgadmin-org/pgadmin4

Length of output: 446


🏁 Script executed:

#!/bin/bash
# Look at the beginning of generator.py to see QUERIES definition
head -50 ./web/pgadmin/llm/reports/generator.py

Repository: pgadmin-org/pgadmin4

Length of output: 1648


Fix SQL injection vulnerability by using parameterized queries.

The required_ext value is directly interpolated into SQL via f-string without parameterization. Although currently sourced from hardcoded entries in the QUERIES registry, this pattern violates psycopg2 best practices and is fragile. The same file already demonstrates the correct approach for similar scenarios (lines 28-29 with schema_id).

         if required_ext:
-            check_sql = f"""
-                SELECT EXISTS (
-                    SELECT 1 FROM pg_extension WHERE extname = '{required_ext}'
-                ) as available
-            """
-            status, result = conn.execute_dict(check_sql)
+            check_sql = """
+                SELECT EXISTS (
+                    SELECT 1 FROM pg_extension WHERE extname = %s
+                ) as available
+            """
+            status, result = conn.execute_dict(check_sql, [required_ext])
🧰 Tools
🪛 Ruff (0.14.8)

52-56: Possible SQL injection vector through string-based query construction

(S608)

🤖 Prompt for AI Agents
In web/pgadmin/llm/reports/generator.py around lines 50 to 63, the SQL query
interpolates required_ext directly into the SQL string creating an SQL injection
risk; replace the f-string SQL with a parameterized query (e.g. use a
placeholder like %s) and pass required_ext as a parameter to conn.execute_dict
so the DB driver safely binds the value, keeping the subsequent
status/result/rows checks unchanged.
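The value of binding can be shown with a self-contained sqlite3 sketch (sqlite uses ? placeholders where psycopg2 uses %s; the table name mirrors pg_extension for illustration only):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE pg_extension (extname TEXT)")
conn.execute("INSERT INTO pg_extension VALUES ('pg_stat_statements')")


def extension_available(name: str) -> bool:
    # The value is bound by the driver, never spliced into the SQL text
    row = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM pg_extension WHERE extname = ?)",
        (name,)
    ).fetchone()
    return bool(row[0])


assert extension_available('pg_stat_statements')
# A hostile value stays data; f-string interpolation would have made
# the predicate always true
assert not extension_available("x' OR '1'='1")
```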

Comment on lines +164 to +176
        # Stage 4: Synthesis
        yield {'type': 'stage', 'stage': 'synthesizing',
               'message': 'Creating final report...'}

        for retry_event in self._synthesize_with_retry(
            section_results, context
        ):
            if retry_event.get('type') == 'retry':
                yield retry_event
            elif retry_event.get('type') == 'result':
                final_report = retry_event['result']

        yield {'type': 'complete', 'report': final_report}

⚠️ Potential issue | 🟠 Major

Potential UnboundLocalError if synthesis yields no result events.

If _synthesize_with_retry yields only retry events and no result event (e.g., all retries exhausted without reaching the result yield), final_report will be unbound when accessed on line 176.

         # Stage 4: Synthesis
         yield {'type': 'stage', 'stage': 'synthesizing',
                'message': 'Creating final report...'}

+        final_report = ''
         for retry_event in self._synthesize_with_retry(
             section_results, context
         ):
             if retry_event.get('type') == 'retry':
                 yield retry_event
             elif retry_event.get('type') == 'result':
                 final_report = retry_event['result']

         yield {'type': 'complete', 'report': final_report}
🤖 Prompt for AI Agents
In web/pgadmin/llm/reports/pipeline.py around lines 164 to 176, the loop over
self._synthesize_with_retry may never set final_report if only 'retry' events
are yielded, causing an UnboundLocalError when yielding the 'complete' event;
initialize final_report before the loop (e.g., final_report = None) and after
the loop check whether it was set—if None, either raise a clear exception or
yield a failure/complete event with an error message or empty/default report so
the caller gets a deterministic outcome instead of an unbound variable.
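The failure mode and the fix are easy to reproduce with a toy generator (synthesize stands in for the pipeline stage; events mimics the output of _synthesize_with_retry):

```python
def synthesize(events):
    final_report = ''  # initialise so retry-only streams stay defined
    for event in events:
        if event.get('type') == 'retry':
            yield event  # re-surface retries to the caller
        elif event.get('type') == 'result':
            final_report = event['result']
    yield {'type': 'complete', 'report': final_report}


# Retry-only stream: without the initialiser the final yield would
# raise UnboundLocalError
out = list(synthesize([{'type': 'retry'}, {'type': 'retry'}]))
assert out[-1] == {'type': 'complete', 'report': ''}

out = list(synthesize([{'type': 'result', 'result': 'All good'}]))
assert out == [{'type': 'complete', 'report': 'All good'}]
```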

Comment on lines +877 to +892
# Check if query requires an extension
required_ext = query_def.get('requires_extension')
if required_ext:
# Check if extension is installed
check_sql = f"""
SELECT EXISTS (
SELECT 1 FROM pg_extension WHERE extname = '{required_ext}'
) as available
"""
status, result = conn.execute_dict(check_sql)
if not (status and result and
result.get('rows', [{}])[0].get('available', False)):
return {
'error': f"Extension '{required_ext}' not installed",
'rows': []
}

⚠️ Potential issue | 🟠 Major

SQL injection vector in extension check - use parameterized query.

Although required_ext comes from the internal QUERIES registry, using string formatting in SQL (line 883) is a risky pattern that could become a vulnerability if the data source changes. Use a parameterized query instead:

     if required_ext:
         # Check if extension is installed
-        check_sql = f"""
-            SELECT EXISTS (
-                SELECT 1 FROM pg_extension WHERE extname = '{required_ext}'
-            ) as available
-        """
-        status, result = conn.execute_dict(check_sql)
+        check_sql = """
+            SELECT EXISTS (
+                SELECT 1 FROM pg_extension WHERE extname = %s
+            ) as available
+        """
+        status, result = conn.execute_dict(check_sql, [required_ext])
🧰 Tools
🪛 Ruff (0.14.8)

881-885: Possible SQL injection vector through string-based query construction

(S608)

🤖 Prompt for AI Agents
In web/pgadmin/llm/reports/queries.py around lines 877-892 the code builds
check_sql using f-string interpolation of required_ext which is a SQL injection
risk; replace the f-string with a parameterized query and pass required_ext as a
bound parameter to conn.execute_dict (use the parameter placeholder style
supported by conn.execute_dict in this codebase), execute the parameterized
query, and keep the same result check and return value if the extension is not
available.

Comment on lines +733 to +748
        {report && !loading && (
          <Box className="AIReport-paper">
            <Paper
              elevation={0}
              className="AIReport-markdown"
              sx={(theme) => ({
                color: `${theme.palette.text.primary} !important`,
                '& *': {
                  color: 'inherit !important'
                }
              })}
            >
              <div dangerouslySetInnerHTML={{ __html: reportHtml }} />
            </Paper>
          </Box>
        )}

⚠️ Potential issue | 🟠 Major

XSS risk with dangerouslySetInnerHTML - same as SecurityReport.jsx.

This component has the same pattern of rendering unsanitized Markdown HTML. Apply DOMPurify.sanitize() as recommended for SecurityReport.jsx.

+import DOMPurify from 'dompurify';

-const reportHtml = report ? marked.parse(getReportHeader() + report) : '';
+const reportHtml = report ? DOMPurify.sanitize(marked.parse(getReportHeader() + report)) : '';

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Biome (2.1.2)

[error] 745-745: Avoid passing content using the dangerouslySetInnerHTML prop.

Setting content using code can expose users to cross-site scripting (XSS) attacks

(lint/security/noDangerouslySetInnerHtml)

🤖 Prompt for AI Agents
In web/pgadmin/llm/static/js/AIReport.jsx around lines 733-748, the component
uses dangerouslySetInnerHTML with unsanitized reportHtml, creating an XSS risk;
import DOMPurify (e.g., import DOMPurify from 'dompurify'), sanitize the HTML
before rendering (const safeHtml = DOMPurify.sanitize(reportHtml || '')), and
use that sanitized string in dangerouslySetInnerHTML; ensure import is added at
top and handle null/undefined reportHtml by defaulting to an empty string.

Comment on lines +25 to +29
// Configure marked for security and rendering
marked.setOptions({
gfm: true, // GitHub Flavored Markdown
breaks: true, // Convert \n to <br>
});

⚠️ Potential issue | 🟠 Major

Consider sanitizing Markdown HTML output to prevent XSS.

The marked library documentation explicitly recommends using DOMPurify.sanitize(marked.parse(...)) for safe HTML output. While the LLM-generated content comes from a trusted backend, the Markdown could contain malicious payloads if the LLM is manipulated or if the content flows through untrusted channels.

Add DOMPurify to sanitize the HTML before rendering:

+import DOMPurify from 'dompurify';
 import { marked } from 'marked';

 // ... later in the component
-const reportHtml = report ? marked.parse(getReportHeader() + report) : '';
+const reportHtml = report ? DOMPurify.sanitize(marked.parse(getReportHeader() + report)) : '';

This also applies to AIReport.jsx (line 620, 745) which has the same pattern.

Also applies to: 319-319, 365-365

🤖 Prompt for AI Agents
In web/pgadmin/llm/static/js/SecurityReport.jsx around lines 25-29, the Markdown
HTML produced by marked.parse is not being sanitized which risks XSS; import
DOMPurify, run DOMPurify.sanitize(marked.parse(...)) and pass the sanitized HTML
to any dangerouslySetInnerHTML (or equivalent) renderer instead of raw marked
output; apply the same change in this file at lines 319 and 365 and in
web/pgadmin/llm/static/js/AIReport.jsx at lines 620 and 745 so all
marked-generated HTML is sanitized before rendering.

Comment on lines +494 to +510
try {
const response = await fetch(
url_for('sqleditor.nlq_chat_stream', {
trans_id: queryToolCtx.params.trans_id,
}),
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
message: userMessage,
conversation_id: conversationId,
}),
signal: controller.signal,
}
);

⚠️ Potential issue | 🟠 Major

Missing CSRF token in fetch request.

The fetch request to sqleditor.nlq_chat_stream does not include the CSRF token header. Looking at getApiInstance in the relevant code snippets, axios requests automatically include pgAdmin.csrf_token. However, this raw fetch call bypasses that.

Apply this diff to include the CSRF token:

       const response = await fetch(
         url_for('sqleditor.nlq_chat_stream', {
           trans_id: queryToolCtx.params.trans_id,
         }),
         {
           method: 'POST',
           headers: {
             'Content-Type': 'application/json',
+            [window.pgAdmin.csrf_token_header]: window.pgAdmin.csrf_token,
           },
           body: JSON.stringify({
             message: userMessage,
             conversation_id: conversationId,
           }),
           signal: controller.signal,
         }
       );
🤖 Prompt for AI Agents
In web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx
around lines 494-510, the fetch POST to sqleditor.nlq_chat_stream is missing the
CSRF token header; add the token from the global pgAdmin object to the request
headers (i.e. set the header named by window.pgAdmin.csrf_token_header to
window.pgAdmin.csrf_token) so the raw fetch mirrors the axios behaviour and the
endpoint's CSRF protection passes.

Comment on lines +40 to +46
('NLQ Chat - Success', dict(
llm_enabled=True,
valid_transaction=True,
message='Find all users',
expected_error=False,
mock_response='{"sql": "SELECT * FROM users;", "explanation": "Gets all users"}'
)),

⚠️ Potential issue | 🔴 Critical

Fix line length to pass pipeline checks.

Line 45 exceeds the 79-character limit (92 characters).

Apply this diff to fix the line length:

         ('NLQ Chat - Success', dict(
             llm_enabled=True,
             valid_transaction=True,
             message='Find all users',
             expected_error=False,
-            mock_response='{"sql": "SELECT * FROM users;", "explanation": "Gets all users"}'
+            mock_response=(
+                '{"sql": "SELECT * FROM users;", '
+                '"explanation": "Gets all users"}'
+            )
         )),
🧰 Tools
🪛 GitHub Actions: Check Python style

[error] 45-45: pycodestyle: E501 line too long (92 > 79 characters)

🤖 Prompt for AI Agents
In web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py around lines 40-46, line
45 exceeds the 79-character limit; break the long dict entry so no single line
is longer than 79 chars — move the mock_response key/value onto its own indented
line and split the JSON string into two concatenated string literals or use
implicit parentheses so the line length is under 79, keeping commas and
surrounding dict structure intact.
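The fix relies on Python's implicit concatenation of adjacent string literals inside parentheses. A minimal sketch, using the fixture value from the test case above:

```python
import json

# Adjacent string literals inside parentheses are concatenated at
# compile time, so a long fixture string can be wrapped to stay under
# 79 columns without changing its value.
mock_response = (
    '{"sql": "SELECT * FROM users;", '
    '"explanation": "Gets all users"}'
)

# The wrapped literal is still a single string and parses as one
# JSON object.
parsed = json.loads(mock_response)
```

Because the join happens at compile time, there is no runtime concatenation cost and no trailing `+` operators to maintain.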

Comment on lines +77 to +80
mock_check_trans = patch(
'pgadmin.tools.sqleditor.check_transaction_status',
return_value=(True, None, mock_conn, mock_trans_obj, mock_session)
)

⚠️ Potential issue | 🔴 Critical

Fix line length to pass pipeline checks.

Line 79 exceeds the 79-character limit (82 characters).

Apply this diff to fix the line length:

             mock_check_trans = patch(
                 'pgadmin.tools.sqleditor.check_transaction_status',
-                return_value=(True, None, mock_conn, mock_trans_obj, mock_session)
+                return_value=(
+                    True, None, mock_conn, mock_trans_obj, mock_session
+                )
             )
🧰 Tools
🪛 GitHub Actions: Check Python style

[error] 79-79: pycodestyle: E501 line too long (82 > 79 characters)

🤖 Prompt for AI Agents
In web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py around lines 77 to 80, the
patched call to pgadmin.tools.sqleditor.check_transaction_status currently has a
line exceeding the 79-character limit; break the long line by splitting the
return_value tuple across multiple lines (or assign the tuple to a short
variable above and reference it) so each line stays <=79 chars, keeping the same
values (True, None, mock_conn, mock_trans_obj, mock_session) and preserving
indentation and patch context.
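The same wrapping applies to the patch call itself. A minimal sketch with unittest.mock, where the SimpleNamespace target is hypothetical and stands in for pgadmin.tools.sqleditor:

```python
import types
from unittest.mock import MagicMock, patch

# Stand-ins for the connection, transaction object, and session that
# the real test mocks out.
mock_conn = MagicMock()
mock_trans_obj = MagicMock()
mock_session = MagicMock()

# Hypothetical module-like target; in the real test this would be the
# pgadmin.tools.sqleditor module.
sqleditor = types.SimpleNamespace(
    check_transaction_status=lambda tid: (False, 'err', None, None, None)
)

# Wrapping the return_value tuple across lines keeps every line of the
# patch call under the 79-character limit.
with patch.object(
    sqleditor, 'check_transaction_status',
    return_value=(
        True, None, mock_conn, mock_trans_obj, mock_session
    ),
):
    result = sqleditor.check_transaction_status(1)
```

Inside the with block the patched attribute returns the five-element tuple; on exit, patch.object restores the original callable.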
