Conversation
Signed-off-by: fortishield <161459699+FortiShield@users.noreply.github.com>
**Walkthrough**

This PR comprehensively updates the project infrastructure, including environment variable renaming across workflows and build scripts, a complete Makefile overhaul with new build targets and package management, the addition of a Dockerfile for a VNC/graphical environment, a license update, extensive documentation rewrites and domain/link updates, and numerous SQL test scripts for multi-database scenarios.
**Estimated code review effort:** 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 **Pre-merge checks** | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
**PR Compliance Guide 🔍** Below is a summary of compliance checks for this PR:

Compliance status legend: 🟢 Fully Compliant · 🟡 Partially Compliant · 🔴 Not Compliant · ⚪ Requires Further Human Verification · 🏷️ Compliance label

**PR Code Suggestions ✨** Explore these optional code suggestions:
Actionable comments posted: 11
**Note:** Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
**Caution:** Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docker/examples/dashboard/test_case_sqlite_data.py (1)
118-120: **Fix mismatched and non-numeric `user_id` assignment.**

`users` entries don't include a numeric id, so `user_id = random.choice(users)[0]` assigns the name string, and `user_name` is picked from a different random user, breaking referential integrity. Pick a single user tuple and map fields consistently (e.g., name/email), or store IDs from inserts.

💡 Proposed fix (consistent user selection)

```diff
- user_id = random.choice(users)[0]
- user_name = random.choice(users)[1]
+ user = random.choice(users)
+ user_id = user[0]  # If you intend to store the name as an ID, rename the column or adjust the schema
+ user_name = user[1]
```

If `user_id` is meant to reference the autoincrement `user.id`, capture those IDs at insert time (e.g., `cursor.lastrowid`) and store them in a separate list instead of using the name tuple.
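A minimal sketch of that `lastrowid` approach, using Python's built-in `sqlite3` (the `user` table and its columns here are illustrative, not the PR's actual schema):

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE user (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, email TEXT)"
)

users = [("John", "john@example.com"), ("Mary", "mary@example.com")]
user_ids = []
for name, email in users:
    cursor.execute("INSERT INTO user (name, email) VALUES (?, ?)", (name, email))
    user_ids.append(cursor.lastrowid)  # capture the autoincrement id at insert time

# Later rows can reference a real id instead of a name string
user_id = random.choice(user_ids)
```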
🤖 Fix all issues with AI agents
In @.github/ISSUE_TEMPLATE/bug-report.yml:
- Around line 87-93: Update the four installation links in
.github/ISSUE_TEMPLATE/bug-report.yml so they point to valid, non-404
documentation pages; specifically replace the URLs used for "Installation From
Source", "Docker Installation", "Docker Compose Installation" and "Cluster
Installation" with the correct db-gpt.readthedocs.io paths (ensure Docker and
Docker Compose are different URLs), verify each URL opens successfully, and
update the corresponding label entries in the same file so the visible link text
and destination match the corrected docs.
In `@demo_selenium.py`:
- Around line 1-4: demo_selenium.py imports selenium (from selenium import
webdriver) but selenium is only declared in
packages/gptdb-sandbox/pyproject.toml; fix by either moving demo_selenium.py
into the packages/gptdb-sandbox/ folder so it lives with the declared
dependency, or add "selenium>=4.0.0" to the root pyproject.toml dependencies so
imports like webdriver and Options resolve when the root package is installed.
- Around line 6-9: Remove the deprecated options.headless assignment (Options
object defaults to non-headless) and, if you ever need headless in future, use
options.add_argument("-headless") to enable it; also move the driver
instantiation and any navigation/screenshot logic that currently runs at module
import into a main guard by wrapping the code that creates Options(), calls
webdriver.Firefox(...), and performs actions inside if __name__ == "__main__":
so imports no longer start the browser automatically.
In `@docker/compose_examples/conf/ha-model-cluster.toml`:
- Line 6: Replace the hard-coded encrypt_key default ("your_secret_key") with a
required value sourced from an environment variable (e.g. ENCRYPT_KEY) and
update the config loading logic to fail fast if that env var is missing: remove
the predictable fallback for the encrypt_key key, read the key from process/env
in your config loader (or equivalent config parsing function), and throw/exit
with a clear error when ENCRYPT_KEY is not provided so deployments cannot
silently use a weak default.
In `@docker/compose_examples/conf/ha-webserver.toml`:
- Line 6: The config currently hardcodes encrypt_key = "your_secret_key"; change
this to read the encryption key from an environment variable (e.g. ENCRYPT_KEY)
instead of a literal string, update any loader/initialization code that reads
encrypt_key to validate the env var and fail fast if it's missing, and ensure
the default placeholder is removed from the repo and documented as an env
requirement; reference the encrypt_key setting in the config and the code path
that loads it when making the change.
In `@docker/examples/my.cnf`:
- Around line 21-37: The config sets
default-authentication-plugin=mysql_native_password which forces the weaker
legacy auth; change this to be opt-in by commenting out or removing the
default-authentication-plugin=mysql_native_password line so the server uses the
stronger default (caching_sha2_password) unless explicitly enabled for legacy
clients; update the configuration that contains the
default-authentication-plugin setting (and any duplicate occurrences) to be
commented out and add a short comment explaining it's opt-in for legacy clients.
In `@docker/examples/sqls/case_1_student_manager_sqlite.sql`:
- Around line 15-22: The current PRIMARY KEY on the scores table (PRIMARY KEY
(student_id, course_id)) prevents storing multiple semesters for the same
student/course; update the scores table definition (table name: scores, columns:
student_id, course_id, semester) to include semester in the composite primary
key (e.g., PRIMARY KEY (student_id, course_id, semester)) or alternatively add a
surrogate id column and make that the PK while keeping student_id, course_id,
semester as a UNIQUE constraint to allow multiple semester rows without
conflicts.
In `@docker/examples/sqls/case_2_ecom_sqlite.sql`:
- Around line 25-35: The seed data in the INSERT INTO users statement contains
real-looking emails; update the user_email values to use reserved example
domains (e.g., user@example.com or user@example.org) to avoid PII. Edit the
INSERT INTO users (... user_email ...) VALUES clause and replace each gmail
address (e.g., 'john@gmail.com', 'mary@gmail.com', etc.) with non-real addresses
(e.g., 'user1@example.com', 'user2@example.com') while keeping the rest of the
columns (user_id, user_name, registration_date, user_country) unchanged.
In `@docker/examples/sqls/test_case_info_sqlite.sql`:
- Line 17: The SQL in the VALUES tuple uses MySQL's YEAR(orders.order_date)
which SQLite doesn't support; update the SELECT in that tuple to use SQLite's
strftime('%Y', orders.order_date) and compare to the string '2023' (e.g.
strftime('%Y', orders.order_date) = '2023') so the expected_sql in the tuple
(the one selecting products.product_name with JOIN orders ...) is
SQLite-compatible.
In `@docker/examples/sqls/user_config_mysql.sql`:
- Around line 1-2: The SQL snippet updates mysql.user to set Host='%' for
User='root', which enables remote root access and is insecure; replace this
pattern by keeping root bound to localhost (do not change Host='%' for
User='root'), remove direct UPDATE mysql.user calls, and instead create a
dedicated application user via CREATE USER and GRANT with only the necessary
privileges for remote access, then FLUSH PRIVILEGES; locate the statements
UPDATE user SET Host='%' WHERE User='root' and replace them with the CREATE USER
/ GRANT flow and avoid modifying the root account.
In `@Dockerfile`:
- Around line 24-26: The Dockerfile currently hardcodes the VNC password using
the RUN line that calls x11vnc -storepasswd 123456 /root/.vnc/passwd; remove
that hardcoded invocation and instead create a runtime setup (entrypoint) that
reads a VNC password from an environment variable (e.g., VNC_PASSWORD) or a
secret file (e.g., /run/secrets/vnc_password), ensures /root/.vnc exists, stores
the password with x11vnc -storepasswd using the provided value, and tightens
file permissions on /root/.vnc/passwd; update the Dockerfile to copy the
entrypoint script and set it as ENTRYPOINT/CMD and keep a safe fallback behavior
(error or random password) when no password is provided.
🟡 Minor comments (13)
docker/examples/sqls/case_1_student_manager_sqlite.sql (1)

1-7: **Enable FK enforcement in SQLite.** SQLite does not enforce foreign keys unless `PRAGMA foreign_keys = ON` is set. Add it at the top of the script to avoid silent integrity violations.

🔧 Proposed fix

```diff
+PRAGMA foreign_keys = ON;
+
 CREATE TABLE students (
     student_id INTEGER PRIMARY KEY,
     student_name VARCHAR(100),
     major VARCHAR(100),
     year_of_enrollment INTEGER,
     student_age INTEGER
 );
```
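Note that the pragma is per-connection, so every client that writes to the file must set it. A small self-contained illustration with Python's `sqlite3` module (schema abbreviated):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY);
CREATE TABLE scores (
    student_id INTEGER,
    FOREIGN KEY (student_id) REFERENCES students(student_id)
);
""")

# Without the pragma, SQLite silently accepts an orphan row
conn.execute("INSERT INTO scores (student_id) VALUES (999)")
conn.commit()

conn.execute("PRAGMA foreign_keys = ON")  # must be re-issued for every new connection
try:
    conn.execute("INSERT INTO scores (student_id) VALUES (1000)")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)  # rejected: FOREIGN KEY constraint failed
```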
docker/examples/sqls/test_case.md (1)

79-87: **Database compatibility issue: the `YEAR()` function is MySQL-specific.** `YEAR(orders.order_date)` works in MySQL but not in SQLite or Vertica. Since this PR includes test data for multiple database engines (MySQL, SQLite, Vertica), this query won't work consistently across all targets.

- SQLite: use `strftime('%Y', orders.order_date) = '2023'`
- Vertica: use `EXTRACT(YEAR FROM orders.order_date) = 2023`

Consider either documenting this as MySQL-only or providing engine-specific variants.
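If engine-specific variants are provided, the predicate could be generated per dialect; the helper below is purely illustrative and not part of the PR:

```python
# Hypothetical helper: build a year-equality predicate per SQL dialect.
YEAR_PREDICATES = {
    "mysql": "YEAR({col}) = {year}",
    "sqlite": "strftime('%Y', {col}) = '{year}'",
    "vertica": "EXTRACT(YEAR FROM {col}) = {year}",
}

def year_predicate(dialect: str, col: str, year: int) -> str:
    return YEAR_PREDICATES[dialect].format(col=col, year=year)

print(year_predicate("sqlite", "orders.order_date", 2023))
# strftime('%Y', orders.order_date) = '2023'
```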
docker/examples/sqls/case_3_order_wide_table_sqlite_wide.sql (1)

1-5: **Missing PRIMARY KEY constraint.** The `order_wide_table` lacks a PRIMARY KEY. For an order table, `order_id` should typically be the primary key to ensure uniqueness and enable efficient lookups.

Suggested fix

```diff
 CREATE TABLE order_wide_table (
     -- order_base
-    order_id TEXT,              -- 订单ID
+    order_id TEXT PRIMARY KEY,  -- 订单ID
     order_no TEXT,              -- 订单编号
```
docker/examples/sqls/case_2_ecom_sqlite.sql (1)

1-7: **Enable SQLite foreign key enforcement.** SQLite ignores FK constraints unless `PRAGMA foreign_keys = ON;` is set, which can silently allow invalid `orders` rows when this dataset is extended.

✅ Proposed fix

```diff
+PRAGMA foreign_keys = ON;
+
 CREATE TABLE users (
     user_id INTEGER PRIMARY KEY,
     user_name VARCHAR(100),
     user_email VARCHAR(100),
```

Also applies to: lines 15-23
docker/compose_examples/conf/ha-model-cluster.toml (1)

4-4: **Clarify the log level configuration.** `system.log_level = "INFO"` conflicts with `[log].level = "DEBUG"`. Align them or document precedence to avoid unexpected logging behavior.

🛠️ Example alignment

```diff
-[log]
-level = "DEBUG"
+[log]
+level = "INFO"
```

Also applies to: line 64
docker/compose_examples/conf/ha-model-cluster.toml (1)

13-21: **Use consistent environment variable defaults in the TOML file.** This file inconsistently handles environment variables: `GPTDB_LANG` (line 3) and `WORKER_TYPE` (line 28) have defaults, but `CONTROLLER_ADDR` and the `MYSQL_*` database fields lack defaults. For consistency with other configuration files (e.g., `configs/gptdb-proxy-siliconflow-mysql.toml`), add safe defaults like:

```toml
controller_addr = "${env:CONTROLLER_ADDR:-http://localhost:8000}"
host = "${env:MYSQL_HOST:-localhost}"
port = "${env:MYSQL_PORT:-3306}"
database = "${env:MYSQL_DATABASE:-gptdb}"
user = "${env:MYSQL_USER:-root}"
password = "${env:MYSQL_PASSWORD:-aa123456}"
```

While the accompanying `ha-cluster-docker-compose.yml` does provide these values, the TOML files should be self-contained for portability and clarity.

Also applies to: lines 55-61
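For readers unfamiliar with the `${env:NAME:-default}` placeholder convention, a rough sketch of how such expansion typically behaves (an illustration of the syntax only, not the project's actual loader):

```python
import os
import re

PLACEHOLDER = re.compile(r"\$\{env:([A-Z0-9_]+)(?::-([^}]*))?\}")

def expand(value: str) -> str:
    """Expand ${env:NAME} and ${env:NAME:-default} placeholders."""
    def sub(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        return os.environ.get(name, default if default is not None else "")
    return PLACEHOLDER.sub(sub, value)

print(expand("${env:MYSQL_HOST:-localhost}"))  # "localhost" when MYSQL_HOST is unset
```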
.pre-commit-config.yaml (1)

1-1: **Fix line endings to satisfy yamllint.** yamllint reports CRLF line endings here; converting this file to LF will clear the lint error and avoid CI/pre-commit failures.
docs/docs/changelog/Released_V0.5.0.md (1)

261-265: **Fix markdownlint MD034: bare URLs in Appendix.** Convert bare URLs to Markdown links to satisfy lint and improve readability.

✅ Suggested update

```diff
-- GPT-DB framework: https://github.com/khulnasoft
-- Text2SQL fine tuning: https://github.com/khulnasoft-bot/GPT-DB-Hub
-- GPT-DB-Web : https://github.com/khulnasoft-bot/GPT-DB-Web
-- official English documentation: http://docs.gptdb.site/docs/overview
-- official Chinese documentation: https://www.yuque.com/khulnasoft/gptdb-docs/bex30nsv60ru0fmx
+- GPT-DB framework: [khulnasoft](https://github.com/khulnasoft)
+- Text2SQL fine tuning: [GPT-DB-Hub](https://github.com/khulnasoft-bot/GPT-DB-Hub)
+- GPT-DB-Web : [GPT-DB-Web](https://github.com/khulnasoft-bot/GPT-DB-Web)
+- official English documentation: [Overview](http://docs.gptdb.site/docs/overview)
+- official Chinese documentation: [概览](https://www.yuque.com/khulnasoft/gptdb-docs/bex30nsv60ru0fmx)
```
docs/blog/2025-03-24-gptdb-v0.7.0-release.md (1)

432-432: **Address markdownlint: bare email in contributors list.** The email triggers MD034; format it as a mailto link or remove it.

✅ Suggested update

```diff
-**283569391@qq.com, `@15089677014`, `@Aries-ckt`, `@FOkvj`, `@Jant1L`, `@SonglinLyu`, `@TenYearOldJAVA`, `@Weaxs`, `@cinjoseph`, `@csunny`, `@damonqin`, `@dusx1981`, `@fangyinc`, `@geebytes`, `@haawha`, `@utopia2077`, `@vnicers`, `@xuxl2024`, `@yhjun1026`, `@yunfeng1993`, `@yyhhyyyyyy` and tam**
+**[283569391@qq.com](mailto:283569391@qq.com), `@15089677014`, `@Aries-ckt`, `@FOkvj`, `@Jant1L`, `@SonglinLyu`, `@TenYearOldJAVA`, `@Weaxs`, `@cinjoseph`, `@csunny`, `@damonqin`, `@dusx1981`, `@fangyinc`, `@geebytes`, `@haawha`, `@utopia2077`, `@vnicers`, `@xuxl2024`, `@yhjun1026`, `@yunfeng1993`, `@yyhhyyyyyy` and tam**
```
docs/docs/gptdbs/introduction.md (1)

3-3: **Update repository links: khulnasoft-bot/gptdbs is not publicly accessible (404).** The repository at https://github.com/khulnasoft-bot/gptdbs returns 404 Not Found. This affects the links on lines 3, 26-27, and 38. If this is an internal or renamed repository, verify the correct public path. Public references to `awel-flow-simple-streaming-chat` point to `eosphoros-ai/dbgpts` or `eosphoros-ai/DB-GPT` instead.
README.md (1)

225-227: **Fix the disclaimer link filename.** `DISCKAIMER.md` looks misspelled and will likely 404.

📝 Suggested fix

```diff
-📖 **[Disclaimer](./DISCKAIMER.md)** - Please read our disclaimer for important information about the project.
+📖 **[Disclaimer](./DISCLAIMER.md)** - Please read our disclaimer for important information about the project.
```
README.md (1)

150-160: **Replace the replacement-character headings.** The `�` glyphs will render incorrectly; swap to real emoji or plain text.

📝 Suggested fix

```diff
-### � Privacy and Security
+### Privacy and Security
@@
-### � Supported Data Sources
+### Supported Data Sources
```
74-76: Update "Latest Release" section to reflect the current version.The README claims "Latest Release: V0.7.0", but as of January 2026, the latest published DB-GPT release is v0.7.4 (October 24, 2025). Update the version number and release notes link accordingly.
🧹 Nitpick comments (16)
docker/examples/sqls/case_2_ecom_mysql.sql (2)
1-2: **Prefer `utf8mb4` and make DB creation idempotent.** Using MySQL `utf8` limits characters, and re-running the script can fail if the DB already exists. Consider `utf8mb4` and `IF NOT EXISTS` for a more robust example.

♻️ Proposed change

```diff
-create database case_2_ecom character set utf8;
+create database if not exists case_2_ecom
+  character set utf8mb4
+  collate utf8mb4_0900_ai_ci;
 use case_2_ecom;
```
12-16: **Use `DECIMAL` for prices to avoid rounding artifacts.** `FLOAT` can introduce precision errors in money values; `DECIMAL(10,2)` is the typical choice for currency.

♻️ Proposed change

```diff
 CREATE TABLE products (
     product_id INT PRIMARY KEY,
     product_name VARCHAR(100) COMMENT '商品名称',
-    product_price FLOAT COMMENT '商品价格'
+    product_price DECIMAL(10,2) COMMENT '商品价格'
 ) COMMENT '商品信息表';
```
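The rounding problem is easy to reproduce outside MySQL; a quick Python comparison of binary floats and exact decimals:

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly
print(0.10 + 0.20)        # 0.30000000000000004
print(sum([19.99] * 3))   # 59.970000000000006

# Decimal keeps exact cents, which is what DECIMAL(10,2) models in SQL
print(Decimal("0.10") + Decimal("0.20"))  # 0.30
print(Decimal("19.99") * 3)               # 59.97
```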
docker/examples/sqls/test_case_info_sqlite.sql (1)

14-14: **Non-standard `GROUP BY` usage may cause issues in strict SQL environments.** The `expected_sql` groups by `students.student_id` but selects `students.student_name`, which is not in the GROUP BY clause. While SQLite is permissive and allows this, it's non-standard SQL and could produce unexpected results if a student_id mapped to multiple names (though unlikely given it's a primary key). This is acceptable for test data but worth noting for consistency with stricter SQL modes.
docker/compose_examples/conf/ha-webserver.toml (3)
13-13: **Consider adding a default value or documenting that `CONTROLLER_ADDR` is required.** Unlike other environment variables in this file (e.g., `GPTDB_LANG`, `EMBEDDING_MODEL_NAME`), `controller_addr` has no fallback default. If `CONTROLLER_ADDR` is unset, this may cause unclear startup failures.

💡 Option: add a placeholder default or comment

```diff
-controller_addr = "${env:CONTROLLER_ADDR}"
+# Required: Set CONTROLLER_ADDR environment variable
+controller_addr = "${env:CONTROLLER_ADDR}"
```

Or, if a sensible default exists:

```diff
-controller_addr = "${env:CONTROLLER_ADDR}"
+controller_addr = "${env:CONTROLLER_ADDR:-http://controller:8000}"
```
21-21: **Minor formatting inconsistency.** Missing space after `=` on this line, while other lines use the `key = value` format consistently.

✏️ Suggested fix

```diff
-password ="${env:MYSQL_PASSWORD}"
+password = "${env:MYSQL_PASSWORD}"
```
27-28: **Redundant log level configuration.** `log_level` is already set to `"INFO"` in the `[system]` section (line 4). Having it in both places could cause confusion about which takes precedence or lead to inconsistencies if only one is updated.

docker/examples/sqls/test_case.md (2)
33-45: **Minor inconsistency: missing "SQL:" label.** Other queries have an explicit "SQL:" label before the code block (see lines 13, 25, 55), but Q3 is missing this label.

Suggested fix

````diff
 查询哪些学生在 "2023年春季" 学期的课程学分总和超过2学分
+SQL:
 ```sql
 SELECT students.student_name
````
57-62: **Potential strict SQL mode issue: `GROUP BY` column mismatch.** The query selects `users.user_name` but groups by `users.user_id`. While this works in MySQL with default settings, it may fail in strict SQL mode (`ONLY_FULL_GROUP_BY`) or in other databases where all non-aggregated SELECT columns must appear in GROUP BY. The same pattern appears in Q2 (lines 68-73) with `products.product_name` vs `products.product_id`.

Suggested fix for stricter SQL compliance

```diff
-SELECT users.user_name, COUNT(orders.order_id) as order_count
-FROM users
-JOIN orders ON users.user_id = orders.user_id
-GROUP BY users.user_id;
+SELECT users.user_name, COUNT(orders.order_id) as order_count
+FROM users
+JOIN orders ON users.user_id = orders.user_id
+GROUP BY users.user_id, users.user_name;
```

demo_selenium.py (2)
15-15: **Prefer explicit waits over `time.sleep()`.** `time.sleep(5)` is fragile; it either waits too long or not long enough. Use `WebDriverWait` with expected conditions for reliable synchronization.

♻️ Suggested improvement

```diff
+from selenium.webdriver.support.ui import WebDriverWait
+from selenium.webdriver.support import expected_conditions as EC
+from selenium.webdriver.common.by import By
 ...
-    time.sleep(5)  # Wait for the page to load
+    # Wait until the page title contains "Python"
+    WebDriverWait(driver, 10).until(EC.title_contains("Python"))
```
20-22: **Hardcoded path assumes the root user on Linux.** `/root/screenshot.png` is not portable and requires root privileges. Consider using a relative path or a configurable location.

♻️ Suggested improvement

```diff
-    screenshot_path = "/root/screenshot.png"
+    screenshot_path = "screenshot.png"  # Or use tempfile/pathlib for cross-platform support
```

docker/examples/sqls/case_1_student_manager_mysql.sql (1)
1-1: **Use `utf8mb4` and `IF NOT EXISTS` for broader Unicode support and repeatable setup.** MySQL's `utf8` is a deprecated alias for `utf8mb3`, which supports only BMP characters (U+0000..U+FFFF) and cannot store emoji or supplementary Unicode characters. Use `utf8mb4` (full UTF-8) instead. Additionally, `IF NOT EXISTS` makes the script reusable without manual deletion on rerun.

♻️ Recommended change

```diff
-create database case_1_student_manager character set utf8;
+create database if not exists case_1_student_manager character set utf8mb4;
```

docs/docs/modules/benchmark.md (1)
81-82: **Prefer HTTPS for docs links if supported.** Switching to HTTPS avoids mixed-content warnings and improves security for readers. Please verify the endpoint supports HTTPS and update accordingly.

🔧 Suggested update (if HTTPS is supported)

```diff
-  - Method 1: Configure via configuration file. Reference: [ProxyModel Configuration](http://gpt.khulnasoft.com/docs/next/installation/advanced_usage/More_proxyllms)
-  - Method 2: Configure via product page. Reference: [Models](http://gpt.khulnasoft.com/docs/next/application/llms)
+  - Method 1: Configure via configuration file. Reference: [ProxyModel Configuration](https://gpt.khulnasoft.com/docs/next/installation/advanced_usage/More_proxyllms)
+  - Method 2: Configure via product page. Reference: [Models](https://gpt.khulnasoft.com/docs/next/application/llms)
```

docs/docs/faq/kbqa.md (1)
56-56: **Optional: tighten repetitive phrasing.** Consider rewording to avoid repeated "If you want to …" in adjacent sentences for readability.

docs/docs/installation/model_service/cluster.md (1)
21-21: **Optional: remove the leading `$` in code blocks for copy-paste.** This avoids MD014 lint noise and makes commands directly runnable.

Dockerfile (1)
21-23: **Avoid masking install failures and clean apt lists.** `|| true` hides real errors and leaves the apt cache in the layer. Consider gating the install behind a build arg and cleaning up.

♻️ Suggested refactor

```diff
-# Install NVIDIA GPU tools (e.g., nvidia-smi)
-RUN apt-get update && apt-get install -y nvidia-utils-525 || true
+ARG INSTALL_NVIDIA_UTILS=false
+RUN if [ "$INSTALL_NVIDIA_UTILS" = "true" ]; then \
+        apt-get update && \
+        apt-get install -y --no-install-recommends nvidia-utils-525 && \
+        rm -rf /var/lib/apt/lists/*; \
+    fi
```

Makefile (1)
1-33: **Declare phony targets to avoid filename collisions.** This prevents a file named like a target (e.g., `setup`) from blocking the rule.

🧹 Suggested fix

```diff
 .DEFAULT_GOAL := help
+.PHONY: setup testenv fmt fmt-check test test-% all-test mypy \
+	build build-% all-build docker-% all-docker publish publish-test publish-% \
+	clean clean-dist help
```
.github/ISSUE_TEMPLATE/bug-report.yml

```diff
           [Installation From Source](https://db-gpt.readthedocs.io/en/latest/getting_started/install/deploy/deploy.html)
       - label: >
-          [Docker Installation](https://gpt-db.readthedocs.io/en/latest/getting_started/install/docker/docker.html)
+          [Docker Installation](https://db-gpt.readthedocs.io/en/latest/getting_started/install/docker/docker.html)
       - label: >
-          [Docker Compose Installation](https://gpt-db.readthedocs.io/en/latest/getting_started/install/docker/docker.html)
+          [Docker Compose Installation](https://db-gpt.readthedocs.io/en/latest/getting_started/install/docker/docker.html)
       - label: >
-          [Cluster Installation](https://gpt-db.readthedocs.io/en/latest/getting_started/install/llm/cluster/model_cluster.html)
+          [Cluster Installation](https://db-gpt.readthedocs.io/en/latest/getting_started/install/llm/cluster/model_cluster.html)
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# First, let's examine the file to confirm the lines and content
echo "=== Checking file content at lines 87-93 ==="
sed -n '85,95p' .github/ISSUE_TEMPLATE/bug-report.yml
echo ""
# Check the URLs for accessibility
echo "=== Checking URL accessibility ==="
echo "1. Installation From Source URL..."
curl -s -o /dev/null -w "HTTP Status: %{http_code}\n" "https://db-gpt.readthedocs.io/en/latest/getting_started/install/deploy/deploy.html"
echo "2. Docker Installation URL..."
curl -s -o /dev/null -w "HTTP Status: %{http_code}\n" "https://db-gpt.readthedocs.io/en/latest/getting_started/install/docker/docker.html"
echo "3. Cluster Installation URL..."
curl -s -o /dev/null -w "HTTP Status: %{http_code}\n" "https://db-gpt.readthedocs.io/en/latest/getting_started/install/llm/cluster/model_cluster.html"
```

Repository: khulnasoft/GPT-DB
Length of output: 930
All installation documentation links return 404 errors and must be corrected before merging.
The four URLs in the installation options are inaccessible:
- Installation From Source: 404
- Docker Installation: 404
- Docker Compose Installation: 404
- Cluster Installation: 404
Additionally, Docker Installation and Docker Compose Installation both point to the same URL (db-gpt.readthedocs.io/en/latest/getting_started/install/docker/docker.html), which needs to be corrected. Verify the correct documentation URLs for the db-gpt.readthedocs.io domain before this change can be merged.
🤖 Prompt for AI Agents
In @.github/ISSUE_TEMPLATE/bug-report.yml around lines 87 - 93, Update the four
installation links in .github/ISSUE_TEMPLATE/bug-report.yml so they point to
valid, non-404 documentation pages; specifically replace the URLs used for
"Installation From Source", "Docker Installation", "Docker Compose Installation"
and "Cluster Installation" with the correct db-gpt.readthedocs.io paths (ensure
Docker and Docker Compose are different URLs), verify each URL opens
successfully, and update the corresponding label entries in the same file so the
visible link text and destination match the corrected docs.
demo_selenium.py

```python
import time

from selenium import webdriver
from selenium.webdriver.firefox.options import Options
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Check if pyproject.toml exists and search for selenium
echo "=== Checking for pyproject.toml ==="
if [ -f "pyproject.toml" ]; then
  echo "pyproject.toml found. Searching for selenium..."
  grep -i "selenium" pyproject.toml || echo "selenium not found in pyproject.toml"
else
  echo "pyproject.toml not found in repo root"
fi
echo -e "\n=== Checking for requirements files ==="
find . -maxdepth 2 -type f \( -name "requirements*.txt" -o -name "setup.py" -o -name "setup.cfg" \) 2>/dev/null | while read file; do
  echo "Found: $file"
  grep -i "selenium" "$file" || echo "  - selenium not found in $file"
done
echo -e "\n=== Checking if demo_selenium.py exists ==="
if [ -f "demo_selenium.py" ]; then
  echo "demo_selenium.py found. First 10 lines:"
  head -10 demo_selenium.py
else
  echo "demo_selenium.py not found in repo root"
fi
```

Repository: khulnasoft/GPT-DB
Length of output: 572
🏁 Script executed:
```bash
#!/bin/bash
# Read the pyproject.toml to see all dependency declarations
echo "=== Full pyproject.toml content ==="
cat pyproject.toml
```

Repository: khulnasoft/GPT-DB
Length of output: 2235
🏁 Script executed:
```bash
#!/bin/bash
# Find demo_selenium.py location
echo "=== Location of demo_selenium.py ==="
find . -name "demo_selenium.py" -type f
echo -e "\n=== Checking workspace members for pyproject.toml files with selenium ==="
for dir in packages/gptdb-app packages/gptdb-client packages/gptdb-core packages/gptdb-ext packages/gptdb-serve packages/gptdb-sandbox; do
  if [ -d "$dir" ]; then
    if [ -f "$dir/pyproject.toml" ]; then
      echo "Checking $dir/pyproject.toml for selenium..."
      grep -i "selenium" "$dir/pyproject.toml" && echo "  Found in $dir" || echo "  Not found in $dir"
    fi
  fi
done
```

Repository: khulnasoft/GPT-DB
Length of output: 788
🏁 Script executed:
```bash
#!/bin/bash
# Check gptdb-sandbox dependencies
echo "=== gptdb-sandbox/pyproject.toml dependencies section ==="
grep -A 20 "dependencies" packages/gptdb-sandbox/pyproject.toml | head -30
echo -e "\n=== Check if demo_selenium.py imports gptdb packages ==="
grep -E "^from gptdb|^import gptdb" ./demo_selenium.py || echo "No gptdb imports found"
echo -e "\n=== Full demo_selenium.py content ==="
cat ./demo_selenium.py
```

Repository: khulnasoft/GPT-DB
Length of output: 1594
**demo_selenium.py at root level depends on selenium, which is only declared in the gptdb-sandbox workspace member.**

The `selenium>=4.0.0` dependency is declared in `packages/gptdb-sandbox/pyproject.toml`, not in the root `pyproject.toml`. Since `demo_selenium.py` is located at the repository root and imports selenium directly, the dependency will not be available if the root package is installed standalone. Either move this script into `packages/gptdb-sandbox/` where the dependency is declared, or add selenium to the root project dependencies.
🤖 Prompt for AI Agents
In `@demo_selenium.py` around lines 1 - 4, demo_selenium.py imports selenium (from
selenium import webdriver) but selenium is only declared in
packages/gptdb-sandbox/pyproject.toml; fix by either moving demo_selenium.py
into the packages/gptdb-sandbox/ folder so it lives with the declared
dependency, or add "selenium>=4.0.0" to the root pyproject.toml dependencies so
imports like webdriver and Options resolve when the root package is installed.
demo_selenium.py

```python
options = Options()
options.headless = False  # must be False so the browser activity is visible on the VNC desktop

driver = webdriver.Firefox(options=options)
```
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
```bash
fd demo_selenium.py
```

Repository: khulnasoft/GPT-DB
Length of output: 77
🏁 Script executed:
```bash
cat -n demo_selenium.py
```

Repository: khulnasoft/GPT-DB
Length of output: 909
🌐 Web query:
Selenium 4 Firefox headless mode options.headless deprecated
💡 Result:
In Selenium 4.8.0, the convenience API (options.headless = True, options.setHeadless(true), etc.) was deprecated (and then removed in later Selenium versions) in favor of passing the headless mode via browser arguments. (selenium.dev)
Firefox (GeckoDriver): what to use now
Python
```python
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

opts = Options()
opts.add_argument("-headless")  # or "--headless"
driver = webdriver.Firefox(options=opts)
```

(Using `options.headless = True` is what triggers the deprecation warning.) (selenium.dev)
Java
```java
FirefoxOptions options = new FirefoxOptions();
options.addArguments("-headless"); // or "--headless"
WebDriver driver = new FirefoxDriver(options);
```

**Important note: `--headless=new` is Chromium-only.**
--headless=new is for Chromium’s “new headless” mode; Firefox doesn’t use --headless=new. (selenium.dev)
Citations:
- 1: https://www.selenium.dev/blog/2023/headless-is-going-away/?utm_source=openai
- 2: https://www.selenium.dev/blog/2023/headless-is-going-away/?utm_source=openai
- 3: https://stackoverflow.com/questions/76354308/what-is-the-updated-argument-for-firefox-browser-to-run-test-selenium-java-in?utm_source=openai
- 4: https://www.selenium.dev/blog/2023/headless-is-going-away/?utm_source=openai
**Deprecated headless attribute and module-level execution.**

- `options.headless` is deprecated in Selenium 4.8.0+. Since the code intends non-headless mode (visible browser), simply remove this line; non-headless is the default. If you explicitly need headless mode in future versions, use `options.add_argument("-headless")`.
- Module-level code instantiation means the driver, navigation, and screenshot run on import. Wrap the entire execution logic in `if __name__ == "__main__":`.
🤖 Prompt for AI Agents
In `@demo_selenium.py` around lines 6 - 9, Remove the deprecated options.headless
assignment (Options object defaults to non-headless) and, if you ever need
headless in future, use options.add_argument("-headless") to enable it; also
move the driver instantiation and any navigation/screenshot logic that currently
runs at module import into a main guard by wrapping the code that creates
Options(), calls webdriver.Firefox(...), and performs actions inside if __name__
== "__main__": so imports no longer start the browser automatically.
| language = "${env:GPTDB_LANG:-en}" | ||
| log_level = "INFO" | ||
| api_keys = [] | ||
| encrypt_key = "your_secret_key" |
There was a problem hiding this comment.
**Avoid hard-coded encryption key default.**
"your_secret_key" is a predictable fallback and could lead to weak encryption in real deployments. Make this required via env (or fail fast).
🔒 Proposed fix
-encrypt_key = "your_secret_key"
+encrypt_key = "${env:ENCRYPT_KEY}"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| encrypt_key = "your_secret_key" | |
| encrypt_key = "${env:ENCRYPT_KEY}" |
🤖 Prompt for AI Agents
In `@docker/compose_examples/conf/ha-model-cluster.toml` at line 6, Replace the
hard-coded encrypt_key default ("your_secret_key") with a required value sourced
from an environment variable (e.g. ENCRYPT_KEY) and update the config loading
logic to fail fast if that env var is missing: remove the predictable fallback
for the encrypt_key key, read the key from process/env in your config loader (or
equivalent config parsing function), and throw/exit with a clear error when
ENCRYPT_KEY is not provided so deployments cannot silently use a weak default.
| language = "${env:GPTDB_LANG:-en}" | ||
| log_level = "INFO" | ||
| api_keys = [] | ||
| encrypt_key = "your_secret_key" |
There was a problem hiding this comment.
**Hardcoded encryption key is a security risk.**
The placeholder "your_secret_key" should be replaced with an environment variable reference to avoid accidental deployment with a weak, publicly-known key. Secrets should never be committed to version control.
🔒 Suggested fix
-encrypt_key = "your_secret_key"
+encrypt_key = "${env:GPTDB_ENCRYPT_KEY}"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| encrypt_key = "your_secret_key" | |
| encrypt_key = "${env:GPTDB_ENCRYPT_KEY}" |
🤖 Prompt for AI Agents
In `@docker/compose_examples/conf/ha-webserver.toml` at line 6, The config
currently hardcodes encrypt_key = "your_secret_key"; change this to read the
encryption key from an environment variable (e.g. ENCRYPT_KEY) instead of a
literal string, update any loader/initialization code that reads encrypt_key to
validate the env var and fail fast if it's missing, and ensure the default
placeholder is removed from the repo and documented as an env requirement;
reference the encrypt_key setting in the config and the code path that loads it
when making the change.
docker/examples/sqls/case_1_student_manager_sqlite.sql

```sql
CREATE TABLE scores (
    student_id INTEGER,
    course_id INTEGER,
    score INTEGER,
    semester VARCHAR(50),
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id)
```
**Primary key blocks multi-semester scores for the same student/course.**
With semester present, (student_id, course_id) as the PK prevents storing multiple semesters for a repeated course. Consider including semester in the PK (or adding a surrogate key) to avoid insert conflicts later.
🐛 Proposed fix (include semester in the PK)
```diff
 CREATE TABLE scores (
     student_id INTEGER,
     course_id INTEGER,
     score INTEGER,
     semester VARCHAR(50),
-    PRIMARY KEY (student_id, course_id),
+    PRIMARY KEY (student_id, course_id, semester),
     FOREIGN KEY (student_id) REFERENCES students(student_id),
     FOREIGN KEY (course_id) REFERENCES courses(course_id)
 );
```

🤖 Prompt for AI Agents
In `@docker/examples/sqls/case_1_student_manager_sqlite.sql` around lines 15 -
22, The current PRIMARY KEY on the scores table (PRIMARY KEY (student_id,
course_id)) prevents storing multiple semesters for the same student/course;
update the scores table definition (table name: scores, columns: student_id,
course_id, semester) to include semester in the composite primary key (e.g.,
PRIMARY KEY (student_id, course_id, semester)) or alternatively add a surrogate
id column and make that the PK while keeping student_id, course_id, semester as
a UNIQUE constraint to allow multiple semester rows without conflicts.
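The insert conflict this comment describes is easy to reproduce with Python's `sqlite3` (schema trimmed to the key columns):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE scores (
    student_id INTEGER,
    course_id INTEGER,
    semester VARCHAR(50),
    PRIMARY KEY (student_id, course_id)  -- semester not included
)
""")
conn.execute("INSERT INTO scores VALUES (1, 101, '2023 Spring')")
try:
    # Same student retakes the same course in a later semester
    conn.execute("INSERT INTO scores VALUES (1, 101, '2023 Fall')")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)  # UNIQUE constraint failed: scores.student_id, scores.course_id
```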
docker/examples/sqls/case_2_ecom_sqlite.sql

```sql
INSERT INTO users (user_id, user_name, user_email, registration_date, user_country) VALUES
(1, 'John', 'john@gmail.com', '2020-01-01', 'USA'),
(2, 'Mary', 'mary@gmail.com', '2021-01-01', 'UK'),
(3, 'Bob', 'bob@gmail.com', '2020-01-01', 'USA'),
(4, 'Alice', 'alice@gmail.com', '2021-01-01', 'UK'),
(5, 'Charlie', 'charlie@gmail.com', '2020-01-01', 'USA'),
(6, 'David', 'david@gmail.com', '2021-01-01', 'UK'),
(7, 'Eve', 'eve@gmail.com', '2020-01-01', 'USA'),
(8, 'Frank', 'frank@gmail.com', '2021-01-01', 'UK'),
(9, 'Grace', 'grace@gmail.com', '2020-01-01', 'USA'),
(10, 'Helen', 'helen@gmail.com', '2021-01-01', 'UK');
```
**Avoid real-looking PII in seed emails.**
These Gmail-style addresses can map to real people; prefer reserved example domains to eliminate privacy/compliance risk.
✅ Proposed fix
```diff
-INSERT INTO users (user_id, user_name, user_email, registration_date, user_country) VALUES
-(1, 'John', 'john@gmail.com', '2020-01-01', 'USA'),
-(2, 'Mary', 'mary@gmail.com', '2021-01-01', 'UK'),
-(3, 'Bob', 'bob@gmail.com', '2020-01-01', 'USA'),
-(4, 'Alice', 'alice@gmail.com', '2021-01-01', 'UK'),
-(5, 'Charlie', 'charlie@gmail.com', '2020-01-01', 'USA'),
-(6, 'David', 'david@gmail.com', '2021-01-01', 'UK'),
-(7, 'Eve', 'eve@gmail.com', '2020-01-01', 'USA'),
-(8, 'Frank', 'frank@gmail.com', '2021-01-01', 'UK'),
-(9, 'Grace', 'grace@gmail.com', '2020-01-01', 'USA'),
-(10, 'Helen', 'helen@gmail.com', '2021-01-01', 'UK');
+INSERT INTO users (user_id, user_name, user_email, registration_date, user_country) VALUES
+(1, 'John', 'john@example.com', '2020-01-01', 'USA'),
+(2, 'Mary', 'mary@example.com', '2021-01-01', 'UK'),
+(3, 'Bob', 'bob@example.com', '2020-01-01', 'USA'),
+(4, 'Alice', 'alice@example.com', '2021-01-01', 'UK'),
+(5, 'Charlie', 'charlie@example.com', '2020-01-01', 'USA'),
+(6, 'David', 'david@example.com', '2021-01-01', 'UK'),
+(7, 'Eve', 'eve@example.com', '2020-01-01', 'USA'),
+(8, 'Frank', 'frank@example.com', '2021-01-01', 'UK'),
+(9, 'Grace', 'grace@example.com', '2020-01-01', 'USA'),
+(10, 'Helen', 'helen@example.com', '2021-01-01', 'UK');
```

🤖 Prompt for AI Agents
In `@docker/examples/sqls/case_2_ecom_sqlite.sql` around lines 25 - 35, The seed
data in the INSERT INTO users statement contains real-looking emails; update the
user_email values to use reserved example domains (e.g., user@example.com or
user@example.org) to avoid PII. Edit the INSERT INTO users (... user_email ...)
VALUES clause and replace each gmail address (e.g., 'john@gmail.com',
'mary@gmail.com', etc.) with non-real addresses (e.g., 'user1@example.com',
'user2@example.com') while keeping the rest of the columns (user_id, user_name,
registration_date, user_country) unchanged.
docker/examples/sqls/test_case_info_sqlite.sql

```sql
('学校管理系统', '测试SQL助手的联合查询,条件查询和排序功能', '查询哪些学生在2023年秋季学期的课程学分总和超过15', 'SELECT students.student_name FROM students JOIN scores ON students.student_id = scores.student_id JOIN courses ON scores.course_id = courses.course_id WHERE scores.semester = ''2023年秋季'' GROUP BY students.student_id HAVING SUM(courses.credit) > 15;', '返回在2023年秋季学期的课程学分总和超过15的学生的姓名'),
('电商系统', '测试SQL助手的数据聚合和分组功能', '查询每个用户的总订单数量', 'SELECT users.user_name, COUNT(orders.order_id) as order_count FROM users JOIN orders ON users.user_id = orders.user_id GROUP BY users.user_id;', '返回每个用户的总订单数量'),
('电商系统', '测试SQL助手的数据聚合和分组功能', '查询每种商品的总销售额', 'SELECT products.product_name, SUM(products.product_price * orders.quantity) as total_sales FROM products JOIN orders ON products.product_id = orders.product_id GROUP BY products.product_id;', '返回每种商品的总销售额'),
('电商系统', '测试SQL助手的数据聚合和分组功能', '查询2023年最受欢迎的商品(订单数量最多的商品)', 'SELECT products.product_name FROM products JOIN orders ON products.product_id = orders.product_id WHERE YEAR(orders.order_date) = 2023 GROUP BY products.product_id ORDER BY COUNT(orders.order_id) DESC LIMIT 1;', '返回2023年最受欢迎的商品(订单数量最多的商品)的名称');
```
**SQLite incompatibility: the `YEAR()` function is not supported.**
The expected_sql on this line uses YEAR(orders.order_date), which is a MySQL function and not valid in SQLite. SQLite requires strftime('%Y', orders.order_date) for extracting the year from a date.
Since this file is specifically for SQLite (test_case_info_sqlite.sql), the expected SQL should use SQLite-compatible syntax.
Proposed fix
```diff
-('电商系统', '测试SQL助手的数据聚合和分组功能', '查询2023年最受欢迎的商品(订单数量最多的商品)', 'SELECT products.product_name FROM products JOIN orders ON products.product_id = orders.product_id WHERE YEAR(orders.order_date) = 2023 GROUP BY products.product_id ORDER BY COUNT(orders.order_id) DESC LIMIT 1;', '返回2023年最受欢迎的商品(订单数量最多的商品)的名称');
+('电商系统', '测试SQL助手的数据聚合和分组功能', '查询2023年最受欢迎的商品(订单数量最多的商品)', 'SELECT products.product_name FROM products JOIN orders ON products.product_id = orders.product_id WHERE strftime(''%Y'', orders.order_date) = ''2023'' GROUP BY products.product_id ORDER BY COUNT(orders.order_id) DESC LIMIT 1;', '返回2023年最受欢迎的商品(订单数量最多的商品)的名称');
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```sql
('电商系统', '测试SQL助手的数据聚合和分组功能', '查询2023年最受欢迎的商品(订单数量最多的商品)', 'SELECT products.product_name FROM products JOIN orders ON products.product_id = orders.product_id WHERE strftime(''%Y'', orders.order_date) = ''2023'' GROUP BY products.product_id ORDER BY COUNT(orders.order_id) DESC LIMIT 1;', '返回2023年最受欢迎的商品(订单数量最多的商品)的名称');
```
🤖 Prompt for AI Agents
In `@docker/examples/sqls/test_case_info_sqlite.sql` at line 17, The SQL in the
VALUES tuple uses MySQL's YEAR(orders.order_date) which SQLite doesn't support;
update the SELECT in that tuple to use SQLite's strftime('%Y',
orders.order_date) and compare to the string '2023' (e.g. strftime('%Y',
orders.order_date) = '2023') so the expected_sql in the tuple (the one selecting
products.product_name with JOIN orders ...) is SQLite-compatible.
docker/examples/sqls/user_config_mysql.sql

```sql
USE mysql;
UPDATE user SET Host='%' WHERE User='root';
```
🧩 Analysis chain
🏁 Script executed:
```bash
# Check if the file exists and view its full content
find . -type f -name "user_config_mysql.sql" -o -path "*/docker/examples/sqls/user_config_mysql.sql"
```

Repository: khulnasoft/GPT-DB
Length of output: 105
🏁 Script executed:
```bash
# Get a more direct check
cat docker/examples/sqls/user_config_mysql.sql 2>/dev/null || echo "File not found or error reading"
```

Repository: khulnasoft/GPT-DB
Length of output: 133
🌐 Web query:
MySQL 8.0 best practices for changing user host, ALTER USER vs UPDATE mysql.user deprecated
💡 Result:
Best practice in MySQL 8.0: change the host with RENAME USER, not ALTER USER, and don’t UPDATE mysql.user
Key point: In MySQL, an “account” is the tuple 'user'@'host'. Changing the host means changing the account name, which you do with RENAME USER (or by creating a new account), not ALTER USER. RENAME USER can rename both the user and host parts, e.g. 'jeffrey'@'localhost' → 'jeffrey'@'127.0.0.1'. [2]
Recommended options
- Rename the account (best when you truly want to “move” it):
```sql
RENAME USER 'app'@'oldhost' TO 'app'@'newhost';
```

This keeps the account's privileges under the new name. Be aware of definer/orphan-object considerations noted in the manual. [2]
- Create a second account (best when you need both old and new host to work during a transition):
```sql
CREATE USER 'app'@'newhost' IDENTIFIED BY 'new_password';
-- then re-apply privileges:
SHOW GRANTS FOR 'app'@'oldhost';
-- run the shown GRANT statements but targeting 'app'@'newhost'
```

(There isn't a single "clone user" statement; you reapply grants.)
Why not UPDATE mysql.user?
Directly modifying grant tables with INSERT/UPDATE/DELETE is not recommended; changes won’t affect privilege checking until you reload privileges (for example via FLUSH PRIVILEGES), and it’s generally considered risky compared to account-management statements. [1] Also, some managed platforms explicitly disallow modifying mysql schema grant tables directly and require CREATE USER/GRANT/REVOKE etc. [3]
Summary:
- Change host: use `RENAME USER` (or create a new `'user'@'host'`). [2]
- Do not change host with `ALTER USER` (it's for account properties, not renaming).
- Avoid `UPDATE mysql.user`; use account-management statements instead. [1][3]
Sources: [1] [2] [3]
🌐 Web query:
MySQL Host='%' security implications remote root access
💡 Result:
In MySQL, the Host part of an account controls where that username may authenticate from. Setting root to Host='%' (that is, root@'%') means “root can log in from any host” (the % wildcard matches any host) and it is the least-specific host match. [1][2]
Security implications of root@'%'
- Network-reachable superuser: If MySQL is reachable over the network (intentionally or by mistake), you’ve exposed the most powerful account to the internet/LAN, making it a prime target for password spraying/brute force and credential reuse attacks. [1][3]
- Full compromise blast radius: Compromising `root` typically implies full control of data (read/modify/delete), privilege changes (create users, grant rights), and potentially host-level impact depending on enabled features/privileges and server hardening.
- Misconfiguration amplification: Even if you "intend" MySQL to be private, real-world failures happen (cloud security group mistake, port-forward, VPN misroute). `root@'%'` turns those into immediate high-severity incidents. Guidance and tooling commonly flag "root can login remotely" as a security finding. [4]
Recommended safer patterns
- Do not allow remote root: Keep `root` restricted to local connections (for example `root@localhost`) and administer remotely via SSH/VPN, or a bastion. [3][4]
- Use a separate admin account, restricted by source IP/CIDR (instead of `%`), and grant only what's needed. MySQL supports IP/CIDR (or netmask) host specifications for accounts. [1][2]
- Add network controls: Bind MySQL to a private interface, and firewall/security-group allowlist only trusted admin/application IPs. [3]
Quick check (find remote-capable root accounts):
```sql
SELECT user, host FROM mysql.user WHERE user='root';
```

If you see `root@'%'` (or any non-local host), treat it as high risk and remove/lock it unless you have a very specific, well-defended need. [1][3]
Sources: MySQL account name/wildcard rules and meaning of % [1], MySQL connection matching and % being “any host”/least-specific [2], remote access hardening guidance incl. disabling remote root [3], MySQL security advisor item “Root Account Can Login Remotely” [4].
**Avoid enabling remote root access by default.**
Host='%' allows root login from any host. This is a significant security risk and is flagged by security scanners. Even in examples, this pattern should not be normalized. Additionally, direct UPDATE mysql.user is not recommended in modern MySQL; use CREATE USER and GRANT instead.
Keep root restricted to localhost and create a dedicated application user for remote access:
Safer alternative
```diff
 USE mysql;
-UPDATE user SET Host='%' WHERE User='root';
+-- Keep root local; create a dedicated user for remote access
+CREATE USER IF NOT EXISTS 'app_user'@'%' IDENTIFIED BY 'change_me';
+GRANT ALL PRIVILEGES ON *.* TO 'app_user'@'%' WITH GRANT OPTION;
 FLUSH PRIVILEGES;
```

🤖 Prompt for AI Agents
In `@docker/examples/sqls/user_config_mysql.sql` around lines 1 - 2, The SQL
snippet updates mysql.user to set Host='%' for User='root', which enables remote
root access and is insecure; replace this pattern by keeping root bound to
localhost (do not change Host='%' for User='root'), remove direct UPDATE
mysql.user calls, and instead create a dedicated application user via CREATE
USER and GRANT with only the necessary privileges for remote access, then FLUSH
PRIVILEGES; locate the statements UPDATE user SET Host='%' WHERE User='root' and
replace them with the CREATE USER / GRANT flow and avoid modifying the root
account.
Dockerfile

```dockerfile
# Set the VNC password
RUN mkdir -p /root/.vnc && \
    x11vnc -storepasswd 123456 /root/.vnc/passwd
```
**Remove the hardcoded VNC password (security risk).**

A fixed password in the image makes every container trivially accessible. Make the password runtime-configurable (env or secret) and avoid baking it into the image.
🔒 Suggested direction (example)
```diff
-# Set the VNC password
-RUN mkdir -p /root/.vnc && \
-    x11vnc -storepasswd 123456 /root/.vnc/passwd
+ARG VNC_PASSWORD
+RUN test -n "$VNC_PASSWORD"
+RUN mkdir -p /root/.vnc && \
+    x11vnc -storepasswd "$VNC_PASSWORD" /root/.vnc/passwd
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```dockerfile
ARG VNC_PASSWORD
RUN test -n "$VNC_PASSWORD"
RUN mkdir -p /root/.vnc && \
    x11vnc -storepasswd "$VNC_PASSWORD" /root/.vnc/passwd
```
🤖 Prompt for AI Agents
In `@Dockerfile` around lines 24 - 26, The Dockerfile currently hardcodes the VNC
password using the RUN line that calls x11vnc -storepasswd 123456
/root/.vnc/passwd; remove that hardcoded invocation and instead create a runtime
setup (entrypoint) that reads a VNC password from an environment variable (e.g.,
VNC_PASSWORD) or a secret file (e.g., /run/secrets/vnc_password), ensures
/root/.vnc exists, stores the password with x11vnc -storepasswd using the
provided value, and tightens file permissions on /root/.vnc/passwd; update the
Dockerfile to copy the entrypoint script and set it as ENTRYPOINT/CMD and keep a
safe fallback behavior (error or random password) when no password is provided.
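As a sketch of the entrypoint approach described in this prompt (the file paths, `VNC_PASSWORD` variable, and fallback policy are assumptions; a shell entrypoint would work equally well):

```python
#!/usr/bin/env python3
"""Hypothetical container entrypoint: store the VNC password at runtime."""
import os
import secrets
import subprocess
import sys

SECRET_FILE = "/run/secrets/vnc_password"  # assumed secret mount point

def resolve_password() -> str:
    if os.path.exists(SECRET_FILE):
        return open(SECRET_FILE).read().strip()
    if os.environ.get("VNC_PASSWORD"):
        return os.environ["VNC_PASSWORD"]
    # Safe fallback: random password, printed once to the container log
    password = secrets.token_urlsafe(12)
    print(f"VNC_PASSWORD not set; generated random password: {password}", file=sys.stderr)
    return password

def main() -> None:
    os.makedirs("/root/.vnc", exist_ok=True)
    subprocess.run(
        ["x11vnc", "-storepasswd", resolve_password(), "/root/.vnc/passwd"], check=True
    )
    os.chmod("/root/.vnc/passwd", 0o600)  # tighten permissions on the password file
    # ... exec the original CMD here (e.g., os.execvp(sys.argv[1], sys.argv[1:]))

if __name__ == "__main__":
    main()
```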
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@docs/blog/2025-03-24-gptdb-v0.7.0-release.md`:
- Line 68: Update the broken documentation links in the release blog post
(docs/blog/2025-03-24-gptdb-v0.7.0-release.md) to point to publicly accessible
URLs: either replace the unreachable http://gpt.khulnasoft.com/**...** links
(e.g. /docs/next/quickstart,
/docs/next/config-reference/app/config_chatdashboardconfig_2480d0,
/docs/next/config-reference/utils/config_s3storageconfig_f0cdc9,
/docs/next/installation/docker-build-guide, /docs/next/installation/docker) with
their live equivalents on the currently deployed docs site or temporarily revert
them to the previous working URLs; verify each updated link returns 200 OK
before merging and update all occurrences listed in the comment.
- Line 426: The link labeled "概览" currently points to
https://www.yuque.com/khulnasoft/gptdb-docs/bex30nsv60ru0fmx which appears
inaccessible/404; verify whether the correct public URL for the Chinese
documentation exists and is published, and then update the link target in
docs/blog/2025-03-24-gptdb-v0.7.0-release.md to the working URL (or replace with
an archived URL or mirror); if the doc requires authentication, either remove
the broken link or add a note beside "概览" indicating that authentication is
required and provide instructions or an alternative public resource.
In `@README.md`:
- Line 150: Replace the malformed character in the header string "### � Privacy
and Security" with the intended emoji or plain text (for example "### 🔒 Privacy
and Security" or "### Privacy and Security"), update the README.md header
accordingly, and ensure the file is saved in UTF-8 encoding so the emoji renders
correctly across systems.
- Around line 225-228: The README has a broken link due to a filename typo:
change the link target "DISCKAIMER.md" to the correct "DISCLAIMER.md" in the
Markdown link (the line containing the "Disclaimer" link) so the reference
points to the correct file name.
- Line 158: The header "### � Supported Data Sources" contains a malformed
Unicode character; update that header (the string "### � Supported Data
Sources") to use the intended character (e.g., replace "�" with a proper emoji
like "🔌" or "🔍", or remove it entirely so it reads "### Supported Data
Sources") and save the README.md with UTF-8 encoding to prevent future encoding
issues.
docs/blog/2025-03-24-gptdb-v0.7.0-release.md

```diff
 **And all Chat scenarios in original GPT-DB now have deep thinking capabilities.**

-For quick usage reference: [http://docs.gptdb.cn/docs/next/quickstart](http://docs.gptdb.cn/docs/next/quickstart)
+For quick usage reference: [http://gpt.khulnasoft.com/docs/next/quickstart](http://gpt.khulnasoft.com/docs/next/quickstart)
```
🧩 Analysis chain
🏁 Script executed:
```bash
# First, let's read the file to see all the URL changes
cat -n docs/blog/2025-03-24-gptdb-v0.7.0-release.md | sed -n '65,75p;160,170p;225,235p;325,335p;438,445p'
```

Repository: khulnasoft/GPT-DB
Length of output: 2809
🏁 Script executed:
```bash
# Test the accessibility of the documented URLs
urls=(
  "http://gpt.khulnasoft.com/docs/next/quickstart"
  "http://gpt.khulnasoft.com/docs/next/config-reference/app/config_chatdashboardconfig_2480d0"
  "http://gpt.khulnasoft.com/docs/next/config-reference/utils/config_s3storageconfig_f0cdc9"
  "http://gpt.khulnasoft.com/docs/next/installation/docker-build-guide"
  "http://gpt.khulnasoft.com/docs/next/installation/docker"
)
for url in "${urls[@]}"; do
  http_code=$(curl -s -o /dev/null -w "%{http_code}" -L "$url" 2>&1)
  echo "$url: HTTP $http_code"
done
```

Repository: khulnasoft/GPT-DB
Length of output: 461
**All new documentation URLs are currently inaccessible: HTTP connection failures on all tested links.**
The migrated documentation URLs across lines 68, 165, 167, 231, 328, 441–442 all fail to connect (HTTP 000). These include:
- http://gpt.khulnasoft.com/docs/next/quickstart
- http://gpt.khulnasoft.com/docs/next/config-reference/app/config_chatdashboardconfig_2480d0
- http://gpt.khulnasoft.com/docs/next/config-reference/utils/config_s3storageconfig_f0cdc9
- http://gpt.khulnasoft.com/docs/next/installation/docker-build-guide
- http://gpt.khulnasoft.com/docs/next/installation/docker
Ensure the new documentation domain is fully deployed and publicly accessible before merging, or temporarily revert to working URLs if the new domain is not yet ready.
🤖 Prompt for AI Agents
In `@docs/blog/2025-03-24-gptdb-v0.7.0-release.md` at line 68, Update the broken
documentation links in the release blog post
(docs/blog/2025-03-24-gptdb-v0.7.0-release.md) to point to publicly accessible
URLs: either replace the unreachable http://gpt.khulnasoft.com/**...** links
(e.g. /docs/next/quickstart,
/docs/next/config-reference/app/config_chatdashboardconfig_2480d0,
/docs/next/config-reference/utils/config_s3storageconfig_f0cdc9,
/docs/next/installation/docker-build-guide, /docs/next/installation/docker) with
their live equivalents on the currently deployed docs site or temporarily revert
them to the previous working URLs; verify each updated link returns 200 OK
before merging and update all occurrences listed in the comment.
docs/blog/2025-03-24-gptdb-v0.7.0-release.md

```diff
 **Chinese**

-[概览](https://www.yuque.com/eosphoros/gptdb-docs/bex30nsv60ru0fmx)
+[概览](https://www.yuque.com/khulnasoft/gptdb-docs/bex30nsv60ru0fmx)
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Check Chinese documentation URL accessibility
url="https://www.yuque.com/khulnasoft/gptdb-docs/bex30nsv60ru0fmx"
echo "Testing URL: $url"
# Use curl with timeout and follow redirects
response=$(curl -s -w "\n%{http_code}" -L --connect-timeout 5 --max-time 10 "$url" 2>&1)
http_code=$(echo "$response" | tail -n1)
body=$(echo "$response" | head -n-1)
echo "HTTP Status Code: $http_code"
# Check if response is valid
if [ "$http_code" -eq 200 ]; then
  echo "Result: URL is accessible"
elif [ "$http_code" -eq 301 ] || [ "$http_code" -eq 302 ] || [ "$http_code" -eq 303 ] || [ "$http_code" -eq 307 ] || [ "$http_code" -eq 308 ]; then
  echo "Result: URL redirects (Status: $http_code)"
elif [ "$http_code" -eq 404 ]; then
  echo "Result: URL not found (404)"
elif [ "$http_code" -eq 403 ]; then
  echo "Result: Access forbidden (403)"
else
  echo "Result: Unexpected status code ($http_code)"
fi
```

Repository: khulnasoft/GPT-DB
Length of output: 178
🌐 Web query:
Yuque documentation URL https://www.yuque.com/khulnasoft/gptdb-docs/bex30nsv60ru0fmx accessibility
💡 Result:
- Direct access from here failed: the specific doc URL timed out when fetched, so its public accessibility can’t be confirmed programmatically. [1]
- The parent repo URL https://www.yuque.com/khulnasoft/gptdb-docs returned 404 Not Found, which commonly indicates the knowledge base/repo slug isn't publicly reachable (deleted/renamed) or is access-restricted. [2]
What to do to verify/access it
- Open the URL in a normal browser session. If prompted, sign in to Yuque and request permission from the repo owner.
- If you have access but need non-interactive retrieval, use Yuque’s API with a Personal Access Token (common requirement for SDK/API access). [3][4]
Sources: [1] [2] [3] [4]
The Chinese documentation URL appears to be inaccessible or broken.
The parent documentation repository at https://www.yuque.com/khulnasoft/gptdb-docs returns a 404 error, and the specific URL at line 426 times out when accessed. This indicates the documentation may be deleted, renamed, or access-restricted. Verify that the correct URL has been published and is publicly accessible, or confirm whether authentication is required.
🤖 Prompt for AI Agents
In `@docs/blog/2025-03-24-gptdb-v0.7.0-release.md` at line 426, The link labeled
"概览" currently points to
https://www.yuque.com/khulnasoft/gptdb-docs/bex30nsv60ru0fmx which appears
inaccessible/404; verify whether the correct public URL for the Chinese
documentation exists and is published, and then update the link target in
docs/blog/2025-03-24-gptdb-v0.7.0-release.md to the working URL (or replace with
an archived URL or mirror); if the doc requires authentication, either remove
the broken link or add a note beside "概览" indicating that authentication is
required and provide instructions or an alternative public resource.
- **API Models:** Wenxin, Tongyi, Zhipu, and other commercial models
- **Fine-tuning Framework**: Model fine-tuning is an indispensable capability for any enterprise to implement in vertical and niche domains. GPT-DB provides a complete fine-tuning framework that integrates seamlessly with the GPT-DB project. In recent fine-tuning efforts, an accuracy rate based on the Spider dataset has been achieved at 82.5%.
### � Privacy and Security
Fix malformed character (encoding issue).
Line 150 contains a malformed character "�" that appears to be a Unicode/emoji rendering issue. This should be replaced with the intended emoji or icon.
🔧 Suggested fix
-### � Privacy and Security
+### 🔒 Privacy and Security
🤖 Prompt for AI Agents
In `@README.md` at line 150, Replace the malformed character in the header string
"### � Privacy and Security" with the intended emoji or plain text (for example
"### 🔒 Privacy and Security" or "### Privacy and Security"), update the
README.md header accordingly, and ensure the file is saved in UTF-8 encoding so
the emoji renders correctly across systems.
- **Secure Architecture** - End-to-end security framework
- **Data Sources**: Integrating various data sources to seamlessly connect production business data to the core capabilities of GPT-DB.
### � Supported Data Sources
Fix malformed character (encoding issue).
Line 158 contains a malformed character "�" that appears to be a Unicode/emoji rendering issue. This should be replaced with the intended emoji or icon.
🔧 Suggested fix
-### � Supported Data Sources
+### 📊 Supported Data Sources
🤖 Prompt for AI Agents
In `@README.md` at line 158, The header "### � Supported Data Sources" contains a
malformed Unicode character; update that header (the string "### � Supported
Data Sources") to use the intended character (e.g., replace "�" with a proper
emoji like "🔌" or "🔍", or remove it entirely so it reads "### Supported Data
Sources") and save the README.md with UTF-8 encoding to prevent future encoding
issues.
## ⚠️ Disclaimer
📖 **[Disclaimer](./DISCKAIMER.md)** - Please read our disclaimer for important information about the project.
Fix typo in disclaimer filename.
Line 227 contains a typo: "DISCKAIMER.md" should be "DISCLAIMER.md". This will result in a broken link.
🔧 Proposed fix
-📖 **[Disclaimer](./DISCKAIMER.md)** - Please read our disclaimer for important information about the project.
+📖 **[Disclaimer](./DISCLAIMER.md)** - Please read our disclaimer for important information about the project.
🤖 Prompt for AI Agents
In `@README.md` around lines 225 - 228, The README has a broken link due to a
filename typo: change the link target "DISCKAIMER.md" to the correct
"DISCLAIMER.md" in the Markdown link (the line containing the "Disclaimer" link)
so the reference points to the correct file name.
- Fix is_exist method to return False on exception instead of True to avoid false positives
- Fix string replace assignment in use-chat.ts to properly apply newline replacements
- Remove inefficient database cleanup logic in benchmark_data_manager.py
- Replace custom JSONPath implementation with jsonpath-ng library for robustness
- Pass system_app context to get_benchmark_manager() calls
- Refactor Neo4j delete_graph method to use single DETACH DELETE query for better performance
- Update HTTP-Referer header URL to use correct khulnasoft-bot repository
User description
Description
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.
Snapshots:
Include snapshots for easier review.
Checklist:
PR Type
Enhancement, Bug fix, Formatting
Description
- **Agent-based benchmarking support**: Added comprehensive agent API evaluation framework with `BenchmarkAgentTask`, response parsing strategies, and HTTP method configuration
- **Neo4j graph database integration**: Implemented Neo4j connector, store adapter, and community store adapter for graph operations and schema introspection
- **Benchmark data loading refactoring**: Migrated from CSV to SQLite merge functionality with GitHub Falcon repository support and JSON/JSONL parsing
- **File parsing service enhancements**: Added Excel and GitHub Falcon file parsers with standardized result models and Excel output support
- **Naming consistency updates**: Renamed `DbGpts*` classes to `GptDbs*` and updated repository references from `eosphoros` to `khulnasoft` organization
- **Full text search improvements**: Enhanced Elasticsearch with metadata filtering and added a full text search interface to the base storage class
- **Environment variable resolution**: Added support for resolving database passwords from environment variables in the connector manager
- **API enhancements**: Added `app_code` parameter to chat completion and HTTP request bodies for better app identification
- **Code quality improvements**: Comprehensive import statement reorganization across 50+ files to follow PEP 8 style guidelines, improved error logging with traceback information, and null/boolean value normalization in data comparison
- **Documentation updates**: Updated documentation URL references and package names in examples
Diagram Walkthrough
File Walkthrough
22 files
benchmark_data_manager.py
Refactor benchmark data loading from CSV to SQLite merge
packages/gptdb-serve/src/gptdb_serve/evaluate/service/fetchdata/benchmark_data_manager.py
- parsing with SQLite database merging
- `FileLoadResult`, `SqlFileItem`, `GoldenSqlListResult` for unified file loading results
- `load_file_from_github()` method supporting JSON/JSONL file parsing with multiple encoding support
- `_parse_json_file()` and `_parse_jsonl_file()` helper methods for flexible JSON data parsing (see the sketch below)
- improved error handling
- `async_after_start()` with `load_data()` method and removed table mapping functionality
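A minimal sketch of what a JSONL parser with multi-encoding support could look like; the function name, encoding list, and error handling here are illustrative assumptions, not the actual implementation in benchmark_data_manager.py:

```python
import json
from typing import Any, Dict, List, Optional

ENCODINGS = ("utf-8", "utf-8-sig", "gbk", "latin-1")  # assumed fallback order

def parse_jsonl_file(path: str) -> List[Dict[str, Any]]:
    """Parse a JSON Lines file, trying several encodings in order."""
    last_error: Optional[Exception] = None
    for encoding in ENCODINGS:
        try:
            items: List[Dict[str, Any]] = []
            with open(path, "r", encoding=encoding) as f:
                for line in f:
                    line = line.strip()
                    if line:  # skip blank lines
                        items.append(json.loads(line))
            return items
        except (UnicodeDecodeError, json.JSONDecodeError) as exc:
            last_error = exc  # try the next encoding
    raise ValueError(f"Could not parse {path}: {last_error}")
```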
neo4j_store_adapter.py
Add Neo4j Community Store Adapter implementation
packages/gptdb-ext/src/gptdb_ext/storage/knowledge_graph/community/neo4j_store_adapter.py
- operations
- operations on entities/relationships
- exploration with depth/direction control
- internal Vertex/Edge representations
falcon_file_parse.py
Add Falcon benchmark file parser service
packages/gptdb-serve/src/gptdb_serve/evaluate/service/benchmark/ext/falcon_file_parse.py
- environments
- `BenchmarkDataItem`, `ColumnItem`, `TableDDLItem`, `TableDataItem` for benchmark data structures
- execution in existing event loops
- with SQL schema formatting
controller.py
Refactor agent chat routing and memory class names
packages/gptdb-serve/src/gptdb_serve/agent/agents/controller.py
- `db_gpts_memory` to `gpt_dbs_memory` with updated class names
- `agent_chat_v2()` method
- `is_flow_chat()` method to check if app uses flow-based chat
- `app_agent_flow_chat()` method
- `app_agent_chat()` to conditionally route between flow and agent-based chat (see the sketch below)
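The conditional routing could look roughly like the sketch below; only the method names come from the walkthrough above, while the `team_mode` field and `"awel_layout"` marker are assumptions for illustration:

```python
class ChatController:
    """Illustrative routing skeleton; not the actual controller.py code."""

    def is_flow_chat(self, app: dict) -> bool:
        # Assumption: a flow-based app is marked by its team mode.
        return app.get("team_mode") == "awel_layout"

    async def app_agent_chat(self, app: dict, user_query: str) -> str:
        # Route to flow-based chat when the app declares a flow, else agents.
        if self.is_flow_chat(app):
            return await self.app_agent_flow_chat(app, user_query)
        return await self.agent_chat_v2(app, user_query)

    async def app_agent_flow_chat(self, app: dict, user_query: str) -> str:
        return f"[flow] {user_query}"

    async def agent_chat_v2(self, app: dict, user_query: str) -> str:
        return f"[agent] {user_query}"
```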
file_parse_service.py
Refactor file parse service with Excel output
packages/gptdb-serve/src/gptdb_serve/evaluate/service/benchmark/file_parse_service.py
- `parse_input_sets()` abstract method require `evaluation_env` parameter
- `@abstractmethod` decorator from `write_data_compare_result()` and `write_multi_round_benchmark_result()`
- `write_data_compare_result()` to write benchmark results to Excel with auto-sizing columns
- `parse_standard_benchmark_sets()` signature to include `evaluation_env` parameter
- `BenchmarkResultDao` and `ExcelUtils`
conn_neo4j.py
Add Neo4j datasource connector implementation
packages/gptdb-ext/src/gptdb_ext/datasource/conn_neo4j.py
- `Neo4jParameters` dataclass for connection configuration (host, port, user, password, database)
- `Neo4jConnector` class with methods for graph operations, schema introspection, and Cypher query execution
- support
connector_manager.py
Add Neo4j connector and environment variable resolution
packages/gptdb-serve/src/gptdb_serve/datasource/manages/connector_manager.py
- `Neo4jConnector` in `on_init()` method
- `_resolve_env_vars()`
- `test_connection()` to resolve environment variables in the password field (see the sketch below)
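A minimal sketch of the environment variable resolution idea, assuming a `${VAR}` placeholder syntax; the real `_resolve_env_vars()` may use a different convention:

```python
import os
import re

_ENV_PATTERN = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)\}")

def resolve_env_vars(value: str) -> str:
    """Replace ${VAR} placeholders with values from the environment."""
    def _sub(match: re.Match) -> str:
        name = match.group("name")
        resolved = os.environ.get(name)
        if resolved is None:
            raise ValueError(f"Environment variable {name} is not set")
        return resolved
    return _ENV_PATTERN.sub(_sub, value)

# e.g. password = resolve_env_vars("${MYSQL_PASSWORD}") before test_connection()
```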
benchmark_service.py
Add agent-based benchmarking support with GitHub Falcon integration
packages/gptdb-serve/src/gptdb_serve/evaluate/service/benchmark/benchmark_service.py
- with new imports for `BenchmarkAgentTask` and `StorageUtil`
- `api_url`, `http_method`, `headers`, `parse_strategy`, `response_mapping`, `evaluation_env`, `benchmark_type`
- `STANDARD_BENCHMARK_FILE_PATH` from local Excel file to GitHub Falcon repository URL
- evaluation environments with fallback defaults
- based on `benchmark_type` configuration
benchmark_agent_task.py
New benchmark agent task module for remote API evaluation
packages/gptdb-serve/src/gptdb_serve/evaluate/service/benchmark/task/benchmark_agent_task.py
- `BenchmarkAgentTask` class for evaluating remote agent APIs
- `ResponseParser` class with JSON path and direct parsing strategies (see the sketch below)
- response extraction
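A sketch of the two parse strategies using the jsonpath-ng library (which, per the commit message, replaced the custom JSONPath implementation); the constructor signature and strategy strings are assumptions:

```python
from typing import Any, Optional
from jsonpath_ng import parse as jsonpath_parse  # PyPI package: jsonpath-ng

class ResponseParser:
    """Extract an answer from a remote agent API response."""

    def __init__(self, strategy: str = "json_path", json_path: Optional[str] = None):
        self.strategy = strategy
        self.json_path = json_path

    def parse(self, response_body: Any) -> Any:
        if self.strategy == "direct":
            return response_body  # return the raw payload untouched
        expr = jsonpath_parse(self.json_path or "$")
        matches = expr.find(response_body)
        return matches[0].value if matches else None

# parser = ResponseParser(strategy="json_path", json_path="$.data.answer")
# answer = parser.parse({"data": {"answer": "SELECT 1"}})
```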
elasticsearch.py
Enhance Elasticsearch full text search with metadata filtering
packages/gptdb-ext/src/gptdb_ext/storage/full_text/elasticsearch.py
- `keyword` to `object` with `dynamic: True` for better queryability
- `is_support_full_text_search()` and `full_text_search()` methods to support full text search
- `_build_query()` and `_build_metadata_filter()` methods for metadata filtering with Elasticsearch bool queries (see the sketch below)
- `_normalize_metadata()` utility to ensure consistent metadata handling across dict, string, and JSON formats
- `load_document()` and search methods to use normalized metadata instead of JSON serialization
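A rough sketch of how a bool query with metadata term filters can be assembled; the `content` and `metadata.*` field names are assumptions, not the exact `_build_query()` output:

```python
from typing import Any, Dict, Optional

def build_query(text: str, metadata_filter: Optional[Dict[str, Any]] = None) -> Dict:
    """Combine a full text match with metadata term filters."""
    query: Dict[str, Any] = {
        "bool": {
            "must": [{"match": {"content": text}}],
        }
    }
    if metadata_filter:
        # Filter clauses narrow the hits without affecting relevance scoring.
        query["bool"]["filter"] = [
            {"term": {f"metadata.{key}": value}}
            for key, value in metadata_filter.items()
        ]
    return query

# build_query("revenue by region", {"doc_type": "report"})
```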
models.py
Add agent configuration models and evaluation environment enums
packages/gptdb-serve/src/gptdb_serve/evaluate/service/benchmark/models.py
- `EvaluationEnv` (DEV, TEST), `BenchmarkInvokeType` (LLM, AGENT), `HttpMethod`, `ResponseParseStrategy`
- `GITHUB` file parse type to `FileParseTypeEnum`
- `AgentApiConfig` dataclass with API endpoint, HTTP method, headers, response parsing configuration, and retry settings (see the sketch below)
- `AgentCompletionRequest` dataclass for agent API requests
- `BenchmarkExecuteConfig` with `evaluation_env`, `invoke_type`, and `agent_config` fields
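The configuration dataclass presumably looks something like this; field names follow the walkthrough, while the defaults and types are guesses:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class AgentApiConfig:
    """Illustrative shape of the agent API evaluation configuration."""

    api_url: str
    http_method: str = "POST"
    headers: Dict[str, str] = field(default_factory=dict)
    parse_strategy: str = "json_path"
    response_mapping: Optional[str] = None  # e.g. a JSON path like "$.data.answer"
    max_retries: int = 3  # retry settings are assumptions
    timeout_seconds: float = 60.0
```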
neo4j_store.py
Implement Neo4j graph store with resource registration
packages/gptdb-ext/src/gptdb_ext/storage/graph_store/neo4j_store.py
- `@register_resource` decorator with configuration parameters for Neo4j host, port, username, password, and database
- `Neo4jStoreConfig` dataclass with connection details and graph element type configurations
- `Neo4jStore` class implementation with connection initialization using `Neo4jConnector`
- `get_config()` and `is_exist()` methods for configuration retrieval and database existence checking
excel_file_parse.py
New Excel file parsing service for benchmark data extraction
packages/gptdb-serve/src/gptdb_serve/evaluate/service/benchmark/ext/excel_file_parse.py
- `ExcelFileParseService` for parsing Excel benchmark files
- `parse_input_sets()` method to extract benchmark data from Excel worksheets
- `parse_standard_benchmark_sets()` for parsing standard answer sets with comparison strategies
- `_parse_multi_standard_result()` utility for handling multiple standard results from JSON strings
user_input_execute_service.py
Refactor file service to support multiple file type parsers
packages/gptdb-serve/src/gptdb_serve/evaluate/service/benchmark/user_input_execute_service.py
- `file_type` parameter instead of `file_service` instance
- `file_service()` factory method to instantiate appropriate parser based on file type (Excel or GitHub Falcon)
- `read_input_file()` to accept `evaluation_env` parameter and pass it to parser
- `post_dispatch()` to pass `evaluation_env` to standard benchmark parsing
- `BenchmarkResultDao` initialization to class level
base.py
Add full text search interface to base class
packages/gptdb-core/src/gptdb/storage/full_text/base.py
- `is_support_full_text_search()` method returning `True` for full text stores
- `full_text_search()` method as default implementation delegating to `similar_search_with_scores()` (see the sketch below)
- `aload_document()` signature to include optional `file_id` parameter for document-level tracking
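The default-delegation pattern could be as simple as the sketch below; method signatures are approximations of the base class interface, not the exact code:

```python
from typing import Any, List

class FullTextStoreBase:
    """Sketch: subclasses without a native full text index still satisfy the interface."""

    def is_support_full_text_search(self) -> bool:
        return True

    def similar_search_with_scores(
        self, text: str, topk: int, score_threshold: float
    ) -> List[Any]:
        raise NotImplementedError

    def full_text_search(
        self, text: str, topk: int = 10, score_threshold: float = 0.0
    ) -> List[Any]:
        # Default implementation: fall back to similarity search with scores.
        return self.similar_search_with_scores(text, topk, score_threshold)
```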
schemas.py
Extend benchmark request schema with agent configuration fields
packages/gptdb-serve/src/gptdb_serve/evaluate/api/schemas.py
- `datasets` field type from `List[dict]` to `Union[str, List[dict]]` to support string values
- `BenchmarkServeRequest`: `evaluation_env`, `benchmark_type`, `api_url`, `http_method`, `headers`, `parse_strategy`, `response_mapping`
endpoints.py
Update benchmark endpoint to support agent configuration
packages/gptdb-serve/src/gptdb_serve/evaluate/api/endpoints.py
- `_run_benchmark_task_sync()` to pass all parameters as keyword arguments
- `run_dataset_benchmark()` call: `benchmark_type`, `api_url`, `http_method`, `headers`, `parse_strategy`, `response_mapping`, `evaluation_env`
factory.py
Add Neo4j store adapter factory support
packages/gptdb-ext/src/gptdb_ext/storage/knowledge_graph/community/factory.py
- `Neo4jStore` and `Neo4jStoreAdapter`
- `Neo4jStoreAdapter` for `Neo4jStore` instances
api_v1.py
Add app_code parameter to chat completion requests
packages/gptdb-app/src/gptdb_app/openapi/api_v1/api_v1.py
- `app_code` parameter to dialogue context in chat completion endpoint
- `app_code` to both `dialogue.select_param` and streaming response calls
http_trigger.py
Add app_code field to HTTP request body
packages/gptdb-core/src/gptdb/core/awel/trigger/http_trigger.py
- `app_code` field to `CommonLLMHttpRequestBody` for app identification (see the sketch below)
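Conceptually, the change adds one optional field to the pydantic request model, roughly as sketched here; the other fields shown are placeholders, not the full model:

```python
from typing import Optional
from pydantic import BaseModel, Field

class CommonLLMHttpRequestBody(BaseModel):
    """Trimmed-down sketch, not the complete request body."""

    model: str = Field(..., description="Model name")
    messages: str = Field(..., description="User input messages")
    # New optional field so downstream handlers can identify the calling app.
    app_code: Optional[str] = Field(None, description="App code for app identification")
```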
StorageUtil.py
Add GitHub Falcon file type detection
packages/gptdb-core/src/gptdb/util/benchmarks/StorageUtil.py
- `GITHUB_FALCON_PREFIX` constant for GitHub Falcon repository URL
- `FileParseTypeEnum.GITHUB` for Falcon repository URLs (see the sketch below)
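Prefix-based detection is straightforward; note the prefix value below is a placeholder, since the actual `GITHUB_FALCON_PREFIX` constant is not shown here:

```python
from enum import Enum

GITHUB_FALCON_PREFIX = "https://github.com/"  # placeholder; the real constant may be narrower

class FileParseTypeEnum(Enum):
    EXCEL = "excel"
    GITHUB = "github"

def detect_file_parse_type(location: str) -> FileParseTypeEnum:
    """Pick a parser based on whether the input points at a Falcon repo URL."""
    if location.startswith(GITHUB_FALCON_PREFIX):
        return FileParseTypeEnum.GITHUB
    return FileParseTypeEnum.EXCEL
```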
opensearch.py
Add file_id parameter to async document loading
packages/gptdb-ext/src/gptdb_ext/storage/full_text/opensearch.py
- `file_id` parameter to `aload_document` method signature

52 files
gptdbs.py
Rename DbGpts classes to GptDbs for consistency
packages/gptdb-serve/src/gptdb_serve/agent/agents/gptdbs.py
- `DbGptsMessage` to `GptDbsMessage` for naming consistency
- `DbGptsTaskStep` to `GptDbsTaskStep`
- `DbGptsCompletion` to `GptDbsCompletion`
- convention
pack.py
Rename tool identifier constant for consistency
packages/gptdb-core/src/gptdb/agent/resource/tool/pack.py
- `DB_GPT_TOOL_IDENTIFIER` to `GPT_DB_TOOL_IDENTIFIER` for consistency
- `_is_function_tool()` function
cli_scripts.py
Reorganize gptdbs CLI imports for clarity
packages/gptdb-core/src/gptdb/cli/cli_scripts.py
- `add_repo`, `install`, `list_all_apps`, and other commands
serve_initialization.py
Update naming and simplify imports in serve initialization
packages/gptdb-app/src/gptdb_app/initialization/serve_initialization.py
- `DbGpts Register` to `GptDbs Register` for consistency
- `PromptServe`
simple_nl_schema_sql_chart_example.py
Fix import order and update package name in example
examples/awel/simple_nl_schema_sql_chart_example.py
- `gpt-db[openai]` to `db-gpt[openai]`
app_agent_manage.py
Update memory class names for consistency
packages/gptdb-serve/src/gptdb_serve/agent/agents/app_agent_manage.py
- `db_gpts_memory` to `gpt_dbs_memory`
- `MetaDbGptsPlansMemory` to `MetaGptDbsPlansMemory`
- `MetaDbGptsMessageMemory` to `MetaGptDbsMessageMemory`
react_agent_example.py
Fix import order in React agent example
examples/agents/react_agent_example.py
- `typing_extensions` after local imports
simple_intent_translator.py
Normalize entity naming in intent translator example
packages/gptdb-core/src/gptdb/rag/transformer/simple_intent_translator.py
- `GPTDB` to `gptdb` in example prompt for consistency
gpts_app.py
Fix import order in gpts_app module
packages/gptdb-serve/src/gptdb_serve/agent/db/gpts_app.py
vllm_llm.py
Fix import order and update environment variable naming
packages/gptdb-core/src/gptdb/model/llm/llm_out/vllm_llm.py
- `DB_GPT_MODEL_BENCHMARK` to `GPT_DB_MODEL_BENCHMARK`
sandbox_code_agent_example.py
Fix import order in sandbox code agent example
examples/agents/sandbox_code_agent_example.py
api_v2.py
Fix import order in API v2 module
packages/gptdb-app/src/gptdb_app/openapi/api_v2.py
oceanbase_store.py
Fix import order in OceanBase store module
packages/gptdb-ext/src/gptdb_ext/storage/vector_store/oceanbase_store.py
Import statement reorganization for code style across the following files:
- packages/gptdb-serve/src/gptdb_serve/flow/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-core/src/gptdb/util/i18n_utils.py: `gptdb.configs.model_config` import before try-except block
- packages/gptdb-serve/src/gptdb_serve/agent/app/recommend_question/recommend_question.py: `sqlalchemy` imports after internal package imports
- packages/gptdb-core/src/gptdb/storage/chat_history/storage_adapter.py: `sqlalchemy.orm.Session` import after internal package imports
- packages/gptdb-core/src/gptdb/model/utils/media_utils.py
- packages/gptdb-serve/src/gptdb_serve/file/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-app/src/gptdb_app/openapi/api_v1/links/conv_links.py: `gptdb.storage.metadata` import before third-party imports
- packages/gptdb-core/src/gptdb/model/__init__.py: `AutoLLMClient` import before try-except block
- packages/gptdb-ext/src/gptdb_ext/storage/knowledge_graph/community/tugraph_store_adapter.py: `packaging.version.Version` import after internal package imports
- packages/gptdb-serve/src/gptdb_serve/agent/hub/db/my_plugin_db.py: `sqlalchemy` imports after internal package imports
- examples/agents/custom_tool_agent_example.py: `typing_extensions` imports after internal package imports
- packages/gptdb-serve/src/gptdb_serve/file/models/models.py: `sqlalchemy` imports after internal package imports
- packages/gptdb-serve/src/gptdb_serve/gptdbs/my/models/models.py: `sqlalchemy` imports after internal package imports
- packages/gptdb-serve/src/gptdb_serve/model/models/models.py: `sqlalchemy` imports after internal package imports
- packages/gptdb-serve/src/gptdb_serve/utils/_template_files/default_serve_template/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-serve/src/gptdb_serve/feedback/models/models.py: `sqlalchemy` imports after internal package imports
- packages/gptdb-core/src/gptdb/storage/knowledge_graph/base.py: `pydantic.Field` import after internal package imports
- packages/gptdb-core/src/gptdb/agent/core/memory/agent_memory.py: `typing_extensions.TypedDict` import after internal package imports
- packages/gptdb-serve/src/gptdb_serve/prompt/models/models.py: `sqlalchemy` imports after internal package imports
- packages/gptdb-core/src/gptdb/util/tracer/base.py: `starlette.requests.Request` import after internal package imports
- packages/gptdb-core/src/gptdb/storage/cache/storage/disk/disk_storage.py: `rocksdict` imports after internal package imports
- packages/gptdb-serve/src/gptdb_serve/utils/_template_files/default_serve_template/models/models.py: `sqlalchemy` imports after internal package imports
- packages/gptdb-app/src/gptdb_app/openapi/api_v1/links/settings.py: `gptdb.storage.metadata` import before third-party imports
- packages/gptdb-serve/src/gptdb_serve/agent/db/gpts_plans_db.py: `sqlalchemy` imports after internal package imports
- packages/gptdb-serve/src/gptdb_serve/evaluate/db/benchmark_db.py: `gptdb.storage.metadata` import before third-party imports
- packages/gptdb-serve/src/gptdb_serve/gptdbs/hub/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-core/src/gptdb/core/awel/trigger/ext_http_trigger.py: `starlette.requests.Request` import after internal package imports
- packages/gptdb-serve/src/gptdb_serve/gptdbs/my/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-serve/src/gptdb_serve/agent/chat/models/models.py: `sqlalchemy` imports after internal package imports
- packages/gptdb-serve/src/gptdb_serve/agent/chat/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-serve/src/gptdb_serve/datasource/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-serve/src/gptdb_serve/evaluate/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-serve/src/gptdb_serve/feedback/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-serve/src/gptdb_serve/libro/models/models.py: `sqlalchemy` imports after internal package imports
- packages/gptdb-serve/src/gptdb_serve/libro/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-serve/src/gptdb_serve/rag/serve.py: `sqlalchemy` import after internal package imports
- packages/gptdb-core/src/gptdb/model/llm/llm_out/hf_chat_llm.py: `transformers` imports after internal package imports
- packages/gptdb-core/src/gptdb/storage/chat_history/chat_history_db.py: `gptdb.storage.metadata` import before third-party imports
- packages/gptdb-core/src/gptdb/util/code/server.py: `lyric` imports after internal package imports

2 files
repo.py
Update default gptdbs repository to khulnasoft organization
packages/gptdb-core/src/gptdb/util/gptdbs/repo.py
- `eosphoros/gptdbs` to `khulnasoft/gptdbs`
- `eosphoros` to `khulnasoft-bot` in GitHub links
- naming
model.py
Update plugin hub URL and fix import order
packages/gptdb-serve/src/gptdb_serve/agent/model.py
- `khulnasoft/GPT-DB-Plugins.git` to `khulnasoft-bot/GPT-DB-Plugins.git`

2 files
demo_selenium.py
Add Selenium WebDriver demo script
demo_selenium.py
- screenshot capture (see the sketch below)
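A minimal headless-screenshot script in the spirit of demo_selenium.py; the target URL and output filename are placeholders:

```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")  # run Chrome without a display
options.add_argument("--no-sandbox")

driver = webdriver.Chrome(options=options)
try:
    driver.get("https://example.com")
    driver.save_screenshot("screenshot.png")  # capture the rendered page
    print(f"Page title: {driver.title}")
finally:
    driver.quit()
```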
model_adapter.py
Rename environment variable for consistency
packages/gptdb-core/src/gptdb/model/adapter/model_adapter.py
- `DB_GPT_MODEL_BENCHMARK` to `GPT_DB_MODEL_BENCHMARK`

1 file
service.py
Improve error logging with traceback information
packages/gptdb-serve/src/gptdb_serve/rag/service/service.py
- debugging

1 file
data_compare_service.py
Add null and boolean value normalization in data comparison
packages/gptdb-serve/src/gptdb_serve/evaluate/service/benchmark/data_compare_service.py
- `null` to empty string
- `True`/`False` to `1`/`0` for database consistency (see the sketch below)
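The normalization rules can be captured in a small helper like this sketch (the function name is illustrative):

```python
from typing import Any

def normalize_value(value: Any) -> Any:
    """Normalize values so results compare consistently across databases."""
    if value is None:
        return ""  # null -> empty string
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return 1 if value else 0  # True/False -> 1/0
    return value

# normalize_value(None) == "", normalize_value(True) == 1, normalize_value(False) == 0
```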
1 file
simple_rag_summary_example.py
Update documentation URL reference
examples/awel/simple_rag_summary_example.py
- `docs.gptdb.cn` to `gpt.khulnasoft.com`

101 files
Summary by CodeRabbit
Release Notes
Chores
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.