doc: update storage scale guidance to <4TB & validate url in ci#448
Merged
Restructure and clarify repository documentation. AGENTS.md was rewritten and condensed into a practical agent/dev guide with development commands, prerequisites (Hugo Extended, Node.js v16+), content/structure overview, CI/CD notes, and troubleshooting tips. README.md was replaced with a bilingual (ZH/EN) homepage including a 3-step quickstart, repo layout, common commands, contributing requirements, contact info, and license. contribution.md was expanded with a PR checklist and clearer contribution steps (fork/branch/PR with screenshots). These changes improve onboarding and contribution workflows for the docs site.
Bump documentation site version and update docs/configs to reflect HugeGraph 1.7.0 changes: update config.toml version to 1.7; add Version Change notices for Auth REST API in CN/EN; revise HugeGraph-Server quickstart (docker images, download/toolchain URLs) and add deprecation warnings for removed legacy backends (MySQL, PostgreSQL, Cassandra, ScyllaDB) in favor of 1.7 backends (RocksDB, HStore, HBase, Memory). Update default rest-server config docs: increase batch.max_* defaults, raise batch.max_write_ratio, set exception.allow_trace true, add log.slow_query_threshold, and add K8s / PD/Meta / Arthas configuration option sections.
Add HugeGraph-Spark-Connector quick start docs (English and Chinese). Add a Configuration section to HugeGraph-Hubble docs (server settings and Gremlin query limits) in both languages. Update hugegraph-tools docs (EN/CN) to document new graph commands (graph-create, graph-clone, graph-drop), authentication backup/restore (auth-backup, auth-restore), --thread-num option for relevant commands, and minor heading/usage adjustments.
Add comprehensive HugeGraph-AI documentation and update quickstart content. New files added (Chinese & English): config-reference.md (full configuration reference), hugegraph-ml.md (HugeGraph-ML overview, algorithms and examples), and rest-api.md (REST API reference including RAG and Text2Gremlin endpoints). Updated pages: _index.md (feature list and v1.5.0 highlights such as Text2Gremlin, multi-model vectors, bilingual prompts, LiteLLM support, enhanced rerankers), hugegraph-llm.md (LLM provider/LiteLLM, reranker and Text2Gremlin usage), and quick_start.md (language switching / bilingual prompt guide). Also tightened environment requirements (Python 3.10+, uv 0.7+, HugeGraph 1.5+) and updated ML algorithm count/details to 21. These changes expand docs for deploying and integrating LLM, Text2Gremlin, ML workflows and REST APIs.
Rewrite and restructure the Computer configuration documentation in both Chinese and English. The changes replace the old flat option tables with organized sections (Basics, Algorithm, Input, Snapshot/Storage, Worker/Master, I/O/Output, Network/Transport, Storage, BSP, Performance, System-managed, K8s Operator, KubeDriver and CRD). Added clear default-value semantics, examples for local and MinIO snapshots, notes about system-managed options (do not modify), and more explanatory text for many options. Also updated related CRD/KubeDriver fields formatting and clarified operator environment variable mapping. Minor related updates applied to the quickstart computing hugegraph-computer page.
Change repository links from incubator to apache/hugegraph and apache/hugegraph-doc in config and many docs; update Chinese homepage and docs with refreshed wording (add AI/LLM, features blocks, revised introductions and CTAs); adjust multiple changelog weights and fix numerous internal/repo links and examples to reflect project rename and content improvements.
Add description frontmatter to many REST API docs (Chinese and English) to provide concise summaries for each API page. Update quickstart index pages (computing, hugegraph‑ai, toolchain) to recommend DeepWiki with direct links and add GitHub access links. Revise CN FAQ backend storage guidance to describe single-node (RocksDB) vs distributed (HStore) deployment, recommend selection by scale, and note deprecation of Cassandra/HBase/MySQL in later versions.
Revise configuration documentation for both CN and EN sites: update page titles and linkTitle fields for config guide and options, enrich the config index pages with Server configuration overview and links (including auth/HTTPS), and move/rename the Computer config page from config/ to quickstart/computing/ with updated title, linkTitle and weight. Also adjust wording about distributed deployment scale (changed from "100GB+" to "< 1000TB") in FAQ and introduction pages for both languages.
Rename config guide and reference titles (English & Chinese) to "Server Startup Guide" and "Server Complete Configuration Manual"; reorganize config-option pages by converting optional/config blocks (K8s, Arthas, RPC server, HBase, Cassandra/ScyllaDB, MySQL/PostgreSQL) into collapsible <details> sections and group deprecated backends under a "≤ 1.5 Version Config (Legacy)" section. Also adjust index bullets ordering/labels. These changes improve readability and clearly mark legacy backend configs.
Adjust documentation recommendations for standalone deployments: change the small/medium-scale threshold from <1TB to <4TB across Chinese and English docs. Also update the distributed mode scale wording in index pages (from >= 1000TB to < 1000TB). Files updated: content/{cn,en}/docs/_index.md, content/{cn,en}/docs/guides/faq.md, and content/{cn,en}/docs/introduction/README.md, so that both translations stay consistent with the new guidance.
Pull request overview
Updates the documentation’s stated data-scale guidance for deployment modes, aligning standalone RocksDB recommendations to a higher threshold and reflecting the change across EN/CN pages.
Changes:
- Update Standalone (RocksDB) recommended data scale from `< 1TB` to `< 4TB` in Introduction and FAQ (EN/CN).
- Update the documentation landing page deployment-mode table data-scale values (EN/CN).
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| content/en/docs/introduction/README.md | Updates Standalone mode scale guidance to < 4TB. |
| content/en/docs/guides/faq.md | Updates Standalone mode scale guidance to < 4TB in the storage-selection FAQ. |
| content/en/docs/_index.md | Updates the deployment-mode table to reflect new data-scale guidance. |
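| content/cn/docs/introduction/README.md | Syncs the Standalone mode data-scale guidance to < 4TB. |
| content/cn/docs/guides/faq.md | Syncs the Standalone mode data-scale guidance in the FAQ to < 4TB. |
| content/cn/docs/_index.md | Syncs the data-scale values in the landing page deployment-mode table. |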
Pengzna
previously approved these changes
Feb 3, 2026
Add a new link validator script (dist/validate-links.sh) that scans markdown files for internal /docs/ and /cn/docs/ links and verifies target pages exist. Integrate the checker into the Hugo GitHub Actions workflow (.github/workflows/hugo.yml) so links are validated during CI. Fix several documentation links in both English and Chinese introduction pages to the correct /docs/quickstart/hugegraph/... path. Remove the obsolete DISCLAIMER file and tweak dist/README wording. Also add a TODO note in dist/validate-release.sh regarding TLP graduation and ASF infra migration.
Fix various broken/incorrect documentation links and improve link validation. Update CN and EN docs to point to the new quickstart path (/docs/quickstart/hugegraph/hugegraph-server) and correct several other internal links (computer config, loader, performance pages, release notes, ToplingDB reference). Minor wording/table fixes in quickstart and introduction pages. Enhance dist/validate-links.sh to strip trailing slashes from links, and make the script executable, so the validator correctly matches target files.
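For readers who want a feel for the check described above, here is a minimal Python sketch of the same idea. It is not the contents of dist/validate-links.sh (which is a shell script not shown here). The mapping of `/docs/...` to `<content_root>/en/docs/...` and `/cn/docs/...` to `<content_root>/cn/docs/...`, the candidate file names (`<page>.md`, `_index.md`, `README.md`, `index.md`), and the function names `target_exists` and `find_broken_links` are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Markdown links whose target starts with /docs/ or /cn/docs/.
# Anchors (#...) and anything after whitespace are left out of the capture.
LINK_RE = re.compile(r"\]\((/(?:cn/)?docs/[^)\s#]*)")


def target_exists(content_root: Path, link: str) -> bool:
    """Return True if an internal link resolves to a page under content_root.

    Assumed layout (hypothetical): /docs/x -> en/docs/x, /cn/docs/x -> cn/docs/x.
    """
    path = link.rstrip("/")  # strip trailing slashes before matching files
    if path.startswith("/cn/docs"):
        rel = Path("cn") / path[len("/cn/"):]
    else:
        rel = Path("en") / path.lstrip("/")
    base = content_root / rel
    candidates = [
        base.parent / (base.name + ".md"),  # leaf page, e.g. faq.md
        base / "_index.md",                 # section index
        base / "README.md",
        base / "index.md",
    ]
    return any(c.is_file() for c in candidates)


def find_broken_links(content_root: Path) -> list[tuple[str, str]]:
    """Scan every markdown file and collect (file, link) pairs with no target."""
    broken = []
    for md in sorted(content_root.rglob("*.md")):
        text = md.read_text(encoding="utf-8")
        for link in LINK_RE.findall(text):
            if not target_exists(content_root, link):
                broken.append((md.relative_to(content_root).as_posix(), link))
    return broken
```

In a CI step, a non-empty result from `find_broken_links(Path("content"))` would be the signal to fail the build; the leaf name is appended rather than set with `with_suffix` so that page names containing dots (such as versioned file names) are not truncated.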
Fix a typo in the performance docs by renaming the api-preformance directory to api-performance for both Chinese and English docs. Updated filenames (e.g. hugegraph-api-0.5.6-rocksdb/cassandra) and adjusted links in content/*/docs/SUMMARY.md and the performance index pages to point to the new paths. No substantive content changes—only file renames and link updates.
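A rename of this kind can be sketched in a few lines of Python. This is only an illustration under assumptions, not how the commit was produced: the helper name `fix_typo_dir` is hypothetical, and it rewrites every occurrence of the old directory name in markdown files rather than only link targets.

```python
from pathlib import Path

OLD, NEW = "api-preformance", "api-performance"


def fix_typo_dir(docs_root: Path) -> int:
    """Rename every OLD directory under docs_root, then rewrite references.

    Returns the number of markdown files whose text was changed.
    """
    # Collect first, then rename deepest paths first so that renaming a
    # parent never invalidates a child path still waiting in the list.
    dirs = [p for p in docs_root.rglob(OLD) if p.is_dir()]
    for d in sorted(dirs, key=lambda p: len(p.parts), reverse=True):
        d.rename(d.with_name(NEW))

    edited = 0
    for md in list(docs_root.rglob("*.md")):
        text = md.read_text(encoding="utf-8")
        if OLD in text:
            md.write_text(text.replace(OLD, NEW), encoding="utf-8")
            edited += 1
    return edited
```

Running a link check after such a rename is a cheap way to confirm that SUMMARY.md and the index pages point at the new paths.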
Pengzna
approved these changes
Feb 3, 2026
As title