
Conversation

@ti-chi-bot
Member

This is an automated cherry-pick of #21078

First-time contributors' checklist

What is changed, added or deleted? (Required)

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions.

  • master (the latest development version)
  • v9.0 (TiDB 9.0 versions)
  • v8.5 (TiDB 8.5 versions)
  • v8.4 (TiDB 8.4 versions)
  • v8.3 (TiDB 8.3 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)

What is the related PR or file link(s)?

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after being applied to another branch
  • Might cause conflicts after being applied to another branch

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot added the following labels on Dec 18, 2025:

  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • lgtm
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
  • type/cherry-pick-for-release-8.5: This PR is cherry-picked to release-8.5 from a source PR.
  • type/compatibility-or-feature-change: This PR involves compatibility changes or feature behavior changes.
  • v9.0-beta.2: This PR/issue applies to TiDB v9.0-beta.2.
@ti-chi-bot
Member Author

@Leavrth This PR has conflicts, so I have put it on hold.
Please resolve them or ask others to resolve them, then comment /unhold to remove the hold label.

@ti-chi-bot

ti-chi-bot bot commented Dec 18, 2025

@ti-chi-bot: If you want to know how to resolve it, please read the guide in the TiDB Dev Guide.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot

ti-chi-bot bot commented Dec 18, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign lance6716 for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist

Summary of Changes

Hello @ti-chi-bot, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request updates the documentation for TiDB's Backup & Restore (BR) tool, enhancing clarity around Point-in-Time Recovery (PITR) ID map management and introducing new features for checkpoint data storage and system table restoration. The changes provide users with more detailed guidance on configuring and understanding BR's behavior, particularly for newer versions, and introduce a more efficient method for restoring system tables.

Highlights

  • PITR ID Map Handling: The documentation for mysql.tidb_pitr_id_map has been updated to clarify its persistence, noting that from v9.0.0, if the table doesn't exist and --checkpoint-storage is not specified, the pitr_id_map data will be written to the log backup directory.
  • External Checkpoint Storage: New documentation has been added detailing the --checkpoint-storage parameter, which allows specifying an external storage location for checkpoint data, along with a description of the directory structure used for these checkpoints.
  • Physical System Table Restoration: The --fast-load-sys-tables parameter has been introduced (from v9.0.0), enabling physical restoration of system tables using RENAME TABLE DDL, which atomically swaps temporary system tables with those in the mysql database.
  • Load Statistics Parameter Update: The behavior of the --load-stats parameter has been clarified for v9.0.0; when set to false, BR no longer writes statistics to mysql.stats_meta, requiring a manual ANALYZE TABLE after the restore (see the sketch after this list).
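For illustration, here is a minimal sketch of how the parameters named in these highlights might be combined in a single restore invocation. The storage URLs are placeholders, and the `=false` spellings for disabling the boolean flags are assumptions rather than something stated in this PR:

```shell
# Hypothetical invocation; bucket names and prefixes are placeholders.
# --fast-load-sys-tables defaults to true; it is disabled here only to
# show the assumed syntax for turning it off.
./br restore full \
    -s "s3://backup-bucket/backup-prefix" \
    --checkpoint-storage "s3://temp-bucket/checkpoints" \
    --fast-load-sys-tables=false \
    --load-stats=false
```

With --load-stats=false, per the last highlight, statistics are not written to mysql.stats_meta, so an ANALYZE TABLE on the restored tables would be needed afterwards.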


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@ti-chi-bot added the size/L label (Denotes a PR that changes 100-499 lines, ignoring generated files) and removed the size/M label (Denotes a PR that changes 30-99 lines, ignoring generated files) on Dec 18, 2025.
@lilin90 self-assigned this PR and unassigned Leavrth on Dec 18, 2025.
gemini-code-assist bot left a comment

Code Review

This pull request updates documentation related to restore operations, specifically for checkpoint restore and snapshot restore. It introduces new features available from v9.0.0, such as storing checkpoint data in external storage and using --fast-load-sys-tables for physical restore of system tables.

My review focuses on improving the clarity and structure of the documentation. I've identified a critical issue where merge conflict markers were left in br/br-checkpoint-restore.md, which will break the document rendering. I've also suggested a significant restructuring of that document to avoid confusion caused by duplicated section headings and to better present the different methods for storing checkpoint data. Additionally, I've provided suggestions to correct inaccuracies (like 'target cluster' instead of 'external storage'), improve sentence structure for better readability, and ensure consistent formatting.

Comment on lines +88 to +162
<<<<<<< HEAD
Before entering the log restore phase during the initial restore, `br` constructs a mapping of upstream and downstream cluster database and table IDs at the `restored-ts` time point. This mapping is persisted in the system table `mysql.tidb_pitr_id_map` to prevent duplicate allocation of database and table IDs. Deleting data from `mysql.tidb_pitr_id_map` might lead to inconsistent PITR restore data.
=======
Note that before entering the log restore phase during the initial restore, `br` constructs a mapping of upstream and downstream cluster database and table IDs at the `restored-ts` time point. This mapping is persisted in the system table `mysql.tidb_pitr_id_map` to prevent duplicate allocation of database and table IDs. **Deleting data from `mysql.tidb_pitr_id_map` arbitrarily might lead to inconsistent PITR restore data.**

> **Note:**
>
> To ensure compatibility with clusters of earlier versions, starting from v9.0.0, if the system table `mysql.tidb_pitr_id_map` does not exist in the restore cluster, the `pitr_id_map` data will be written to the log backup directory. The file name is `pitr_id_maps/pitr_id_map.cluster_id:{downstream-cluster-ID}.restored_ts:{restored-ts}`.
## Implementation details: store checkpoint data in the external storage

> **Note:**
>
> Starting from v9.0.0, BR stores checkpoint data in the downstream cluster by default. You can specify an external storage for checkpoint data using the `--checkpoint-storage` parameter. For example:
>
> ```shell
> ./br restore full -s "s3://backup-bucket/backup-prefix" --checkpoint-storage "s3://temp-bucket/checkpoints"
> ```
In the external storage, the directory structure of the checkpoint data is as follows:
- Root path `restore-{downstream-cluster-ID}` uses the downstream cluster ID `{downstream-cluster-ID}` to distinguish between different restore clusters.
- Path `restore-{downstream-cluster-ID}/log` stores log file checkpoint data during the log restore phase.
- Path `restore-{downstream-cluster-ID}/sst` stores checkpoint data of the SST files that are not backed up by log backup during the log restore phase.
- Path `restore-{downstream-cluster-ID}/snapshot` stores checkpoint data during the snapshot restore phase.
```
.
`-- restore-{downstream-cluster-ID}
|-- log
| |-- checkpoint.meta
| |-- data
| | |-- {uuid}.cpt
| | |-- {uuid}.cpt
| | `-- {uuid}.cpt
| |-- ingest_index.meta
| `-- progress.meta
|-- snapshot
| |-- checkpoint.meta
| |-- checksum
| | |-- {uuid}.cpt
| | |-- {uuid}.cpt
| | `-- {uuid}.cpt
| `-- data
| |-- {uuid}.cpt
| |-- {uuid}.cpt
| `-- {uuid}.cpt
`-- sst
`-- checkpoint.meta
```
Checkpoint restore operations are divided into two parts: snapshot restore and PITR restore.

### Snapshot restore

During the initial restore, `br` creates a `restore-{downstream-cluster-ID}/snapshot` path in the target cluster. The path records checkpoint data, the upstream cluster ID, and the BackupTS of the backup data.

If the restore fails, you can retry it using the same command. `br` will automatically read the checkpoint information from the specified external storage path and resume from the last restore point.

If the restore fails and you try to restore backup data with different checkpoint information to the same cluster, `br` reports an error. It indicates that the current upstream cluster ID or BackupTS is different from the checkpoint record. If the restore cluster has been cleaned, you can manually clean up the checkpoint data in the external storage or specify another external storage path to store checkpoint data, and retry with a different backup.

### PITR restore

[PITR (Point-in-time recovery)](/br/br-pitr-guide.md) consists of snapshot restore and log restore phases.

During the initial restore, `br` first enters the snapshot restore phase. BR records the checkpoint data, the upstream cluster ID, BackupTS of the backup data (that is, the start time point `start-ts` of log restore) and the restored time point `restored-ts` of log restore in the `restore-{downstream-cluster-ID}/snapshot` path. If restore fails during this phase, you cannot adjust the `start-ts` and `restored-ts` of log restore when resuming checkpoint restore.

When entering the log restore phase during the initial restore, `br` creates a `restore-{downstream-cluster-ID}/log` path in the target cluster. This path records checkpoint data, the upstream cluster ID, and the restore time range (`start-ts` and `restored-ts`). If restore fails during this phase, you need to specify the same `start-ts` and `restored-ts` as recorded in the checkpoint database when retrying. Otherwise, `br` will report an error and prompt that the current specified restore time range or upstream cluster ID is different from the checkpoint record. If the restore cluster has been cleaned, you can manually clean up the checkpoint data in the external storage or specify another external storage path to store checkpoint data, and retry with a different backup.

Note that before entering the log restore phase during the initial restore, `br` constructs a mapping of the database and table IDs in the upstream and downstream clusters at the `restored-ts` time point. This mapping is persisted in the checkpoint storage with the file name `pitr_id_maps/pitr_id_map.cluster_id:{downstream-cluster-ID}.restored_ts:{restored-ts}` to prevent duplicate allocation of database and table IDs. **Deleting files from the directory `pitr_id_maps` arbitrarily might lead to inconsistent PITR restore data.**

> **Note:**
>
> To ensure compatibility with clusters of earlier versions, starting from v9.0.0, if the system table `mysql.tidb_pitr_id_map` does not exist in the restore cluster and the `--checkpoint-storage` parameter is not specified, the `pitr_id_map` data will be written to the log backup directory. The file name is `pitr_id_maps/pitr_id_map.cluster_id:{downstream-cluster-ID}.restored_ts:{restored-ts}`.
>>>>>>> 827df4ff8c (restore: update the definition of the parameter --load-stats and the usage of pitr id map (#21078))


critical

These merge conflict markers (<<<<<<< HEAD, =======, >>>>>>>) were left in the file. They need to be removed to ensure the document renders correctly. Please remove the markers and the content between <<<<<<< HEAD and =======.
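As a quick sanity check after resolving, a generic grep (not project tooling; the file path is the one named in this review) would surface any markers that remain:

```shell
# List leftover merge conflict markers; no output means the file is clean.
grep -nE '^(<<<<<<<|=======|>>>>>>>)' br/br-checkpoint-restore.md
```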

>
> To ensure compatibility with clusters of earlier versions, starting from v9.0.0, if the system table `mysql.tidb_pitr_id_map` does not exist in the restore cluster, the `pitr_id_map` data will be written to the log backup directory. The file name is `pitr_id_maps/pitr_id_map.cluster_id:{downstream-cluster-ID}.restored_ts:{restored-ts}`.
## Implementation details: store checkpoint data in the external storage


high

The document structure has become confusing with the addition of this section. There are now two sections with similar headings (## Implementation details and ## Implementation details: store checkpoint data in the external storage), both describing implementation details but for different storage methods. This can be confusing for readers.

To improve clarity, I suggest restructuring this part of the document. For example, you could have a single ## Implementation details section with two subsections:

## Implementation details

Starting from v9.0.0, BR can store checkpoint data in two ways: in the downstream cluster (default) or in an external storage (by specifying `--checkpoint-storage`).

### Storing checkpoint data in the downstream cluster

... (content from the existing section at line 70) ...

### Storing checkpoint data in an external storage

... (content from this new section) ...

This would provide a clearer structure for the user to understand the two different methods for storing checkpoint data.


### Snapshot restore

During the initial restore, `br` creates a `restore-{downstream-cluster-ID}/snapshot` path in the target cluster. The path records checkpoint data, the upstream cluster ID, and the BackupTS of the backup data.


medium

The phrase 'in the target cluster' is incorrect here, as this section describes storing data in an external storage. It should be 'in the specified external storage'. Also, the second sentence could be more active.

Suggested change
During the initial restore, `br` creates a `restore-{downstream-cluster-ID}/snapshot` path in the target cluster. The path records checkpoint data, the upstream cluster ID, and the BackupTS of the backup data.
During the initial restore, `br` creates a `restore-{downstream-cluster-ID}/snapshot` path in the specified external storage. In this path, `br` records checkpoint data, the upstream cluster ID, and the BackupTS of the backup data.


During the initial restore, `br` first enters the snapshot restore phase. BR records the checkpoint data, the upstream cluster ID, BackupTS of the backup data (that is, the start time point `start-ts` of log restore) and the restored time point `restored-ts` of log restore in the `restore-{downstream-cluster-ID}/snapshot` path. If restore fails during this phase, you cannot adjust the `start-ts` and `restored-ts` of log restore when resuming checkpoint restore.

When entering the log restore phase during the initial restore, `br` creates a `restore-{downstream-cluster-ID}/log` path in the target cluster. This path records checkpoint data, the upstream cluster ID, and the restore time range (`start-ts` and `restored-ts`). If restore fails during this phase, you need to specify the same `start-ts` and `restored-ts` as recorded in the checkpoint database when retrying. Otherwise, `br` will report an error and prompt that the current specified restore time range or upstream cluster ID is different from the checkpoint record. If the restore cluster has been cleaned, you can manually clean up the checkpoint data in the external storage or specify another external storage path to store checkpoint data, and retry with a different backup.


medium

The phrase 'in the target cluster' is incorrect here, as this section is about storing data in an external storage. It should be 'in the specified external storage'.

Suggested change
When entering the log restore phase during the initial restore, `br` creates a `restore-{downstream-cluster-ID}/log` path in the target cluster. This path records checkpoint data, the upstream cluster ID, and the restore time range (`start-ts` and `restored-ts`). If restore fails during this phase, you need to specify the same `start-ts` and `restored-ts` as recorded in the checkpoint database when retrying. Otherwise, `br` will report an error and prompt that the current specified restore time range or upstream cluster ID is different from the checkpoint record. If the restore cluster has been cleaned, you can manually clean up the checkpoint data in the external storage or specify another external storage path to store checkpoint data, and retry with a different backup.
When entering the log restore phase during the initial restore, `br` creates a `restore-{downstream-cluster-ID}/log` path in the specified external storage. This path records checkpoint data, the upstream cluster ID, and the restore time range (`start-ts` and `restored-ts`).

When the backup and restore feature backs up data, it stores statistics in JSON format within the `backupmeta` file. When restoring data, it loads statistics in JSON format into the cluster. For more information, see [LOAD STATS](/sql-statements/sql-statement-load-stats.md).

Starting from 9.0.0, BR introduces the `--fast-load-sys-tables` parameter, which is enabled by default. When restoring data to a new cluster using the `br` command-line tool, and the IDs of tables and partitions between the upstream and downstream clusters can be reused (otherwise, BR will automatically fall back to logically load statistics), enabling `--fast-load-sys-tables` lets BR to first restore the statistics-related system tables to the temporary system database `__TiDB_BR_Temporary_mysql`, and then atomically swap these tables with the corresponding tables in the `mysql` database using the `RENAME TABLE` statement.


medium

This sentence is very long and contains a lot of information, making it difficult to parse. For better readability, consider breaking it down into smaller, more focused sentences. Also, there are a few minor formatting and grammar issues. For example, 9.0.0 should be v9.0.0 for consistency, and some terms should be enclosed in backticks.

Suggested change
Starting from 9.0.0, BR introduces the `--fast-load-sys-tables` parameter, which is enabled by default. When restoring data to a new cluster using the `br` command-line tool, and the IDs of tables and partitions between the upstream and downstream clusters can be reused (otherwise, BR will automatically fall back to logically load statistics), enabling `--fast-load-sys-tables` lets BR to first restore the statistics-related system tables to the temporary system database `__TiDB_BR_Temporary_mysql`, and then atomically swap these tables with the corresponding tables in the `mysql` database using the `RENAME TABLE` statement.
Starting from v9.0.0, BR introduces the `--fast-load-sys-tables` parameter, which is enabled by default to enable physical restore of statistics. When you restore data to a new cluster using the `br` command-line tool, if the IDs of tables and partitions can be reused between the upstream and downstream clusters, BR first restores the statistics-related system tables to the temporary system database `__TiDB_BR_Temporary_mysql`. Then, it atomically swaps these tables with the corresponding tables in the `mysql` database using the `RENAME TABLE` statement. If table and partition IDs cannot be reused, BR automatically falls back to logically loading statistics.
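The atomic swap this suggestion describes can be pictured as a single multi-table RENAME TABLE statement. The sketch below is conceptual only: BR performs the swap internally, and the `_old` suffix and connection details are assumed for illustration:

```shell
# Conceptual sketch of the swap BR performs internally; not a command users
# need to run. The table suffix and connection parameters are assumptions.
mysql --host 127.0.0.1 --port 4000 --user root --execute "
  RENAME TABLE
    mysql.stats_meta TO __TiDB_BR_Temporary_mysql.stats_meta_old,
    __TiDB_BR_Temporary_mysql.stats_meta TO mysql.stats_meta;"
```

Because the multi-table RENAME TABLE is one statement, clients never observe a state where mysql.stats_meta is missing, which is the point of the swap.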

In the external storage, the directory structure of the checkpoint data is as follows:
- Root path `restore-{downstream-cluster-ID}` uses the downstream cluster ID `{downstream-cluster-ID}` to distinguish between different restore clusters.


low

This sentence is a bit redundant. For better readability, consider rephrasing it.

Suggested change
- Root path `restore-{downstream-cluster-ID}` uses the downstream cluster ID `{downstream-cluster-ID}` to distinguish between different restore clusters.
- The root path is `restore-{downstream-cluster-ID}`, where `{downstream-cluster-ID}` is the downstream cluster ID used to distinguish between different restore clusters.
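To see this layout in practice, you could list the checkpoint prefix in the external storage. The AWS CLI is shown here only as one assumed option; the bucket, prefix, and cluster ID are placeholders:

```shell
# List the checkpoint tree; substitute your own bucket, prefix, and
# downstream cluster ID.
aws s3 ls --recursive "s3://temp-bucket/checkpoints/restore-<downstream-cluster-ID>/"
```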


[PITR (Point-in-time recovery)](/br/br-pitr-guide.md) consists of snapshot restore and log restore phases.

During the initial restore, `br` first enters the snapshot restore phase. BR records the checkpoint data, the upstream cluster ID, BackupTS of the backup data (that is, the start time point `start-ts` of log restore) and the restored time point `restored-ts` of log restore in the `restore-{downstream-cluster-ID}/snapshot` path. If restore fails during this phase, you cannot adjust the `start-ts` and `restored-ts` of log restore when resuming checkpoint restore.


low

This sentence is quite long and complex, which can make it hard to read. Consider breaking it into smaller sentences for better clarity.

Suggested change
During the initial restore, `br` first enters the snapshot restore phase. BR records the checkpoint data, the upstream cluster ID, BackupTS of the backup data (that is, the start time point `start-ts` of log restore) and the restored time point `restored-ts` of log restore in the `restore-{downstream-cluster-ID}/snapshot` path. If restore fails during this phase, you cannot adjust the `start-ts` and `restored-ts` of log restore when resuming checkpoint restore.
During the initial restore, `br` first enters the snapshot restore phase. In this phase, BR records the checkpoint data, the upstream cluster ID, the BackupTS of the backup data (which is the start time point `start-ts` of the log restore), and the restored time point `restored-ts` of the log restore in the `restore-{downstream-cluster-ID}/snapshot` path. If the restore fails during this phase, you cannot adjust `start-ts` and `restored-ts` when resuming the checkpoint restore.
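When resuming, the retry simply repeats the original invocation with the same time range. A hedged sketch, assuming the `br restore point` flag names from the BR PITR documentation and placeholder values:

```shell
# Resume a failed PITR restore; restored-ts must match the value the
# checkpoint recorded (start-ts comes from the full backup's BackupTS).
./br restore point \
    --pd "127.0.0.1:2379" \
    --storage "s3://log-backup-bucket/log-prefix" \
    --full-backup-storage "s3://backup-bucket/backup-prefix" \
    --restored-ts "2025-12-18 12:00:00+0800"
```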

@ti-chi-bot

ti-chi-bot bot commented Dec 18, 2025

@ti-chi-bot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-verify | 14ef196 | link | true | /test pull-verify |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
