restore: update the definition of the parameter --load-stats and the usage of pitr id map (#21078) #22200

base: release-8.5
Conversation

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

@Leavrth This PR has conflicts, so I have put it on hold.

@ti-chi-bot: If you want to know how to resolve it, please read the guide in the TiDB Dev Guide.

Summary of Changes

This pull request updates the documentation for TiDB's Backup & Restore (BR) tool, enhancing clarity around Point-in-Time Recovery (PITR) ID map management and introducing new features for checkpoint data storage and system table restoration. The changes provide users with more detailed guidance on configuring and understanding BR's behavior, particularly for newer versions, and introduce a more efficient method for restoring system tables.

Code Review

This pull request updates documentation related to restore operations, specifically for checkpoint restore and snapshot restore. It introduces new features available from v9.0.0, such as storing checkpoint data in external storage and using `--fast-load-sys-tables` for physical restore of system tables.

My review focuses on improving the clarity and structure of the documentation. I've identified a critical issue where merge conflict markers were left in `br/br-checkpoint-restore.md`, which will break the document rendering. I've also suggested a significant restructuring of that document to avoid confusion caused by duplicated section headings and to better present the different methods for storing checkpoint data. Additionally, I've provided suggestions to correct inaccuracies (such as "target cluster" instead of "external storage"), improve sentence structure for better readability, and ensure consistent formatting.

<<<<<<< HEAD
Before entering the log restore phase during the initial restore, `br` constructs a mapping of upstream and downstream cluster database and table IDs at the `restored-ts` time point. This mapping is persisted in the system table `mysql.tidb_pitr_id_map` to prevent duplicate allocation of database and table IDs. Deleting data from `mysql.tidb_pitr_id_map` might lead to inconsistent PITR restore data.
=======
Note that before entering the log restore phase during the initial restore, `br` constructs a mapping of upstream and downstream cluster database and table IDs at the `restored-ts` time point. This mapping is persisted in the system table `mysql.tidb_pitr_id_map` to prevent duplicate allocation of database and table IDs. **Deleting data from `mysql.tidb_pitr_id_map` arbitrarily might lead to inconsistent PITR restore data.**

> **Note:**
>
> To ensure compatibility with clusters of earlier versions, starting from v9.0.0, if the system table `mysql.tidb_pitr_id_map` does not exist in the restore cluster, the `pitr_id_map` data will be written to the log backup directory. The file name is `pitr_id_maps/pitr_id_map.cluster_id:{downstream-cluster-ID}.restored_ts:{restored-ts}`.

## Implementation details: store checkpoint data in the external storage

> **Note:**
>
> Starting from v9.0.0, BR stores checkpoint data in the downstream cluster by default. You can specify an external storage for checkpoint data using the `--checkpoint-storage` parameter. For example:
>
> ```shell
> ./br restore full -s "s3://backup-bucket/backup-prefix" --checkpoint-storage "s3://temp-bucket/checkpoints"
> ```

In the external storage, the directory structure of the checkpoint data is as follows:

- Root path `restore-{downstream-cluster-ID}` uses the downstream cluster ID `{downstream-cluster-ID}` to distinguish between different restore clusters.
- Path `restore-{downstream-cluster-ID}/log` stores log file checkpoint data during the log restore phase.
- Path `restore-{downstream-cluster-ID}/sst` stores checkpoint data of the SST files that are not backed up by log backup during the log restore phase.
- Path `restore-{downstream-cluster-ID}/snapshot` stores checkpoint data during the snapshot restore phase.

```
.
`-- restore-{downstream-cluster-ID}
    |-- log
    |   |-- checkpoint.meta
    |   |-- data
    |   |   |-- {uuid}.cpt
    |   |   |-- {uuid}.cpt
    |   |   `-- {uuid}.cpt
    |   |-- ingest_index.meta
    |   `-- progress.meta
    |-- snapshot
    |   |-- checkpoint.meta
    |   |-- checksum
    |   |   |-- {uuid}.cpt
    |   |   |-- {uuid}.cpt
    |   |   `-- {uuid}.cpt
    |   `-- data
    |       |-- {uuid}.cpt
    |       |-- {uuid}.cpt
    |       `-- {uuid}.cpt
    `-- sst
        `-- checkpoint.meta
```
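
For instance, assuming the checkpoints were written to the `s3://temp-bucket/checkpoints` path from the example above, you could inspect this layout with a standard S3 client (a hypothetical listing; the subdirectories present depend on which restore phases have run):

```shell
# List all checkpoint files recorded for one downstream cluster.
# Replace {downstream-cluster-ID} with the actual cluster ID.
aws s3 ls --recursive "s3://temp-bucket/checkpoints/restore-{downstream-cluster-ID}/"
```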

Checkpoint restore operations are divided into two parts: snapshot restore and PITR restore.

### Snapshot restore

During the initial restore, `br` creates a `restore-{downstream-cluster-ID}/snapshot` path in the target cluster. The path records checkpoint data, the upstream cluster ID, and the BackupTS of the backup data.

If the restore fails, you can retry it using the same command. `br` will automatically read the checkpoint information from the specified external storage path and resume from the last restore point.
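
For example (a sketch reusing the placeholder buckets from the note above), the retry is literally the same invocation as the first attempt:

```shell
# First attempt: fails partway through; checkpoint data is left
# under the path given by --checkpoint-storage.
./br restore full -s "s3://backup-bucket/backup-prefix" --checkpoint-storage "s3://temp-bucket/checkpoints"

# Retry: identical command. br finds the existing checkpoint data in
# s3://temp-bucket/checkpoints and resumes from the last restore point.
./br restore full -s "s3://backup-bucket/backup-prefix" --checkpoint-storage "s3://temp-bucket/checkpoints"
```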

If the restore fails and you try to restore backup data with different checkpoint information to the same cluster, `br` reports an error. It indicates that the current upstream cluster ID or BackupTS is different from the checkpoint record. If the restore cluster has been cleaned, you can manually clean up the checkpoint data in the external storage or specify another external storage path to store checkpoint data, and retry with a different backup.
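
As a sketch (bucket names and the cluster ID are placeholders), the two ways to proceed with a different backup are:

```shell
# Option 1: clean up the stale checkpoint data in the external storage,
# for example with the AWS CLI, then retry with the new backup.
aws s3 rm --recursive "s3://temp-bucket/checkpoints/restore-{downstream-cluster-ID}/"
./br restore full -s "s3://backup-bucket/new-backup-prefix" --checkpoint-storage "s3://temp-bucket/checkpoints"

# Option 2: leave the old checkpoint data in place and point
# --checkpoint-storage at a fresh path instead.
./br restore full -s "s3://backup-bucket/new-backup-prefix" --checkpoint-storage "s3://temp-bucket/checkpoints-2"
```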

### PITR restore

[PITR (Point-in-time recovery)](/br/br-pitr-guide.md) consists of snapshot restore and log restore phases.

During the initial restore, `br` first enters the snapshot restore phase. BR records the checkpoint data, the upstream cluster ID, BackupTS of the backup data (that is, the start time point `start-ts` of log restore) and the restored time point `restored-ts` of log restore in the `restore-{downstream-cluster-ID}/snapshot` path. If restore fails during this phase, you cannot adjust the `start-ts` and `restored-ts` of log restore when resuming checkpoint restore.

When entering the log restore phase during the initial restore, `br` creates a `restore-{downstream-cluster-ID}/log` path in the target cluster. This path records checkpoint data, the upstream cluster ID, and the restore time range (`start-ts` and `restored-ts`). If restore fails during this phase, you need to specify the same `start-ts` and `restored-ts` as recorded in the checkpoint database when retrying. Otherwise, `br` will report an error and prompt that the current specified restore time range or upstream cluster ID is different from the checkpoint record. If the restore cluster has been cleaned, you can manually clean up the checkpoint data in the external storage or specify another external storage path to store checkpoint data, and retry with a different backup.
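
As an illustration (flag names follow the BR PITR guide; the PD address, paths, and timestamp are placeholders), a PITR restore and its retry might look like this:

```shell
# Initial PITR restore: --restored-ts pins the end of the restore time range.
./br restore point --pd "${PD_HOST}:2379" \
    --storage "s3://backup-bucket/log-backup-prefix" \
    --full-backup-storage "s3://backup-bucket/backup-prefix" \
    --restored-ts "2025-01-01 12:00:00+0800"

# Retry after a failure in the log restore phase: keep --restored-ts (and
# --start-ts, if it was specified) exactly as before, or br reports that the
# restore time range differs from the checkpoint record.
./br restore point --pd "${PD_HOST}:2379" \
    --storage "s3://backup-bucket/log-backup-prefix" \
    --full-backup-storage "s3://backup-bucket/backup-prefix" \
    --restored-ts "2025-01-01 12:00:00+0800"
```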

Note that before entering the log restore phase during the initial restore, `br` constructs a mapping of the database and table IDs in the upstream and downstream clusters at the `restored-ts` time point. This mapping is persisted in the checkpoint storage with the file name `pitr_id_maps/pitr_id_map.cluster_id:{downstream-cluster-ID}.restored_ts:{restored-ts}` to prevent duplicate allocation of database and table IDs. **Deleting files from the directory `pitr_id_maps` arbitrarily might lead to inconsistent PITR restore data.**
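
For illustration only, with a made-up downstream cluster ID and `restored-ts`, the persisted file in the checkpoint storage would look like this:

```shell
# Hypothetical object layout (both numbers are placeholders):
# pitr_id_maps/pitr_id_map.cluster_id:7321098765432101234.restored_ts:446789123456789120
```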

> **Note:**
>
> To ensure compatibility with clusters of earlier versions, starting from v9.0.0, if the system table `mysql.tidb_pitr_id_map` does not exist in the restore cluster and the `--checkpoint-storage` parameter is not specified, the `pitr_id_map` data will be written to the log backup directory. The file name is `pitr_id_maps/pitr_id_map.cluster_id:{downstream-cluster-ID}.restored_ts:{restored-ts}`.
>>>>>>> 827df4ff8c (restore: update the definition of the parameter --load-stats and the usage of pitr id map (#21078))

Review comment:

> > **Note:**
> >
> > To ensure compatibility with clusters of earlier versions, starting from v9.0.0, if the system table `mysql.tidb_pitr_id_map` does not exist in the restore cluster, the `pitr_id_map` data will be written to the log backup directory. The file name is `pitr_id_maps/pitr_id_map.cluster_id:{downstream-cluster-ID}.restored_ts:{restored-ts}`.
>
> ## Implementation details: store checkpoint data in the external storage

The document structure has become confusing with the addition of this section. There are now two sections with similar headings (`## Implementation details` and `## Implementation details: store checkpoint data in the external storage`), both describing implementation details but for different storage methods. This can be confusing for readers.

To improve clarity, I suggest restructuring this part of the document into a single `## Implementation details` section with two subsections:

> ## Implementation details
>
> Starting from v9.0.0, BR can store checkpoint data in two ways: in the downstream cluster (default) or in an external storage (by specifying `--checkpoint-storage`).
>
> ### Storing checkpoint data in the downstream cluster
>
> ... (content from the existing section at line 70) ...
>
> ### Storing checkpoint data in an external storage
>
> ... (content from this new section) ...

This would provide a clearer structure for the user to understand the two different methods for storing checkpoint data.

Review comment:

> ### Snapshot restore
>
> During the initial restore, `br` creates a `restore-{downstream-cluster-ID}/snapshot` path in the target cluster. The path records checkpoint data, the upstream cluster ID, and the BackupTS of the backup data.

The phrase 'in the target cluster' is incorrect here, as this section describes storing data in an external storage. It should be 'in the specified external storage'. Also, the second sentence could be more active.

Suggested change:

> During the initial restore, `br` creates a `restore-{downstream-cluster-ID}/snapshot` path in the specified external storage. In this path, `br` records checkpoint data, the upstream cluster ID, and the BackupTS of the backup data.

Review comment:

> During the initial restore, `br` first enters the snapshot restore phase. BR records the checkpoint data, the upstream cluster ID, BackupTS of the backup data (that is, the start time point `start-ts` of log restore) and the restored time point `restored-ts` of log restore in the `restore-{downstream-cluster-ID}/snapshot` path. If restore fails during this phase, you cannot adjust the `start-ts` and `restored-ts` of log restore when resuming checkpoint restore.
>
> When entering the log restore phase during the initial restore, `br` creates a `restore-{downstream-cluster-ID}/log` path in the target cluster. This path records checkpoint data, the upstream cluster ID, and the restore time range (`start-ts` and `restored-ts`). If restore fails during this phase, you need to specify the same `start-ts` and `restored-ts` as recorded in the checkpoint database when retrying. Otherwise, `br` will report an error and prompt that the current specified restore time range or upstream cluster ID is different from the checkpoint record. If the restore cluster has been cleaned, you can manually clean up the checkpoint data in the external storage or specify another external storage path to store checkpoint data, and retry with a different backup.

The phrase 'in the target cluster' is incorrect here, as this section is about storing data in an external storage. It should be 'in the specified external storage'.

Suggested change:

> When entering the log restore phase during the initial restore, `br` creates a `restore-{downstream-cluster-ID}/log` path in the specified external storage. This path records checkpoint data, the upstream cluster ID, and the restore time range (`start-ts` and `restored-ts`).

Review comment:

> When the backup and restore feature backs up data, it stores statistics in JSON format within the `backupmeta` file. When restoring data, it loads statistics in JSON format into the cluster. For more information, see [LOAD STATS](/sql-statements/sql-statement-load-stats.md).
>
> Starting from 9.0.0, BR introduces the `--fast-load-sys-tables` parameter, which is enabled by default. When restoring data to a new cluster using the `br` command-line tool, and the IDs of tables and partitions between the upstream and downstream clusters can be reused (otherwise, BR will automatically fall back to logically load statistics), enabling `--fast-load-sys-tables` lets BR to first restore the statistics-related system tables to the temporary system database `__TiDB_BR_Temporary_mysql`, and then atomically swap these tables with the corresponding tables in the `mysql` database using the `RENAME TABLE` statement.

This sentence is very long and contains a lot of information, making it difficult to parse. For better readability, consider breaking it down into smaller, more focused sentences. Also, there are a few minor formatting and grammar issues. For example, 9.0.0 should be v9.0.0 for consistency, and some terms should be enclosed in backticks.

Suggested change:

> Starting from v9.0.0, BR introduces the `--fast-load-sys-tables` parameter, which is enabled by default to enable physical restore of statistics. When you restore data to a new cluster using the `br` command-line tool, if the IDs of tables and partitions can be reused between the upstream and downstream clusters, BR first restores the statistics-related system tables to the temporary system database `__TiDB_BR_Temporary_mysql`. Then, it atomically swaps these tables with the corresponding tables in the `mysql` database using the `RENAME TABLE` statement. If table and partition IDs cannot be reused, BR automatically falls back to logically loading statistics.
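
To make the parameters discussed above concrete, here is a hedged sketch of the corresponding command-line invocations (the backup path is a placeholder; per the text, both parameters default to enabled):

```shell
# Default: statistics-related system tables are restored physically into the
# temporary database __TiDB_BR_Temporary_mysql and swapped in via RENAME TABLE.
./br restore full -s "s3://backup-bucket/backup-prefix"

# Disable the fast path so that statistics are loaded logically from JSON.
./br restore full -s "s3://backup-bucket/backup-prefix" --fast-load-sys-tables=false

# Skip loading statistics altogether via the --load-stats parameter.
./br restore full -s "s3://backup-bucket/backup-prefix" --load-stats=false
```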

Review comment:

> In the external storage, the directory structure of the checkpoint data is as follows:
>
> - Root path `restore-{downstream-cluster-ID}` uses the downstream cluster ID `{downstream-cluster-ID}` to distinguish between different restore clusters.

This sentence is a bit redundant. For better readability, consider rephrasing it.

Suggested change:

> - The root path is `restore-{downstream-cluster-ID}`, where `{downstream-cluster-ID}` is the downstream cluster ID used to distinguish between different restore clusters.

Review comment:

> [PITR (Point-in-time recovery)](/br/br-pitr-guide.md) consists of snapshot restore and log restore phases.
>
> During the initial restore, `br` first enters the snapshot restore phase. BR records the checkpoint data, the upstream cluster ID, BackupTS of the backup data (that is, the start time point `start-ts` of log restore) and the restored time point `restored-ts` of log restore in the `restore-{downstream-cluster-ID}/snapshot` path. If restore fails during this phase, you cannot adjust the `start-ts` and `restored-ts` of log restore when resuming checkpoint restore.

This sentence is quite long and complex, which can make it hard to read. Consider breaking it into smaller sentences for better clarity.

Suggested change:

> During the initial restore, `br` first enters the snapshot restore phase. In this phase, BR records the checkpoint data, the upstream cluster ID, the BackupTS of the backup data (which is the start time point `start-ts` of the log restore), and the restored time point `restored-ts` of the log restore in the `restore-{downstream-cluster-ID}/snapshot` path. If the restore fails during this phase, you cannot adjust `start-ts` and `restored-ts` when resuming the checkpoint restore.

@ti-chi-bot: The following test failed.

This is an automated cherry-pick of #21078
First-time contributors' checklist

What is changed, added or deleted? (Required)

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s): by default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER. For details, see tips for choosing the affected versions.

What is the related PR or file link(s)?

- `--load-stats` and the usage of pitr id map: docs-cn#20346

Do your changes match any of the following descriptions?