okJiang (Member) commented Dec 16, 2025

First-time contributors' checklist

What is changed, added or deleted? (Required)

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions (in Chinese).

  • master (the latest development version)
  • v9.0 (TiDB 9.0 versions)
  • v8.5 (TiDB 8.5 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)

What is the related PR or file link(s)?

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

Signed-off-by: okjiang <819421878@qq.com>
@ti-chi-bot ti-chi-bot bot added missing-translation-status This PR does not have translation status info. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 16, 2025
okJiang (Member, Author) commented Dec 16, 2025

/cc @rleungx @LykxSassinator

@ti-chi-bot ti-chi-bot bot requested a review from rleungx December 16, 2025 10:15
ti-chi-bot bot commented Dec 16, 2025

@okJiang: GitHub didn't allow me to request PR reviews from the following users: LykxSassinator.

Note that only pingcap members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

/cc @rleungx @LykxSassinator

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

ti-chi-bot bot commented Dec 16, 2025

@rleungx: adding LGTM is restricted to approvers and reviewers in OWNERS files.


@qiancai qiancai added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label Dec 17, 2025
+ Default value: 100ms
+ Minimum value: 1ms

### `inspect-network-interval`
Collaborator

Suggested change
### `inspect-network-interval`
### `inspect-network-interval` <span class="version-mark">New in v8.5.5 and v9.0.0</span>

qiancai (Collaborator) commented Dec 17, 2025

/bot-review

>> scheduler config evict-leader-scheduler add-store 2 // Add leader eviction scheduling for store 2
>> scheduler config evict-leader-scheduler delete-store 2 // Remove leader eviction scheduling for store 2
>> scheduler add evict-slow-store-scheduler // When there is exactly one slow store, evict all Region leaders from that store
>> scheduler add evict-slow-store-scheduler // Automatically detect nodes that are slow on disk or network, and evict all Region leaders from the store when the conditions are met

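The revised description explains the scheduler's function more accurately, but the phrase "nodes that are slow on disk or network" could be more specific, to make clear that it detects disk-slow nodes or network-slow nodes. Also recommend making "when the conditions are met" concrete to improve the clarity of the document.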

Suggested change
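>> scheduler add evict-slow-store-scheduler // Automatically detect nodes that are slow on disk or network, and evict all Region leaders from the store when the conditions are met
>> scheduler add evict-slow-store-scheduler // Automatically detect disk-slow or network-slow nodes, and when a slow node is detected, evict all Region leaders from that store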


### `scheduler config evict-slow-store-scheduler`

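`evict-slow-store-scheduler` is used to stop PD from scheduling leaders to an abnormal TiKV node when that node experiences disk I/O or network jitter, and to actively evict leaders when necessary. TiKV reports both `SlowScore` (disk) and `NetworkSlowScore` (network) in store heartbeats. Both scores range from 1 to 100, and a higher value means the node is more likely to be abnormal.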

The original text does not explain `SlowScore` and `NetworkSlowScore` clearly enough, which can confuse users. Recommend stating explicitly that the two scores come from disk probing and network probing respectively, and explaining how PD uses them to make decisions.

Suggested change
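`evict-slow-store-scheduler` is used to stop PD from scheduling leaders to an abnormal TiKV node when that node experiences disk I/O or network jitter, and to actively evict leaders when necessary. TiKV reports both `SlowScore` (disk) and `NetworkSlowScore` (network) in store heartbeats. Both scores range from 1 to 100, and a higher value means the node is more likely to be abnormal.
`evict-slow-store-scheduler` is used to stop PD from scheduling leaders to an abnormal TiKV node when that node experiences disk I/O or network jitter, and to actively evict leaders when necessary. TiKV reports both `SlowScore` (produced by disk I/O probing) and `NetworkSlowScore` (produced by network probing) in store heartbeats. Both scores range from 1 to 100, and a higher value means the node is more likely to be abnormal. PD considers both scores together to determine whether a node is a slow node.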
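
For readers who want to picture the two-score model described in this suggestion, here is a minimal Python sketch. It is not PD's implementation: the `StoreHeartbeat` type, the `looks_slow` helper, the "take the larger score" rule, and the threshold of 80 are all hypothetical and chosen only for illustration. The only facts taken from the text are that both scores range from 1 to 100 and that a higher value means the node is more likely to be abnormal.

```python
from dataclasses import dataclass

# Score range stated in the documentation text.
SCORE_MIN, SCORE_MAX = 1, 100

# Hypothetical threshold for illustration only; not a PD default.
HYPOTHETICAL_SLOW_THRESHOLD = 80


@dataclass(frozen=True)
class StoreHeartbeat:
    """Hypothetical stand-in for the scores a store reports in a heartbeat."""
    store_id: int
    slow_score: int          # disk I/O probing result
    network_slow_score: int  # network probing result

    def __post_init__(self) -> None:
        # Reject values outside the documented 1-100 range.
        for name in ("slow_score", "network_slow_score"):
            value = getattr(self, name)
            if not SCORE_MIN <= value <= SCORE_MAX:
                raise ValueError(f"{name}={value} is outside {SCORE_MIN}-{SCORE_MAX}")


def looks_slow(hb: StoreHeartbeat, threshold: int = HYPOTHETICAL_SLOW_THRESHOLD) -> bool:
    """Assumed rule: a store looks slow if either score reaches the threshold."""
    return max(hb.slow_score, hb.network_slow_score) >= threshold


healthy = StoreHeartbeat(store_id=1, slow_score=10, network_slow_score=10)
network_slow = StoreHeartbeat(store_id=2, slow_score=1, network_slow_score=95)
```

Under this assumed rule, `looks_slow(network_slow)` is true even though the disk score is low, which is the case the network score is meant to cover.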


### `inspect-network-interval`

+ Controls the interval at which the TiKV HealthChecker actively sends network probes to PD and other TiKV nodes. It is used to calculate `NetworkSlowScore` and to report the network status of slow nodes to PD.

github-actions bot commented Dec 17, 2025

This sentence is not clear enough. Recommend stating the purpose of network probing and the role of `NetworkSlowScore` explicitly so that the logic flows better.

Suggested change
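+ Controls the interval at which the TiKV HealthChecker actively sends network probes to PD and other TiKV nodes. It is used to calculate `NetworkSlowScore` and to report the network status of slow nodes to PD.
+ Controls the interval at which the TiKV HealthChecker actively sends network probes to PD and other TiKV nodes. The probe results are used to calculate `NetworkSlowScore`, which is reported to PD to reflect the network status of slow nodes.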

### `inspect-network-interval`

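+ Controls the interval at which the TiKV HealthChecker actively sends network probes to PD and other TiKV nodes. It is used to calculate `NetworkSlowScore` and to report the network status of slow nodes to PD.
+ Setting this to `0` disables network probing. A smaller value means a higher sampling frequency, which amplifies network jitter more quickly, but also consumes more network and CPU resources.

github-actions bot commented Dec 17, 2025

The original phrase "amplifies network jitter" is imprecise, and the relationship between "a smaller value" and "a higher sampling frequency" could be expressed more directly.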

Suggested change
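+ Setting this to `0` disables network probing. A smaller value means a higher sampling frequency, which amplifies network jitter more quickly, but also consumes more network and CPU resources.
+ Setting this to `0` disables network probing. A smaller value means a higher probe frequency, which detects network latency more sensitively, but also consumes more network and CPU resources.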
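
To make the interval-versus-cost trade-off in this suggestion concrete, the following Python sketch converts an interval into a probe rate. It assumes one probe per target per interval, which the text does not state, and `probes_per_second` is a hypothetical helper rather than anything in TiKV. The `0` case follows the documented meaning of disabling network probing.

```python
def probes_per_second(interval_ms: int) -> float:
    """Probe rate per target, assuming exactly one probe per interval (an assumption)."""
    if interval_ms < 0:
        raise ValueError("interval must not be negative")
    if interval_ms == 0:
        # Documented behavior: 0 disables network probing.
        return 0.0
    return 1000.0 / interval_ms


default_rate = probes_per_second(100)  # the documented default of 100ms
minimum_rate = probes_per_second(1)    # the documented minimum of 1ms
```

Under that assumption, moving from the 100ms default to the 1ms minimum multiplies the probe traffic per target by 100, which is where the extra network and CPU cost comes from.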

@github-actions

✅ AI review completed, 7 comments generated.

@pingcap pingcap deleted a comment from github-actions bot Dec 17, 2025
Signed-off-by: okjiang <819421878@qq.com>
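- `pending`: indicates that the scheduler currently cannot generate scheduling operations. A scheduler in the `pending` state returns summary information to help users diagnose the problem. The summary includes status information about stores and explains why they cannot be selected for scheduling.
- `normal`: indicates that the scheduler currently does not need to schedule.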

### `scheduler config evict-slow-store-scheduler`
Collaborator

Suggested change
### `scheduler config evict-slow-store-scheduler`
### `scheduler config evict-slow-store-scheduler` <span class="version-mark">New in v8.5.5 and v9.0.0</span>

Member, Author

This section is not newly introduced; only the `enable-network-slow-store` option inside it is new.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
okJiang (Member, Author) commented Dec 17, 2025

/cc @LykxSassinator

ti-chi-bot bot commented Dec 17, 2025

@okJiang: GitHub didn't allow me to request PR reviews from the following users: LykxSassinator.

Note that only pingcap members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

/cc @LykxSassinator


@qiancai qiancai self-assigned this Dec 17, 2025
@qiancai qiancai added the translation/doing This PR’s assignee is translating this PR. label Dec 17, 2025
@ti-chi-bot ti-chi-bot bot removed the missing-translation-status This PR does not have translation status info. label Dec 17, 2025
@qiancai qiancai added the v9.0-beta.3 This PR/issue applies to TiDB v9.0-beta.3. label Dec 17, 2025
LykxSassinator (Contributor) left a comment

rest LGTM

Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>
ti-chi-bot bot commented Dec 17, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from qiancai. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot bot commented Dec 17, 2025

@okJiang: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-verify | c3069cd | link | true | `/test pull-verify` |

Full PR test history. Your PR dashboard.


Labels

needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. translation/doing This PR’s assignee is translating this PR. v9.0-beta.3 This PR/issue applies to TiDB v9.0-beta.3.

4 participants