Supports configuring TCP Keepalive related parameters in Bookie Client. #4683

Merged
zymap merged 2 commits into apache:master from wolfstudy:add-netty-options
Nov 24, 2025
Conversation

@wolfstudy
Member

Motivation

In a private environment, network connectivity between data centers is restricted by firewall configuration. When the Broker accesses BookKeeper so infrequently that a connection stays idle longer than the firewall's default idle timeout (20 minutes), the firewall actively drops the Broker → BookKeeper connection.
At that point, when a new produce or consume request arrives, the Broker can no longer reach the BookKeeper node over that connection, and the resulting produce/consume failures do not recover automatically. The error messages mainly indicate connection-level anomalies:
(Screenshot: Clipboard_Screenshot_1763091013)

The current Broker → BookKeeper access path is built on the Netty communication framework, and keepalive at the TCP connection layer relies on SO_KEEPALIVE with the system default parameters. The relevant code is in PerChannelBookieClient → connect(), as follows:
(Screenshot: Clipboard_Screenshot_1763091110)

If no values are set explicitly, the system defaults in the following table are used directly:

Parameter | System default
-- | --
TCP_KEEPIDLE | 7200s
TCP_KEEPINTVL | 75s
TCP_KEEPCNT | 9
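
To make the gap between these defaults and the 20-minute firewall timeout concrete, here is a minimal sketch of the arithmetic. It assumes the usual keepalive model: the first probe is sent after TCP_KEEPIDLE seconds of idleness, further probes follow every TCP_KEEPINTVL seconds, and the connection is declared dead after TCP_KEEPCNT unanswered probes. The class and method names are illustrative and are not part of BookKeeper.

```java
public class KeepaliveTiming {

    /**
     * Approximate seconds until an idle connection to a dead peer is reported
     * as broken: the idle period plus one probe interval per unanswered probe.
     */
    public static long deadPeerDetectionSeconds(long keepIdle, long keepIntvl, long keepCnt) {
        return keepIdle + keepIntvl * keepCnt;
    }

    /**
     * True if the first keepalive probe is sent before a firewall with the
     * given idle timeout would drop the connection.
     */
    public static boolean probesBeforeTimeout(long keepIdle, long firewallIdleTimeout) {
        return keepIdle < firewallIdleTimeout;
    }

    public static void main(String[] args) {
        long firewallTimeout = 20 * 60; // 20 minutes, as described in the motivation

        // System defaults from the table above.
        System.out.println("detection (s): " + deadPeerDetectionSeconds(7200, 75, 9));
        System.out.println("probe before firewall timeout: " + probesBeforeTimeout(7200, firewallTimeout));

        // Hypothetical tuned values, chosen only for illustration.
        System.out.println("tuned detection (s): " + deadPeerDetectionSeconds(300, 30, 3));
        System.out.println("tuned probe before firewall timeout: " + probesBeforeTimeout(300, firewallTimeout));
    }
}
```

With the defaults, the first probe would only go out after 7200 seconds, long after a 1200-second firewall timeout has already dropped the idle connection, so the default keepalive settings cannot keep it open.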

Changes

Add the following three configuration keys to ClientConfiguration:

    public static final String TCP_KEEPIDLE = "tcpKeepIdle";
    public static final String TCP_KEEPINTVL = "tcpKeepIntvl";
    public static final String TCP_KEEPCNT = "tcpKeepCnt";

To maintain compatibility, the system defaults still apply unless these keys are set explicitly.
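
The "only override when set" behaviour can be sketched with the JDK alone. This is not the PR's Netty wiring in PerChannelBookieClient. As a stand-in it reads the three keys from a plain java.util.Properties object and maps them to the JDK's jdk.net.ExtendedSocketOptions (available since Java 11). The class KeepaliveOptions, its methods, and the positive-value check are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.Socket;
import java.net.SocketOption;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import jdk.net.ExtendedSocketOptions;

public class KeepaliveOptions {
    // Key names taken from the constants added in this PR.
    public static final String TCP_KEEPIDLE = "tcpKeepIdle";
    public static final String TCP_KEEPINTVL = "tcpKeepIntvl";
    public static final String TCP_KEEPCNT = "tcpKeepCnt";

    /** Returns only the options that were set explicitly; an empty map means "use OS defaults". */
    public static Map<SocketOption<Integer>, Integer> fromProperties(Properties props) {
        Map<SocketOption<Integer>, Integer> out = new LinkedHashMap<>();
        put(out, props, TCP_KEEPIDLE, ExtendedSocketOptions.TCP_KEEPIDLE);
        put(out, props, TCP_KEEPINTVL, ExtendedSocketOptions.TCP_KEEPINTERVAL);
        put(out, props, TCP_KEEPCNT, ExtendedSocketOptions.TCP_KEEPCOUNT);
        return out;
    }

    private static void put(Map<SocketOption<Integer>, Integer> out, Properties props,
                            String key, SocketOption<Integer> option) {
        String raw = props.getProperty(key);
        if (raw == null) {
            return; // key not configured: leave the system default untouched
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException(key + " must be positive, got " + value);
        }
        out.put(option, value);
    }

    /** Enables SO_KEEPALIVE and applies each configured option the platform supports. */
    public static void apply(Socket socket, Map<SocketOption<Integer>, Integer> options)
            throws IOException {
        socket.setKeepAlive(true);
        for (Map.Entry<SocketOption<Integer>, Integer> e : options.entrySet()) {
            if (socket.supportedOptions().contains(e.getKey())) {
                socket.setOption(e.getKey(), e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(TCP_KEEPIDLE, "300"); // hypothetical value, below a 20-minute firewall timeout
        System.out.println(fromProperties(props));
    }
}
```

Returning an empty map when no key is present keeps the compatibility property described above: nothing is overridden unless the operator opts in.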

Signed-off-by: xiaolongran <ranxiaolong716@gmail.com>
@wolfstudy wolfstudy removed their assignment Nov 14, 2025
@zymap zymap added this to the 4.18.0 milestone Nov 14, 2025
Signed-off-by: xiaolongran <ranxiaolong716@gmail.com>
Member

@StevenLuMT StevenLuMT left a comment

Good job

@StevenLuMT
Member

rerun failure checks

3 similar comments
@wolfstudy
Member Author

rerun failure checks

@wolfstudy
Member Author

rerun failure checks

@wolfstudy
Member Author

rerun failure checks

@zymap zymap merged commit ea7884a into apache:master Nov 24, 2025
102 of 118 checks passed
lhotari pushed a commit that referenced this pull request Nov 28, 2025
…t. (#4683)

(cherry picked from commit ea7884a)
priyanshu-ctds pushed a commit to datastax/bookkeeper that referenced this pull request Dec 2, 2025
…t. (apache#4683)

(cherry picked from commit ea7884a)
(cherry picked from commit e199e8d)
srinath-ctds pushed a commit to datastax/bookkeeper that referenced this pull request Dec 4, 2025
…t. (apache#4683)

(cherry picked from commit ea7884a)
(cherry picked from commit e199e8d)
5 participants