Skip to content

Commit 1fdcd73

Browse files
authored
Merge pull request #89 from PranaviAncha/PranaviAncha/docs
Document performance characteristics and safe usage patterns for messaging service API
2 parents fd51651 + b2e3725 commit 1fdcd73

File tree

11 files changed

+546
-78
lines changed

11 files changed

+546
-78
lines changed

helix-core/src/main/java/org/apache/helix/ClusterMessagingService.java

Lines changed: 13 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -37,6 +37,13 @@
3737
public interface ClusterMessagingService {
3838
/**
3939
* Send message matching the specifications mentioned in recipientCriteria.
40+
*
41+
* <p><b>PERFORMANCE WARNING:</b> When recipientCriteria uses {@link DataSource#EXTERNALVIEW}
42+
* with wildcard or unspecified resource names, this scans <b>ALL</b> ExternalView znodes in the cluster,
43+
* regardless of other criteria like instanceName. At scale, this causes
44+
* severe performance degradation. Use {@link DataSource#LIVEINSTANCES} when you don't need
45+
* resource/partition filtering, or specify exact resource names when using EXTERNALVIEW.
46+
*
4047
* @param recipientCriteria criteria to be met, defined as {@link Criteria}
4148
* @See Criteria
4249
* @param message
@@ -54,6 +61,7 @@ public interface ClusterMessagingService {
5461
* This method will return after sending the messages. <br>
5562
* This is useful when message need to be sent and current thread need not
5663
* wait for response since processing will be done in another thread.
64+
*
5765
* @see #send(Criteria, Message)
5866
* @param recipientCriteria
5967
* @param message
@@ -85,7 +93,8 @@ int send(Criteria recipientCriteria, Message message, AsyncCallback callbackOnRe
8593
* for response. <br>
8694
* The current thread can use callbackOnReply instance to store application
8795
* specific data.
88-
* @see #send(Criteria, Message, AsyncCallback, int)
96+
*
97+
* @see #send(Criteria, Message)
8998
* @param recipientCriteria
9099
* @param message
91100
* @param callbackOnReply
@@ -96,7 +105,7 @@ int sendAndWait(Criteria recipientCriteria, Message message, AsyncCallback callb
96105
int timeOut);
97106

98107
/**
99-
* @see #send(Criteria, Message, AsyncCallback, int, int)
108+
* @see #send(Criteria, Message)
100109
* @param receipientCriteria
101110
* @param message
102111
* @param callbackOnReply
@@ -143,6 +152,8 @@ int sendAndWait(Criteria receipientCriteria, Message message, AsyncCallback call
143152
/**
144153
* This will generate all messages to be sent given the recipientCriteria and MessageTemplate,
145154
* the messages are not sent.
155+
*
156+
* @see #send(Criteria, Message)
146157
* @param recipientCriteria criteria to be met, defined as {@link Criteria}
147158
* @param messageTemplate the Message on which to base the messages to send
148159
* @return messages to be sent, grouped by the type of instance to send the message to

helix-core/src/main/java/org/apache/helix/Criteria.java

Lines changed: 47 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -20,9 +20,43 @@
2020
*/
2121

2222
/**
23-
* Describes various properties that operations involving {@link Message} delivery will follow.
23+
* Specifies recipient criteria for message delivery in a Helix cluster.
24+
*
25+
* <p><b>PERFORMANCE WARNING:</b> Using {@link DataSource#EXTERNALVIEW} with wildcard or unspecified
26+
* resource names causes Helix to scan ALL ExternalView znodes in the cluster, regardless of other
27+
* criteria fields. At scale, this causes severe performance degradation.
28+
*
29+
* <p><b>Example - Efficient Pattern:</b>
30+
* <pre>
31+
* // GOOD: Target specific live instance
32+
* Criteria criteria = new Criteria();
33+
* criteria.setInstanceName("host_1234");
34+
* criteria.setRecipientInstanceType(InstanceType.PARTICIPANT);
35+
* criteria.setDataSource(DataSource.LIVEINSTANCES); // Fast
36+
* criteria.setSessionSpecific(true);
37+
*
38+
* // BAD: Wildcard resource with ExternalView
39+
* Criteria criteria = new Criteria();
40+
* criteria.setInstanceName("host_1234");
41+
* criteria.setDataSource(DataSource.EXTERNALVIEW);
42+
* criteria.setResource("%"); // Scans ALL ExternalViews!
43+
* </pre>
44+
*
45+
* <p><b>DataSource Selection:</b>
46+
* <ul>
47+
* <li><b>LIVEINSTANCES:</b> Use when targeting live instances without resource/partition filtering. Fastest.</li>
48+
* <li><b>EXTERNALVIEW:</b> Use when filtering by resource, partition, or replica state.
49+
* ALWAYS specify exact resource names.</li>
50+
* <li><b>INSTANCES:</b> Use for targeting all configured instances based on instance config.</li>
51+
* <li><b>IDEALSTATES:</b> Use for targeting based on ideal state. Less common.</li>
52+
* </ul>
53+
*
54+
* @see ClusterMessagingService#send(Criteria, org.apache.helix.model.Message)
2455
*/
2556
public class Criteria {
57+
/**
58+
* Source of cluster state data for resolving message recipients.
59+
*/
2660
public enum DataSource {
2761
IDEALSTATES,
2862
EXTERNALVIEW,
@@ -80,8 +114,12 @@ public DataSource getDataSource() {
80114
}
81115

82116
/**
83-
* Set the current source of truth
84-
* @param source ideal state or external view
117+
* Set the current source of truth for resolving message recipients.
118+
*
119+
* <p>Prefer {@link DataSource#LIVEINSTANCES} when not filtering by resource/partition.
120+
* If using {@link DataSource#EXTERNALVIEW}, specify exact resource names to avoid full scans.
121+
*
122+
* @param source ideal state, external view, live instances, or instances
85123
*/
86124
public void setDataSource(DataSource source) {
87125
_dataSource = source;
@@ -161,8 +199,12 @@ public String getResource() {
161199
}
162200

163201
/**
164-
* Set the destination resource name
165-
* @param resourceName the resource name or % for all resources
202+
* Set the destination resource name.
203+
*
204+
* <p>Only meaningful for {@link DataSource#EXTERNALVIEW} or {@link DataSource#IDEALSTATES}.
205+
* Using wildcard "%" with EXTERNALVIEW reads ALL ExternalView znodes - use exact names instead.
206+
*
207+
* @param resourceName the exact resource name, or "%" for all resources
166208
*/
167209
public void setResource(String resourceName) {
168210
this.resourceName = resourceName;

helix-core/src/main/java/org/apache/helix/HelixManager.java

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -409,6 +409,8 @@ void addExternalViewChangeListener(org.apache.helix.ExternalViewChangeListener l
409409

410410
/**
411411
* Messaging service which can be used to send cluster wide messages.
412+
* See {@link ClusterMessagingService#send(Criteria, org.apache.helix.model.Message)} for usage.
413+
*
412414
* @return messaging service
413415
*/
414416
ClusterMessagingService getMessagingService();

helix-core/src/main/java/org/apache/helix/messaging/CriteriaEvaluator.java

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -39,12 +39,51 @@
3939
import org.slf4j.Logger;
4040
import org.slf4j.LoggerFactory;
4141

42+
/**
43+
* Evaluates {@link Criteria} against persisted Helix data to determine message recipients.
44+
*
45+
* <p><b>PERFORMANCE WARNING:</b> When using {@link DataSource#EXTERNALVIEW}, this evaluator
46+
* will scan <b>all</b> ExternalView znodes in the cluster if the resource name is unspecified or uses wildcards
47+
* (e.g., "%" or "*"). This scanning happens <b>even when targeting specific instances</b>, and is
48+
* NOT automatically optimized based on other criteria fields (like instanceName).
49+
*
50+
* <p>At high ExternalView cardinality, this can cause severe performance degradation.
51+
*
52+
* <p><b>Safer Patterns:</b>
53+
* <ul>
54+
* <li><b>Use {@link DataSource#LIVEINSTANCES}:</b> When you only need to target live instances
55+
* and do not require resource/partition-level filtering. This reads only the LIVEINSTANCES
56+
* znodes, which is typically much smaller and faster.</li>
57+
* <li><b>Specify exact resource names:</b> If ExternalView is required, provide specific resource
58+
* names in {@link Criteria#setResource(String)} instead of wildcards to limit the scan scope.</li>
59+
* </ul>
60+
*
61+
* <p><b>Example - Targeting a specific instance:</b>
62+
* <pre>
63+
* // BAD: Scans all ExternalViews even though instance is specified
64+
* Criteria criteria = new Criteria();
65+
* criteria.setInstanceName("instance123");
66+
* criteria.setDataSource(DataSource.EXTERNALVIEW);
67+
* criteria.setResource("%"); // wildcard triggers full scan
68+
*
69+
* // GOOD: Uses LIVEINSTANCES, avoids ExternalView scan
70+
* Criteria criteria = new Criteria();
71+
* criteria.setInstanceName("instance123");
72+
* criteria.setDataSource(DataSource.LIVEINSTANCES);
73+
* </pre>
74+
*/
4275
public class CriteriaEvaluator {
4376
private static Logger logger = LoggerFactory.getLogger(CriteriaEvaluator.class);
4477
public static final String MATCH_ALL_SYM = "%";
4578

4679
/**
4780
* Examine persisted data to match wildcards in {@link Criteria}
81+
*
82+
* <p><b>PERFORMANCE WARNING:</b> Using {@link DataSource#EXTERNALVIEW} with wildcard resource
83+
* names (or unspecified resource) will scan ALL ExternalView znodes, even when targeting specific
84+
* instances. At high cardinality, this can cause severe performance degradation. Prefer
85+
* {@link DataSource#LIVEINSTANCES} when resource/partition filtering is not needed.
86+
*
4887
* @param recipientCriteria Criteria specifying the message destinations
4988
* @param manager connection to the persisted data
5089
* @return map of evaluated criteria
@@ -56,6 +95,12 @@ public List<Map<String, String>> evaluateCriteria(Criteria recipientCriteria,
5695

5796
/**
5897
* Examine persisted data to match wildcards in {@link Criteria}
98+
*
99+
* <p><b>PERFORMANCE WARNING:</b> Using {@link DataSource#EXTERNALVIEW} with wildcard resource
100+
* names (or unspecified resource) will scan ALL ExternalView znodes, even when targeting specific
101+
* instances. At high cardinality, this can cause severe performance degradation. Prefer
102+
* {@link DataSource#LIVEINSTANCES} when resource/partition filtering is not needed.
103+
*
59104
* @param recipientCriteria Criteria specifying the message destinations
60105
* @param accessor connection to the persisted data
61106
* @return map of evaluated criteria

helix-core/src/main/java/org/apache/helix/messaging/package-info.java

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,9 @@
1818
*/
1919

2020
/**
21-
* Helix message handling classes
22-
*
21+
* Helix message handling classes.
22+
*
23+
* <p>When using the messaging API, configure {@link org.apache.helix.Criteria} carefully
24+
* to avoid performance issues. See {@link org.apache.helix.ClusterMessagingService}.
2325
*/
2426
package org.apache.helix.messaging;

website/0.9.9/src/site/markdown/Features.md

Lines changed: 125 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -220,32 +220,134 @@ Since Helix is aware of the global state of the system, it can send the message
220220
This is a very generic api and can also be used to schedule various periodic tasks in the cluster like data backups etc.
221221
System Admins can also perform adhoc tasks like on demand backup or execute a system command(like rm -rf ;-)) across all nodes.
222222

223+
#### Understanding Criteria and DataSource
224+
225+
The `Criteria` object allows you to specify message recipients using various attributes. A critical configuration is the `DataSource`, which determines where Helix looks up the cluster state to resolve your criteria.
226+
227+
**Available DataSource Options:**
228+
229+
Helix provides four DataSource types, each reading from different znodes in ZooKeeper:
230+
231+
| DataSource | Description | When to Use |
232+
|------------|-------------|-------------|
233+
| **LIVEINSTANCES** | Reads from `/LIVEINSTANCES` znodes | Targeting live instances without needing resource/partition/state filtering |
234+
| **INSTANCES** | Reads from `/INSTANCES/[instance]` znodes | Targeting specific configured instances (live or not) based on instance configuration |
235+
| **EXTERNALVIEW** | Reads from `/EXTERNALVIEWS/[resource]` znodes | Targeting based on actual replica placement, partition ownership, or replica state (MASTER/SLAVE) |
236+
| **IDEALSTATES** | Reads from `/IDEALSTATES/[resource]` znodes | Targeting based on ideal state configuration (intended placement) |
237+
238+
**Key Differences:**
239+
240+
- **LIVEINSTANCES**: Contains only instance names of currently connected participants. No resource/partition information. Smallest dataset.
241+
- **INSTANCES**: Contains instance configuration (host, port, enabled/disabled status). No resource/partition information.
242+
- **EXTERNALVIEW**: Contains actual current state - which instances own which partitions and their states (MASTER/SLAVE/OFFLINE). Large dataset at scale.
243+
- **IDEALSTATES**: Contains desired state - which instances should own which partitions. Similar size to ExternalView.
244+
245+
**Choosing the Right DataSource:**
246+
247+
| Your Goal | Correct DataSource | Example Use Case |
248+
|-----------|-------------------|------------------|
249+
| Send to specific live instance(s) | `LIVEINSTANCES` | Health check, admin command to specific node |
250+
| Send to all live instances | `LIVEINSTANCES` | Broadcast announcement, cluster-wide operation |
251+
| Send to replicas of a specific partition | `EXTERNALVIEW` (with exact resource name) | Bootstrap replica from peers |
252+
| Send to all MASTER replicas of a resource | `EXTERNALVIEW` (with exact resource name) | Trigger operation on masters only |
253+
| Send based on partition state | `EXTERNALVIEW` (with exact resource name) | Target only ONLINE/MASTER/SLAVE replicas |
254+
255+
#### CRITICAL: Performance Considerations
256+
257+
**⚠️ WARNING:** Using `EXTERNALVIEW` as the DataSource can cause severe performance issues at scale.
258+
259+
**The Problem:**
260+
When using `DataSource.EXTERNALVIEW`, Helix will scan **ALL** ExternalView znodes in the cluster if:
261+
- You use wildcards (`%` or `*`) in the resource name, OR
262+
- You leave the resource name unspecified
263+
264+
**This happens even when targeting specific instances!** The scan is NOT automatically optimized based on other criteria fields like `instanceName`.
265+
266+
At high ExternalView cardinality, this can cause severe performance degradation.
267+
268+
#### How to Set Criteria Correctly
269+
270+
**Pattern 1: Targeting Specific Instances (Most Common)**
271+
272+
When you only need to send messages to specific instances and don't need resource/partition-level filtering:
273+
274+
```java
275+
// GOOD: Efficient - Uses LIVEINSTANCES, avoids ExternalView scan
276+
Criteria criteria = new Criteria();
277+
criteria.setInstanceName("instance123"); // or "%" for all live instances
278+
criteria.setRecipientInstanceType(InstanceType.PARTICIPANT);
279+
criteria.setDataSource(DataSource.LIVEINSTANCES); // Key: Use LIVEINSTANCES
280+
criteria.setSessionSpecific(true);
281+
```
282+
283+
```java
284+
// BAD: Inefficient - Scans ALL ExternalViews even though targeting specific instance
285+
Criteria criteria = new Criteria();
286+
criteria.setInstanceName("instance123");
287+
criteria.setRecipientInstanceType(InstanceType.PARTICIPANT);
288+
criteria.setDataSource(DataSource.EXTERNALVIEW); // Will scan ALL resources!
289+
criteria.setResource("%"); // Wildcard triggers full scan
290+
```
291+
292+
**Pattern 2: Targeting Specific Resource and Partition**
293+
294+
When you need to send messages based on resource ownership (e.g., all replicas of a partition):
295+
296+
```java
297+
// GOOD: Efficient - Specifies exact resource name, scans only 1 ExternalView
298+
Criteria criteria = new Criteria();
299+
criteria.setInstanceName("%");
300+
criteria.setRecipientInstanceType(InstanceType.PARTICIPANT);
301+
criteria.setDataSource(DataSource.EXTERNALVIEW);
302+
criteria.setResource("MyDB"); // Exact resource name - scans only this EV
303+
criteria.setPartition("MyDB_0"); // Specific partition
304+
criteria.setPartitionState("MASTER"); // Only send to masters
305+
criteria.setSessionSpecific(true);
306+
```
307+
308+
```java
309+
// BAD: Inefficient - Wildcard resource scans ALL ExternalViews
310+
Criteria criteria = new Criteria();
311+
criteria.setInstanceName("%");
312+
criteria.setRecipientInstanceType(InstanceType.PARTICIPANT);
313+
criteria.setDataSource(DataSource.EXTERNALVIEW);
314+
criteria.setResource("%"); // Wildcard scans ALL ExternalViews in cluster!
315+
criteria.setPartition("MyDB_0");
316+
criteria.setSessionSpecific(true);
223317
```
224-
ClusterMessagingService messagingService = manager.getMessagingService();
225-
//CONSTRUCT THE MESSAGE
226-
Message requestBackupUriRequest = new Message(
227-
MessageType.USER_DEFINE_MSG, UUID.randomUUID().toString());
228-
requestBackupUriRequest
229-
.setMsgSubType(BootstrapProcess.REQUEST_BOOTSTRAP_URL);
230-
requestBackupUriRequest.setMsgState(MessageState.NEW);
231-
//SET THE RECIPIENT CRITERIA, All nodes that satisfy the criteria will receive the message
232-
Criteria recipientCriteria = new Criteria();
233-
recipientCriteria.setInstanceName("%");
234-
recipientCriteria.setRecipientInstanceType(InstanceType.PARTICIPANT);
235-
recipientCriteria.setResource("MyDB");
236-
recipientCriteria.setPartition("");
237-
//Should be processed only the process that is active at the time of sending the message.
238-
//This means if the recipient is restarted after message is sent, it will not be processed.
239-
recipientCriteria.setSessionSpecific(true);
240-
// wait for 30 seconds
241-
int timeout = 30000;
242-
//The handler that will be invoked when any recipient responds to the message.
243-
BootstrapReplyHandler responseHandler = new BootstrapReplyHandler();
244-
//This will return only after all recipients respond or after timeout.
245-
int sentMessageCount = messagingService.sendAndWait(recipientCriteria,
246-
requestBackupUriRequest, responseHandler, timeout);
318+
319+
**Pattern 3: Broadcasting to All Live Instances**
320+
321+
```java
322+
// GOOD: Efficient broadcast to all live participants
323+
Criteria criteria = new Criteria();
324+
criteria.setInstanceName("%");
325+
criteria.setRecipientInstanceType(InstanceType.PARTICIPANT);
326+
criteria.setDataSource(DataSource.LIVEINSTANCES); // Fast broadcast
327+
criteria.setSessionSpecific(true);
247328
```
248329

330+
#### Criteria Configuration Reference
331+
332+
The `Criteria` class provides the following configuration methods:
333+
334+
| Method | Parameter | Description | Wildcard Support | Example |
335+
|--------|-----------|-------------|------------------|---------|
336+
| `setDataSource(DataSource)` | LIVEINSTANCES, INSTANCES, EXTERNALVIEW, IDEALSTATES | **MOST IMPORTANT:** Determines which znodes to read | N/A | `DataSource.LIVEINSTANCES` |
337+
| `setInstanceName(String)` | Instance name | Target specific instance(s) by name | Yes (`%` = all) | `"localhost_12918"` or `"%"` |
338+
| `setResource(String)` | Resource name | Filter by resource name (only meaningful for EXTERNALVIEW/IDEALSTATES) | Yes (`%` = all) | `"MyDatabase"` or `"%"` |
339+
| `setPartition(String)` | Partition name | Filter by specific partition (only meaningful for EXTERNALVIEW/IDEALSTATES) | Yes (`%` = all) | `"MyDatabase_0"` or `"%"` |
340+
| `setPartitionState(String)` | State name | Filter by replica state like MASTER, SLAVE, ONLINE, OFFLINE (only for EXTERNALVIEW/IDEALSTATES) | Yes (`%` = all) | `"MASTER"` or `"%"` |
341+
| `setRecipientInstanceType(InstanceType)` | PARTICIPANT, CONTROLLER, SPECTATOR | Type of Helix process to target | No | `InstanceType.PARTICIPANT` |
342+
| `setSessionSpecific(boolean)` | true/false | If true, message is only delivered to currently active sessions (not redelivered after restart) | No | `true` (recommended) |
343+
344+
**Important Notes:**
345+
346+
- **Wildcards:** Use `%` (SQL-style) or `*` to match all. Single underscore `_` matches any single character.
347+
- **DataSource Compatibility:** Setting `resource`, `partition`, or `partitionState` only makes sense with `EXTERNALVIEW` or `IDEALSTATES` DataSource. They are ignored for `LIVEINSTANCES` and `INSTANCES`.
348+
- **Session-Specific:** Set to `true` for most use cases to avoid redelivering messages after a participant restarts.
349+
- **Empty vs Wildcard:** Empty string `""` and wildcard `"%"` are treated the same - both match all.
350+
249351
See HelixManager.getMessagingService for more info.
250352

251353

website/0.9.9/src/site/markdown/tutorial_messaging.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,26 @@ under the License.
2525

2626
In this chapter, we\'ll learn about messaging, a convenient feature in Helix for sending messages between nodes of a cluster. This is an interesting feature that is quite useful in practice. It is common that nodes in a distributed system require a mechanism to interact with each other.
2727

28+
### Performance Considerations
29+
30+
**IMPORTANT:** When using the messaging API with `Criteria`, be aware of the following performance characteristics:
31+
32+
- **ExternalView Scanning:** By default, the messaging service uses `DataSource.EXTERNALVIEW` to resolve criteria. This can scan **all** ExternalView znodes in the cluster, even when targeting specific instances. At high resource cardinality, this can cause severe performance degradation.
33+
34+
**Recommended Patterns:**
35+
36+
- **Use `DataSource.LIVEINSTANCES`** when you only need to target live instances and do not require resource/partition-level filtering. This is much faster and more efficient.
37+
- **Specify exact resource names** instead of wildcards if you must use ExternalView scanning.
38+
39+
Example of efficient messaging:
40+
```java
41+
Criteria recipientCriteria = new Criteria();
42+
recipientCriteria.setInstanceName("instance123");
43+
recipientCriteria.setRecipientInstanceType(InstanceType.PARTICIPANT);
44+
recipientCriteria.setDataSource(DataSource.LIVEINSTANCES); // Efficient: avoids EV scan
45+
recipientCriteria.setSessionSpecific(true);
46+
```
47+
2848
### Example: Bootstrapping a Replica
2949

3050
Consider a search system where the index replica starts up and it does not have an index. A typical solution is to get the index from a common location, or to copy the index from another replica.

0 commit comments

Comments
 (0)