DEV: (cmds+) updates for streams idempotent production #2637
base: feat-ros-8.6
| Configure idempotency settings for a stream using [`XCFGSET`]({{< relref "/commands/xcfgset" >}}): | ||
|
|
||
| ``` | ||
| XCFGSET mystream DURATION 300 MAXSIZE 1000 |
Parameter names were changed yesterday to IDMP-DURATION and IDMP-MAXSIZE.
Sorry for that.
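A minimal sketch of how a client might assemble the command with the renamed parameters. `build_xcfgset_args` is a hypothetical helper, not part of any Redis client, and the range checks simply reuse the limits quoted in the error list under review (1-86400 and 1-10,000), which are assumptions that may change before release.

```python
def build_xcfgset_args(key, duration, maxsize):
    """Return the argument list for XCFGSET using the renamed IDMP-* options.

    Hypothetical helper: the limits below are copied from the reviewed docs
    and are assumptions, not values confirmed against the server.
    """
    if not 1 <= duration <= 86400:
        raise ValueError("duration must be in the range 1..86400")
    if not 1 <= maxsize <= 10000:
        raise ValueError("maxsize must be in the range 1..10000")
    return [
        "XCFGSET", key,
        "IDMP-DURATION", str(duration),
        "IDMP-MAXSIZE", str(maxsize),
    ]


# Equivalent to the doc example, with the new parameter names.
print(" ".join(build_xcfgset_args("mystream", 300, 1000)))
```

The docs example would then presumably read `XCFGSET mystream IDMP-DURATION 300 IDMP-MAXSIZE 1000`.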
| Enables idempotent message processing (at-most-once production) to prevent duplicate entries. Available since Redis 8.6. | ||
|
|
||
| - `IDMPAUTO producer-id`: Automatically generates a unique idempotent ID (iid) for the specified producer-id. Redis tracks this iid to prevent duplicate messages from the same producer-id. | ||
| - `IDMP producer-id idempotent-id`: Uses the specified idempotent-id for the given producer-id. If this producer-id/idempotent-id combination was already used, the command returns the idempotent-id of the existing entry instead of creating a duplicate. |
The command returns the ID of the previously added entry (as you mention below), not the idempotent-id.
| - **ERR no such key**: The stream does not exist | ||
| - **ERR syntax error**: Invalid command syntax or missing required arguments | ||
| - **ERR invalid duration**: Duration value is outside the valid range (1-86400) | ||
| - **ERR invalid maxsize**: Maxsize value is outside the valid range (1-10,000) |
Nitpick: "10,000" - different format from "64000"
| weight: 10 | ||
| --- | ||
|
|
||
| In Redis 8.6, streams support idempotent message processing (at-most-once production) to prevent duplicate entries when using at-least-once delivery patterns. This feature enables reliable message submission with automatic deduplication. |
Idempotent message processing ensures that handling the same message multiple times produces the same system state as handling it once.
Beginning with Redis 8.6, streams support idempotent message processing (at-most-once production) to prevent duplicate entries when producers resend messages.
Producers may need to resend messages under two scenarios:
- Producer-Redis network issues (i.e., disconnection and reconnection)
  If a disconnection occurs after the producer executes `XADD`, but before it receives the reply, the producer has no way of knowing if that message was delivered to Redis.
- The producer crashes and restarts
  If the producer crashes after calling `XADD` but before receiving the reply and marking a message as “delivered”, after a restart, the producer has no way of knowing if that message was delivered to Redis.
In both cases, to guarantee that the message is added to the stream, the producer must call XADD again with the same message. Without idempotent message processing, a retry may result in a message being delivered twice. With idempotent message processing, producers can guarantee at-most-once production even under such scenarios.
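The retry scenario can be illustrated with a toy in-memory model. This is not how Redis implements the feature: `ToyStream`, its entry ID format, and `send_with_retry` are all hypothetical names used only to show why a resend duplicates the entry without a (pid, iid) pair and does not duplicate it with one.

```python
import itertools


class ToyStream:
    """Toy stand-in for a stream; only models deduplication by (pid, iid)."""

    def __init__(self):
        self.entries = []                # list of (entry_id, fields)
        self._seen = {}                  # (pid, iid) -> entry_id
        self._seq = itertools.count(1)   # toy entry IDs: "1-0", "2-0", ...

    def xadd(self, fields, pid=None, iid=None):
        key = (pid, iid)
        if pid is not None and key in self._seen:
            # Duplicate: return the ID of the previously added entry.
            return self._seen[key]
        entry_id = f"{next(self._seq)}-0"
        self.entries.append((entry_id, fields))
        if pid is not None:
            self._seen[key] = entry_id
        return entry_id


def send_with_retry(stream, fields, pid=None, iid=None):
    """Simulate a producer whose first reply is lost, forcing a resend."""
    stream.xadd(fields, pid, iid)          # reply never reaches the producer
    return stream.xadd(fields, pid, iid)   # so it sends the same message again


plain = ToyStream()
send_with_retry(plain, {"field": "value"})
print(len(plain.entries))  # two entries: the retry created a duplicate

idem = ToyStream()
send_with_retry(idem, {"field": "value"}, pid="producer1", iid="msg1")
print(len(idem.entries))   # one entry: the retry was recognized
```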
| Use the [`XADD`]({{< relref "/commands/xadd" >}}) command with idempotency parameters, `IDMP` or `IDMPAUTO`: | ||
|
|
||
| ``` | ||
| XADD mystream IDMP producer1 msg1 * field value # producer-provided iid |
I'm not sure it is clear that producer1 is the pid and msg1 is the iid.
| XADD mystream IDMPAUTO producer1 * field value | ||
| ``` | ||
|
|
||
| - `pid`: Unique identifier for the message producer. |
We should stress that:
For both IDMP and IDMPAUTO:
- Each producer app is required to use the same pid after it restarts.
For IDMP:
- Each producer app is responsible for:
  - Providing a unique iid for each entry (either globally, or just for each pid)
  - Reusing the same (pid, iid) when resending a message (even after it restarts)
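One way a producer could meet these requirements is to persist its pid and any unacknowledged messages, so that a restart resends them with the same (pid, iid). The sketch below is only an assumption about application design: `ProducerState`, the JSON file layout, and the method names are hypothetical, and the generated command follows the `XADD mystream IDMP producer1 msg1 * field value` shape from the docs under review.

```python
import json
import os


class ProducerState:
    """Hypothetical on-disk state: a fixed pid plus messages awaiting a reply."""

    def __init__(self, path, pid):
        self.path = path
        self.pid = pid
        self.pending = {}  # iid -> fields
        if os.path.exists(path):
            # After a restart, reuse the stored pid instead of the new argument.
            with open(path, encoding="utf-8") as f:
                saved = json.load(f)
            self.pid = saved["pid"]
            self.pending = saved["pending"]
        else:
            self._save()

    def _save(self):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump({"pid": self.pid, "pending": self.pending}, f)

    def enqueue(self, iid, fields):
        """Record the message before sending so its iid survives a crash."""
        self.pending[iid] = fields
        self._save()

    def mark_delivered(self, iid):
        """Forget a message once the XADD reply has been received."""
        self.pending.pop(iid, None)
        self._save()

    def resend_commands(self, stream):
        """Build XADD argument lists that reuse the original (pid, iid)."""
        commands = []
        for iid, fields in self.pending.items():
            flat = [part for pair in fields.items() for part in pair]
            commands.append(["XADD", stream, "IDMP", self.pid, iid, "*", *flat])
        return commands
```

A producer would call `enqueue` before sending, `mark_delivered` after the reply arrives, and `resend_commands` on startup.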
| XADD mystream IDMP producer2 msg1 * field value # Producer 2's tracking (independent) | ||
| ``` | ||
|
|
||
| Producers can use the same idempotent ID without conflicts. |
as long as they use different pids.
| Idempotent IDs are removed when either condition is met: | ||
|
|
||
| - Time-based: iids expire after the configured `DURATION`. | ||
| - Capacity-based: Oldest iids are evicted when `MAXSIZE` is reached. |
It is important to clarify that Redis never keeps more than maxsize iids per pid (in other words, MAXSIZE is "stronger" than DURATION).
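To make the interaction concrete, here is a toy model of the tracking for a single pid. It is a sketch of the behavior described in this comment, not Redis's actual data structure: `IidTracker`, its method names, and the exact expiry boundary are assumptions.

```python
from collections import OrderedDict


class IidTracker:
    """Toy per-pid tracker: iids expire by age and are capped by count."""

    def __init__(self, duration, maxsize):
        self.duration = duration
        self.maxsize = maxsize
        self._iids = OrderedDict()  # iid -> time first seen, oldest first

    def _expire(self, now):
        # Time-based removal: drop iids at least `duration` seconds old.
        while self._iids:
            oldest_iid = next(iter(self._iids))
            if now - self._iids[oldest_iid] < self.duration:
                break
            self._iids.popitem(last=False)

    def is_duplicate(self, iid, now):
        self._expire(now)
        return iid in self._iids

    def track(self, iid, now):
        self._expire(now)
        if iid in self._iids:
            return
        self._iids[iid] = now
        # Capacity-based removal wins: never keep more than maxsize iids,
        # even if the oldest ones have not reached `duration` yet.
        while len(self._iids) > self.maxsize:
            self._iids.popitem(last=False)


tracker = IidTracker(duration=300, maxsize=2)
tracker.track("a", now=0)
tracker.track("b", now=1)
tracker.track("c", now=2)
print(tracker.is_duplicate("a", now=3))  # "a" was evicted long before 300 s
```

In this model a resend of "a" at time 3 would be accepted as a new entry, which is why `MAXSIZE` needs to be sized for the producer's worst-case backlog and not only for `DURATION`.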
|
|
||
| - `idmp-duration`: Current duration setting. | ||
| - `idmp-maxsize`: Current maxsize setting. | ||
| - `pids-tracked`: Number of active producers. |
I'd avoid "active" as it is ambiguous.
I'd replace "active producers" with "the number of pids currently tracked in the stream".
(because an app may use the same pid for multiple producers, though it is wrong to do so).
| - `idmp-duration`: Current duration setting. | ||
| - `idmp-maxsize`: Current maxsize setting. | ||
| - `pids-tracked`: Number of active producers. | ||
| - `iids-tracked`: Total idempotent IDs currently stored. |
stored --> tracked.
| - `idmp-maxsize`: Current maxsize setting. | ||
| - `pids-tracked`: Number of active producers. | ||
| - `iids-tracked`: Total idempotent IDs currently stored. | ||
| - `iids-added`: Lifetime count of idempotent messages added. |
idempotent messages --> messages with idempotent ids.
| Use globally unique, persistent producer IDs: | ||
|
|
||
| - Recommended: UUID v4 for global uniqueness. | ||
| - Alternative: `hostname:process_id` or application-assigned IDs. |
I don't think it is a good idea. The process_id will likely change after a restart.
Is it a ChatGPT suggestion?
|
|
||
| Use globally unique, persistent producer IDs: | ||
|
|
||
| - Recommended: UUID v4 for global uniqueness. |
UUID for a producer ID is usually overkill.
Apps should use shorter pids (if possible). This would save memory and increase performance.
|
|
||
| - RDB/AOF: All producer-idempotent ID pairs are saved. | ||
| - Recovery: Tracking remains active after restart. | ||
| - Configuration: `DURATION` and `MAXSIZE` settings persist. |
IDMP-DURATION, IDMP-MAXSIZE
| - RDB/AOF: All producer-idempotent ID pairs are saved. | ||
| - Recovery: Tracking remains active after restart. | ||
| - Configuration: `DURATION` and `MAXSIZE` settings persist. | ||
| - Important: Running `XCFGSET` clears all existing tracking data. Use without `DURATION` and `MAXSIZE` to clear data. |
I don't think so.
The requirement was: Calling XCFGSET key with different duration or maxsize values than the current, clears the IDMP map for the given key.
So if the values are the same - do nothing.
I'm not sure if you can call it without any values. And even if you can - maybe it uses the default values?
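The requirement as stated in this comment can be sketched as follows. This models only the rule "clear the IDMP map when the values differ, otherwise do nothing"; `StreamIdmpConfig` is a hypothetical class, and the open question about calling `XCFGSET` with no values is deliberately left out because the comment does not settle it.

```python
class StreamIdmpConfig:
    """Toy model of one stream's idempotency settings and tracked pairs."""

    def __init__(self, duration, maxsize):
        self.duration = duration
        self.maxsize = maxsize
        self.tracked = set()  # (pid, iid) pairs currently tracked

    def xcfgset(self, duration, maxsize):
        """Apply new settings; return True if tracking data was cleared."""
        if (duration, maxsize) == (self.duration, self.maxsize):
            # Same values as the current configuration: do nothing.
            return False
        self.duration = duration
        self.maxsize = maxsize
        self.tracked.clear()
        return True


cfg = StreamIdmpConfig(duration=300, maxsize=1000)
cfg.tracked.add(("producer1", "msg1"))
print(cfg.xcfgset(300, 1000))  # unchanged values, tracking kept
print(cfg.xcfgset(600, 1000))  # different duration, tracking cleared
```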
| XRANGE mystream - + | ||
| {{% /redis-cli %}} | ||
|
|
||
| ### Idempotent message processing examples |
I'm not sure if this is mentioned:
IDMPAUTO pid and IDMP pid iid can be specified only when the entry id is *.
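A client-side guard for this restriction might look like the sketch below. `xadd_args` is a hypothetical helper, not an API of any Redis client; it also makes the argument order explicit (pid first, then iid), following the `XADD mystream IDMP producer1 msg1 * field value` example in the docs under review.

```python
def xadd_args(stream, fields, pid, iid=None, entry_id="*"):
    """Build XADD arguments with IDMP (iid given) or IDMPAUTO (iid omitted).

    Hypothetical helper: rejects explicit entry IDs because, per the review
    comment, idempotency options are only valid when the entry ID is "*".
    """
    if entry_id != "*":
        raise ValueError("IDMP and IDMPAUTO require the entry ID to be '*'")
    mode = ["IDMPAUTO", pid] if iid is None else ["IDMP", pid, iid]
    args = ["XADD", stream, *mode, entry_id]
    for name, value in fields.items():
        args += [name, str(value)]
    return args


# pid is "producer1"; iid is "msg1" in the first call and omitted in the second.
print(xadd_args("mystream", {"field": "value"}, "producer1", "msg1"))
print(xadd_args("mystream", {"field": "value"}, "producer1"))
```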
andy-stark-redis
left a comment
Some tiny things to check out, but otherwise language LGTM.
| You should know how long it may take your producer application to recover from a crash and start resending messages. | ||
| `DURATION` should be set accordingly. |
Avoid passive voice in the otherwise active-voice sentence? (Not major)
| You should know how long it may take your producer application to recover from a crash and start resending messages. | |
| `DURATION` should be set accordingly. | |
| You should know how long it may take your producer application to recover from a crash and start resending messages, | |
| so you should set `DURATION` accordingly. |
| XINFO STREAM mystream | ||
| ``` | ||
|
|
||
| Returns additional fields when idempotency is begin used: |
| Returns additional fields when idempotency is begin used: | |
| Returns additional fields when idempotency is being used: |
| - Memory: <1.5% additional memory usage. | ||
| - Latency: Negligible impact on per-operation latency. | ||
|
|
||
| Manual mode (IDMP) is slightly faster than automatic mode (IDMPAUTO) due to avoiding hash calculation. |
Maybe a bit neater?
| Manual mode (IDMP) is slightly faster than automatic mode (IDMPAUTO) due to avoiding hash calculation. | |
| Manual mode (IDMP) is slightly faster than automatic mode (IDMPAUTO) since it avoids hash calculation. |
|
|
||
| Here's an illustration of how message processing in Redis Streams works with and without idempotent production: | ||
|
|
||
| {{< image filename="images/dev/stream/stream-idempotency.png" alt="Idempotent message processing in Redis Streams" >}} |
This could probably be done as a Mermaid diagram (more AI-friendly) but it's not essential - maybe getting this published quickly is the top priority for now!
No description provided.