AIT-221: Document how token streaming interacts with rate limits #3092
base: AIT-129-AIT-Docs-release-branch
src/pages/docs/ai-transport/features/token-streaming/token-rate-limits.mdx
2. As the token rate approaches a threshold percentage of the [connection inbound message rate](/docs/platform/pricing/limits#connection), Ably batches tokens together automatically
3. Clients receive the same number of tokens per second, delivered in fewer messages

By default, a single response stream uses up to 50% of the connection inbound message rate. This allows two simultaneous response streams on the same channel or connection. [Contact Ably](/contact) to adjust this threshold if your application requires a different allocation.
I've challenged this (https://ably.atlassian.net/wiki/spaces/AI/pages/4624515073/AITDR-003+Append+batching+in+the+frontdoor?focusedCommentId=4667342849)
Should this mention that it's specifiable via transport param?
I've updated the text to match what is now in the DR. I'm deliberately not including a time value for the batching, because "we add 40ms latency to your token delivery time" sounds more negative than "we'll deliver your tokens in 25 messages/s so you don't hit your rate limits".
I wasn't sure if we'd have completed any client updates and docs for the transport param before we were ready to release these docs, so left the customer-configurable part out. Can update with this info and associated docs link if @SimonWoolf is confident that it will be done, or add it when the transport param docs go out.
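For anyone following the thread, the relationship between the two framings (40ms of added latency versus 25 messages/s) can be sketched as below. This is only an illustration: it assumes batching works as a fixed rollup window in which at most one message is emitted per window, and the function names are hypothetical rather than anything from the DR or the SDKs.

```typescript
// Hypothetical model (an assumption, not the documented implementation):
// tokens arriving within one rollup window are delivered as a single message.
function messagesPerSecond(tokensPerSecond: number, windowMs: number): number {
  if (windowMs <= 0) {
    // No batching: every token is its own message.
    return tokensPerSecond;
  }
  const windowsPerSecond = 1000 / windowMs;
  // A slow stream never produces more messages than it has tokens.
  return Math.min(tokensPerSecond, windowsPerSecond);
}

// Average number of tokens carried by each delivered message.
function tokensPerMessage(tokensPerSecond: number, windowMs: number): number {
  const rate = messagesPerSecond(tokensPerSecond, windowMs);
  return rate === 0 ? 0 : tokensPerSecond / rate;
}

// A 100 tokens/s stream with a 40ms window is delivered as 25 messages/s,
// each carrying 4 tokens on average.
console.log(messagesPerSecond(100, 40), tokensPerMessage(100, 40));
```

Under this model a stream slower than the window rate (for example 10 tokens/s with a 40ms window) is not batched at all, which matches the point that clients receive the same number of tokens per second either way.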
> I wasn't sure if we'd have completed any client updates and docs for the transport param before we were ready to release these docs
Right now we don't have a single central 'transport params docs' page with a list of what transportParams exist -- we figured it would be a grab-bag of random unrelated things, which didn't seem useful. So we just mention them in the parts of the documentation where it's actually relevant, e.g. remainPresentFor is mentioned in https://ably.com/docs/presence-occupancy/presence#unstable-connections. And for some we just don't document them at all unless someone comes and presents us with a problem for which it's the appropriate solution, like rewindOnFailedResume.
In this case there's no bit of documentation more relevant than this page for this option, so I'd suggest just putting it here.
Right now it's named appendRollupWindow, lmk if you think a different name would be better. As with all our time options it's specified in milliseconds as an integer, and is capped at 500ms.
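If it goes on this page, a docs example could look roughly like the sketch below. The helper name and the client-side clamping are assumptions made for illustration; the comment above only states that the value is an integer number of milliseconds capped at 500ms, not how an out-of-range value is handled.

```typescript
// Cap stated in this thread; the constant name is hypothetical.
const MAX_APPEND_ROLLUP_WINDOW_MS = 500;

// Hypothetical helper: validates the window and builds the transport params.
// Clamping to the cap client-side is an assumption, not documented behaviour.
function appendRollupTransportParams(windowMs: number): { appendRollupWindow: number } {
  if (!Number.isInteger(windowMs) || windowMs < 0) {
    throw new RangeError("appendRollupWindow must be a non-negative integer (milliseconds)");
  }
  return { appendRollupWindow: Math.min(windowMs, MAX_APPEND_ROLLUP_WINDOW_MS) };
}

// Prints { appendRollupWindow: 40 }
console.log(appendRollupTransportParams(40));
```

The resulting object would then be passed as the `transportParams` client option when constructing the realtime client, in the same way `remainPresentFor` is set today.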
Description
Document how Ably message rate limits interact with applications that are streaming tokens, for both the message-per-token and message-per-response streaming patterns.
https://ably.atlassian.net/browse/AIT-221
Checklist