AI Transport: Add a guide for token streaming using the OpenAI SDK #3024
base: AIT-129-AIT-Docs-release-branch
Conversation
Force-pushed deae500 to 3d12fdf
> You should see publisher output similar to the following:
@GregHolmes is there some sort of collapsible component that I can use for this (like a `<details>` I guess)? I'd like to include this output but not force the user to scroll through it all if just skimming.
Hey @lawrence-forooghian @rainbowFi, I'm afraid at this moment I don't think there is a collapsible component. I'm not entirely sure we need all of that output though. Could we not do a start, middle, and finish part? So it's cropped like:
f03945a: Created stream
f03945a: Got event response.created
f03945a: Publishing 'start' event for response resp_097628d5ede953e800693c497c30148194adc300e4ee412171
f03945a: Got event response.in_progress
f03945a: Ignoring OpenAI SDK event response.in_progress for response resp_097628d5ede953e800693c497c30148194adc300e4ee412171
de6fbd3: Created stream
de6fbd3: Got event response.created
de6fbd3: Publishing 'start' event for response resp_0f89f403f4f5f71800693c497c319c8195acdf3676dbe32cf5
de6fbd3: Got event response.in_progress
de6fbd3: Ignoring OpenAI SDK event response.in_progress for response resp_0f89f403f4f5f71800693c497c319c8195acdf3676dbe32cf5
... Rest of log
f03945a: Ignoring OpenAI SDK event response.output_text.done for response resp_097628d5ede953e800693c497c30148194adc300e4ee412171
f03945a: Got event response.content_part.done
f03945a: Ignoring OpenAI SDK event response.content_part.done for response resp_097628d5ede953e800693c497c30148194adc300e4ee412171
f03945a: Got event response.output_item.done
f03945a: Ignoring OpenAI SDK event response.output_item.done for response resp_097628d5ede953e800693c497c30148194adc300e4ee412171
f03945a: Got event response.completed
f03945a: Publishing 'stop' event for response resp_097628d5ede953e800693c497c30148194adc300e4ee412171
Same would go for the other lengthy snippets?
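For reference, the per-event mapping visible in the log above could be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual publisher code: the `'start'` and `'stop'` message names appear in the log, while the `'token'` name and the handler shape are assumptions.

```javascript
// Sketch: map streamed OpenAI Responses API events to Ably publishes.
// `channel` stands in for an Ably RealtimeChannel (illustrative only).
function makeEventPublisher(channel) {
  let responseId;
  return (event) => {
    switch (event.type) {
      case 'response.created':
        // First event carries the response ID used to correlate messages.
        responseId = event.response.id;
        channel.publish('start', { responseId });
        break;
      case 'response.output_text.delta':
        channel.publish('token', { responseId, delta: event.delta });
        break;
      case 'response.completed':
        channel.publish('stop', { responseId });
        break;
      default:
        // e.g. response.in_progress is ignored, as in the log above.
        break;
    }
  };
}
```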
> This is only a representative example for a simple "text in, text out" use case, and may not reflect the exact sequence of events that you'll observe from the OpenAI API. It also does not handle response generation errors or refusals.
> `</Aside>`
>
> 1. `{ type: 'response.created', response: { id: 'resp_abc123', … }, … }`: This gives us the response ID, which we'll include in all messages that we publish to Ably for this response. When we receive this event, we'll publish an Ably message named `start`.
@GregHolmes this list, and these JSON events, came out looking a bit cramped and ugly — any suggestions on what I could do to improve this?
I think having all the braces makes this look more complicated than it is and isn't helping the spacing. If you put it into a table, you could have a column with the heading `type` for the OpenAI messages and just put `response.in_progress` or `response.output_item.added`, then a column for any other interesting fields and a column with the Ably message mapping...?
Alternatively, keep the list format but just put the type name into the text and highlight important fields separately e.g.
3. `response.output_item.added` with `item.type = 'reasoning'`: we'll ignore this event since we're only interested in messages
The more I think about this, the more I think we should skip this list. I love the technical detail, but it isn't reflected in the code (because we're handling only the specific events we need) and it doesn't aid understanding of Ably. So, I would suggest making the "Understanding Responses API events" section into a clear description of the flow of relevant events we get back from OpenAI, then have a table showing how we'll map those relevant messages to Ably events.
If we were to keep it, why couldn't we just format the list so you have:
1 - Description
// Prettied JSON
2 - Description
etc?
Also, an 11-point list seems a bit overwhelming. Is it worth breaking this down into a table with OpenAI Event and Action columns?
Do we then need the JSON, or would the first column have the type (`response.in_progress`) and the second the action (ignore), etc.?
Force-pushed 3d12fdf to 9876218
Force-pushed 9876218 to 281713a
Force-pushed 281713a to 4f188f3
Force-pushed 0e663af to 52e32d8
> **Software requirements:**
> - Node.js 20 or higher
>
> We'll be using the OpenAPI SDK v4.x.
Nit - I wouldn't put this in pre-reqs, because you're going to cover installing it below. Instead, I'd mention the version inline right before the install snippet at line 43
Sure — I've added the following in the installing section (the only reason to mention the version number was to highlight that we aren't covering all possible OpenAI SDK versions):
<Aside data-type="note">
We're using version 4.x of the OpenAI SDK. Some details of interacting with the OpenAI SDK may diverge from those given here if you're using a different major version.
</Aside>
> **Key points**:
> - **Multiple concurrent responses are handled correctly**: The subscriber receives interleaved tokens for three concurrent AI responses, and correctly pieces together the three separate messages:
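The reassembly described in that bullet might be sketched like this on the subscriber side. This is a hypothetical illustration, not the guide's actual subscriber code; it assumes messages are named `start`, `token`, and `stop` and carry a `responseId` in their data, per the mapping discussed in this PR.

```javascript
// Sketch: buffer interleaved tokens per response ID so concurrent
// responses don't mix, invoking a callback when each one completes.
function makeAssembler(onComplete) {
  const buffers = new Map(); // responseId -> accumulated text
  return (message) => {
    const { responseId, delta } = message.data;
    switch (message.name) {
      case 'start':
        buffers.set(responseId, '');
        break;
      case 'token':
        buffers.set(responseId, (buffers.get(responseId) ?? '') + delta);
        break;
      case 'stop':
        onComplete(responseId, buffers.get(responseId) ?? '');
        buffers.delete(responseId);
        break;
    }
  };
}
```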
I never saw interleaved responses when I ran this script, any idea why? Luck, location, something else? It's not particularly important but I just want to make sure there isn't a change in the example code that is causing the behaviour
I was only seeing them intermittently when I first wrote it, and now I'm unable to get any at all, too! I think it would be good if users could observe this behaviour. One option would be to add small random delays before processing each event, what do you think?
(It's not ideal and distracts from the content of the guide)
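For what it's worth, the small-random-delay idea floated above could look something like the following. Purely an illustrative sketch: `handleEvent` and the 0-50 ms delay range are assumptions, not code from the guide.

```javascript
// Sketch: jitter each event's handling so interleaving between
// concurrent streams becomes observable to the subscriber.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function handleEventsWithJitter(stream, handleEvent) {
  for await (const event of stream) {
    // Small random delay before processing each event; per-stream
    // ordering is preserved because events are still handled serially.
    await sleep(Math.random() * 50);
    handleEvent(event);
  }
}
```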
Alternatively we could spin up two publisher instances at the same time: `node publisher.mjs & node publisher.mjs`. On testing this locally it seems to more reliably give interleaved events. But again it complicates the guide.
Force-pushed 4f188f3 to cabebd0
@rainbowFi I've updated the publisher to add an additional prompt ("Write a one-line poem about carrot cake"); I missed this out of the original code but it was used when generating the responses shown here.
This guide provides a concrete example of how to implement the message-per-token pattern that Mike documented (the one linked to in this guide).

I initially got Claude to generate this but replaced a fair chunk of its output. I trusted that its prose is consistent with our tone of voice and AI Transport marketing position (whether mine is, I have no idea) and in general trusted its judgements about how to structure the document. I would definitely welcome opinions on all of the above, especially from those familiar with how we usually write docs.

I have tried to avoid repeating too much content from the message-per-token page and have in particular not tried to give an example of hydration since it seems like a provider-agnostic concept.
Force-pushed cabebd0 to f827802
I haven't yet fully gone through this PR. But I have noticed that AI really enjoys boldening things. It's not something our docs really do, though.
> ## Understanding Responses API events <a id="understanding-events"/>
>
> OpenAI's Responses API streams model output as a series of events when you set `stream: true` ([OpenAI - streaming API responses](https://platform.openai.com/docs/guides/streaming-responses)). Each streaming event includes a `type`, which describes the event type, and a full output response is made up of the following events:
@GregHolmes Ideally I want the OpenAI link to open in a new page, but I can't seem to find a markdown format that Gatsby will accept. Any suggestions?
Currently there is no way of doing this with Markdown. I think the belief is that you should know how to open the link in a new tab if you want it in a new tab and we shouldn't force people to do it.
GregHolmes left a comment
I've added a few comments.
But also, just one other thing: AI seems to like making text bold where we don't tend to do that within the docs. I'd suggest removing the bold part, such as:
1. **Stream start**: First event arrives with response ID
2. **Content deltas**: Multiple `response.output_text.delta` events with incremental text
3. **Stream completion**: Stream ends when all events have been received
to
The stream lifecycle typically follows this pattern:
- Stream start: First event arrives with response ID
- Content deltas: Multiple `response.output_text.delta` events with incremental text
- Stream completion: Stream ends when all events have been received
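That lifecycle could be sketched as a small consumer loop. This is an illustrative sketch, not code from the guide; it assumes `stream` is the async iterable of events returned by a `stream: true` Responses API call.

```javascript
// Sketch: consume a stream of Responses API events, capturing the
// response ID from the first event and accumulating the delta text.
async function collectResponse(stream) {
  let responseId;
  let text = '';
  for await (const event of stream) {
    if (event.type === 'response.created') {
      responseId = event.response.id; // stream start: first event carries the ID
    } else if (event.type === 'response.output_text.delta') {
      text += event.delta; // content deltas: incremental text
    }
    // Stream completion: the loop ends once all events have been received.
  }
  return { responseId, text };
}
```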
> To follow this guide, you'll need:
>
> **Software requirements:**
I don't think we need to break up software and account requirements.
Can just be the same list.
> - An OpenAI API key
> - An Ably API key
>
> **Useful links:**
I think this would be better suited at the bottom of the page. A further reading, or within next steps.
Is the quickstart at OpenAI relevant, or would it be better to link to specific feature pages that we cover too?
> - [OpenAPI developer quickstart](https://platform.openai.com/docs/quickstart)
> - [Ably JavaScript SDK getting started](/docs/getting-started/javascript)
>
> ## Setting up your environment <a id="setup"/>
This section is usually under the prerequisites. For example here: https://ably.com/docs/chat/getting-started/javascript (I know we're not writing a getting started guide, but it follows similar suit.)
> `</Code>`
>
> `<Aside data-type="note">`
> We're using version 4.x of the OpenAI SDK. Some details of interacting with the OpenAI SDK may diverge from those given here if you're using a different major version.
Suggested change:
- Before: "We're using version 4.x of the OpenAI SDK. Some details of interacting with the OpenAI SDK may diverge from those given here if you're using a different major version."
- After: "This guide uses version 4.x of the OpenAI SDK. Some details of interacting with the OpenAI SDK may diverge from those given here if using a different major version."
> We're using version 4.x of the OpenAI SDK. Some details of interacting with the OpenAI SDK may diverge from those given here if you're using a different major version.
> `</Aside>`
>
> Export your OpenAPI key to the environment; the OpenAI SDK will automatically read it later:
Suggested change:
- Before: "Export your OpenAPI key to the environment; the OpenAI SDK will automatically read it later:"
- After: "Export your OpenAI API key to the environment which will be used later in the guide by the OpenAI SDK:"
> - [OpenAPI developer quickstart](https://platform.openai.com/docs/quickstart)
> - [Ably JavaScript SDK getting started](/docs/getting-started/javascript)
>
> ## Setting up your environment <a id="setup"/>
We have 2 of the same anchor in the page: `<a id="setup"/>`. One of these will need to be changed.
> - An Ably API key
>
> **Useful links:**
> - [OpenAPI developer quickstart](https://platform.openai.com/docs/quickstart)
Suggested change:
- Before: "- [OpenAPI developer quickstart](https://platform.openai.com/docs/quickstart)"
- After: "- [OpenAI developer quickstart](https://platform.openai.com/docs/quickstart)"
> **Important implementation notes:**
>
> - **Don't await publish calls**: As shown in the code above, `channel.publish()` is called without `await`. This maximizes throughput by allowing Ably to batch acknowledgments. Messages are still published in order. For more details, see [publishing tokens](/docs/ai-transport/features/token-streaming/message-per-token#publishing) in the message-per-token guide.
Is this too late in the page to be pointing this out? For example, we could do Asides for each part in key parts of the page?
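The quoted "don't await publish calls" point could be sketched like this. A hypothetical illustration, not the guide's actual code: `channel` stands in for an Ably RealtimeChannel and the `'token'` message name is an assumption.

```javascript
// Sketch: fire off per-token publishes without awaiting each ack
// (ordering is still preserved), then await them all at the end so
// any publish error is still surfaced.
async function publishTokens(channel, responseId, tokens) {
  const pending = [];
  for (const delta of tokens) {
    // Not awaited per-message: batching the acks maximizes throughput.
    pending.push(channel.publish('token', { responseId, delta }));
  }
  await Promise.all(pending);
}
```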
> ## Setting up your environment <a id="setup"/>
>
> Create a new NPM package. This will contain our publisher and subscriber code:
Suggested change:
- Before: "Create a new NPM package. This will contain our publisher and subscriber code:"
- After: "Create a new NPM package, which will contain the publisher and subscriber code:"
Description
This guide provides a concrete example of how to implement the message-per-token pattern that Mike documented in #3014.
I initially got Claude to generate this but replaced a fair chunk of its output. I trusted that its prose is consistent with our tone of voice and AI Transport marketing position (whether mine is, I have no idea) and in general trusted its judgements about how to structure the document. I would definitely welcome opinions on all of the above, especially from those familiar with how we usually write docs.
I have tried to avoid repeating too much content from the message-per-token page and have in particular not tried to give an example of hydration since it seems like a provider-agnostic concept.
Checklist