Open
Labels: enhancement (New feature or request)
Description
There are some cases where the result of an invocation is very long and exceeds the model's context window (token limit).
In those cases we could implement a chunking mechanism that processes the input in batches the LLM can handle.
For reference, see this issue and specifically this comment.
The idea would be to apply the following new steps when sending a message to the agent (see the sketch after this list):
- Create an env variable to set the chunk size
- When the agent is about to process any input, check the token length of the input with tiktoken
- If the input is larger than the context window, split it into chunks of the configured size
- Save the output of the first chunk in a variable
- Process the next chunk and append its result to that variable, updating it with the new information
- Repeat the process until all the chunks have been processed
- Return the combination of all the responses to the individual chunks
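A minimal sketch of those steps in Python, assuming a hypothetical `CHUNK_SIZE_TOKENS` env variable and a placeholder `agent_call` function (the real names would come from the project's config and agent interface):

```python
import os

import tiktoken

# Assumed env variable name for this sketch; default is arbitrary.
CHUNK_SIZE_TOKENS = int(os.environ.get("CHUNK_SIZE_TOKENS", "4000"))

ENCODING = tiktoken.get_encoding("cl100k_base")


def split_into_chunks(text: str, chunk_size: int = CHUNK_SIZE_TOKENS) -> list[str]:
    """Split `text` into pieces of at most `chunk_size` tokens."""
    tokens = ENCODING.encode(text)
    return [
        ENCODING.decode(tokens[i : i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]


def process_with_chunking(agent_call, text: str, context_window: int) -> str:
    """Send `text` to the agent, chunking it when it exceeds the context window.

    `agent_call` stands in for whatever function sends a message to the agent
    and returns its response as a string.
    """
    # If the input fits in the context window, process it in one call as usual.
    if len(ENCODING.encode(text)) <= context_window:
        return agent_call(text)

    accumulated = ""
    for chunk in split_into_chunks(text):
        # Append each partial response so the final result combines all chunks.
        accumulated += agent_call(chunk)
    return accumulated
```

This just concatenates the partial responses; a refinement could feed the accumulated result back in with each subsequent chunk so the agent can merge the information itself.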