
Feat: Chunking large inputs #28

@rihp

Description

There are cases where the result of an invocation is very long and exceeds the model's token window limit

In those cases we can implement a chunking mechanism that processes the input in batches that the LLM can handle.
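
As a rough sketch of the length check only, assuming tiktoken (mentioned in step 2 below), the `cl100k_base` encoding, and an env variable named `CHUNK_SIZE`; these names are placeholders, not anything already in the repo:

```python
import os

import tiktoken

# Placeholder names: CHUNK_SIZE is the env variable proposed in step 1 below,
# and cl100k_base is just an example encoding; neither comes from this repo yet.
CHUNK_SIZE = int(os.environ.get("CHUNK_SIZE", "4000"))
encoding = tiktoken.get_encoding("cl100k_base")


def exceeds_window(text: str, limit: int = CHUNK_SIZE) -> bool:
    """Return True when the input has more tokens than the configured limit."""
    return len(encoding.encode(text)) > limit
```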

For reference, see this issue and specifically this comment

The idea would be to apply the following new steps when sending a message to the agent:

  1. Create an env variable to set the chunking size
  2. When the agent is about to process any input, check the length of the input with tiktoken
  3. If the input is larger than the context window, split it into chunks of the configured size.
  4. Save the output in a variable
  5. Process the next chunk and append the result to the previous variable to update its contents with the new information.
  6. Repeat the process until all the chunks have been processed
  7. Return the combination of all the chunk responses (a rough sketch follows this list)
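
A rough sketch of the loop itself, reusing `CHUNK_SIZE`, `encoding`, and `exceeds_window` from the sketch above; `agent.run` is just a stand-in for however the agent is actually invoked, not an existing method here:

```python
def chunk_tokens(text: str, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split the input into pieces of at most chunk_size tokens (step 3)."""
    tokens = encoding.encode(text)
    return [
        encoding.decode(tokens[i : i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]


def process_in_chunks(agent, text: str) -> str:
    """Run the agent over each chunk and combine the outputs (steps 4-7)."""
    if not exceeds_window(text):
        return agent.run(text)  # fits in the window, send as-is

    results = []
    for chunk in chunk_tokens(text):
        # Steps 4-5: save each chunk's output and keep appending new results
        results.append(agent.run(chunk))
    # Step 7: return the combination of all the chunk responses
    return "\n".join(results)
```

Whether later chunks should also see the output accumulated so far (as step 5 suggests) or be processed independently as above is an open design choice, since the accumulated output itself eats into the context window.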

    Labels

    enhancement (New feature or request)
