
Conversation

@guill (Contributor) commented Jun 2, 2025

When files are very large, returning the entire file can quickly blow out the context, even with a small number of results returned. This PR adds an option, only_chunks (specified in tool_opts), to return only the matching chunks to the CodeCompanion LLM. The LLM can then decide, based on the chunk, whether the file actually seems relevant to the question and whether to request the entire file.

I've only added this to the CodeCompanion backend for now (just because I don't have CopilotChat configured to test it).
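
For reference, this is roughly what the config ends up looking like (sketch only; the surrounding extension setup is omitted and the exact keys may differ):

-- tool_opts sketch: existing options shown for context, only_chunks is the new flag
tool_opts = {
  auto_submit = { ls = false, query = false },
  ls_on_start = false,
  no_duplicate = true,
  only_chunks = true, -- return matching chunks instead of full documents
}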

codecov bot commented Jun 2, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.28%. Comparing base (c186db0) to head (8aa380b).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #168   +/-   ##
=======================================
  Coverage   99.28%   99.28%           
=======================================
  Files          22       22           
  Lines        1534     1534           
=======================================
  Hits         1523     1523           
  Misses         11       11           


@Davidyz (Owner) commented Jun 2, 2025

Hi, thanks for this PR! I haven't done this because of #146. I'll get back to this once I fix that.

Davidyz added the enhancement and feature labels on Jun 2, 2025
@Davidyz (Owner) commented Jun 3, 2025

While we're at it, do you think it makes sense to give the choice to the LLM? We could make this a boolean parameter that the LLM sets in the tool call, so that it can decide whether it wants chunks or full documents. The downside is that the LLM doesn't know the project before the function call, so its decisions might be bad and unreliable.

@guill (Contributor, Author) commented Jun 4, 2025

Yeah, I don't think it makes sense to expose that directly to an LLM -- whether to return the chunk or the document really depends on the size of the full document responses, and it's not something I would expect the LLM to have enough context to make an intelligent decision on.
I suppose we could have a max_response_size argument provided by the LLM and automatically switch from document response to chunks if the total document response is over that size. I'm not sure how self-aware most LLMs are about their own context sizes, though. Maybe max_response_size should instead be exposed through the plugin opts? 🤔
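
Something along these lines, purely as a sketch (the helper name and result shape are made up here):

-- fall back to chunk-only responses when the concatenated documents would
-- exceed the LLM-provided max_response_size; each result is assumed to carry
-- the path/chunk/document fields returned by the query
local function pick_payload(results, max_response_size)
  local total = 0
  for _, result in ipairs(results) do
    total = total + #result.document
  end
  if total > max_response_size then
    return vim.tbl_map(function(r) return r.chunk end, results)
  end
  return vim.tbl_map(function(r) return r.document end, results)
end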

@Davidyz (Owner) commented Jun 4, 2025

I suppose we could have a max_response_size argument provided by the LLM and automatically switch from document response to chunks if the total document response is over that size.

We might be able to utilise CodeCompanion's token count feature for this. The problem is how to determine the max context window so that the token count actually makes sense (for example, a 100k-token conversation is close to saturating a 128k-context LLM, but is still very usable for a 1M-context LLM).

I also have this idea of using an auxiliary LLM as a response rewriter that paraphrases the document content into concise, descriptive paragraphs. Paraphrasing is a simpler task than coding and can hopefully be handled well enough by a small, cheap (or free) model. If it works, it would save tokens for the main LLM, allowing users to have longer conversations with the main coder LLM, and hopefully reduce the cost as well.

Davidyz force-pushed the nvim_return_chunks_pr branch from 53a5390 to 293dac5 on June 4, 2025
Davidyz force-pushed the nvim_return_chunks_pr branch from 293dac5 to 8abdfc2 on June 5, 2025
@Davidyz (Owner) left a comment

Apart from the comments on the changes, I think we should also make max_num and default_num tables (something like {document=10, chunk=100}), so that users can set different default_num and max_num values for document mode and chunk mode. Chunk length and document length can differ by a lot, so it makes sense to set different limits for them.
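
For example (the numbers are just placeholders):

default_num = { document = 10, chunk = 100 },
max_num = { document = 30, chunk = 300 },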

Also, to keep it backward-compatible, we should add a check that converts the old config format to the new one and emits a warning via vim.deprecate (if you're not familiar with the syntax, this is how I do it). I'll remove the backward-compatibility shim before the 0.7.0 release.
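
Roughly like this (sketch only; assumes max_num is the option being migrated):

-- convert the old scalar format to the new per-mode table and warn the user
if type(opts.max_num) == "number" then
  vim.deprecate(
    "max_num = <integer>",
    "max_num = { document = <integer>, chunk = <integer> }",
    "0.7.0",
    "VectorCode"
  )
  opts.max_num = { document = opts.max_num, chunk = opts.max_num }
end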

local args = { "query" }
vim.list_extend(args, action.options.query)
vim.list_extend(args, { "--pipe", "-n", tostring(action.options.count) })
vim.list_extend(args, { "--include", "path", "chunk", "document" })
@Davidyz (Owner) commented:

The chunk and document options in the --include flag are mutually exclusive; you can pass only one of them. path can stay.
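
Something like this instead (sketch; assumes the chunk/document choice is available on action.options):

local include = action.options.only_chunks and "chunk" or "document"
vim.list_extend(args, { "--include", "path", include })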

auto_submit = { ls = false, query = false },
ls_on_start = false,
no_duplicate = true,
only_chunks = false,
@Davidyz (Owner) commented:

Since the chunk and document options can never co-exist, maybe rename this to chunk_mode?

@Davidyz (Owner) commented Jun 10, 2025

I'll check out from this branch to refactor some of the code and prepare the codebase for #179. All your commits will stay. I'll also make the changes that I mentioned here.
