Davidyz · Jun 20, 2025
diff --git a/README.md b/README.md
@@ -7,18 +7,13 @@
 VectorCode is a code repository indexing tool. It helps you build better prompt
 for your coding LLMs by indexing and providing information about the code
 repository you're working on. This repository also contains the corresponding
-neovim plugin because that's what I used to write this tool.
+neovim plugin that provides a set of APIs for you to build or enhance AI plugins,
+and integrations for some of the popular plugins.
 
 > [!NOTE]
 > This project is in beta quality and is undergoing rapid iterations.
 > I know there are plenty of rooms for improvements, and any help is welcomed.
 
-> [!NOTE]
-> [Chromadb](https://www.trychroma.com/), the vector database backend behind
-> this project, supports multiple embedding engines. I developed this tool using
-> SentenceTransformer, but if you encounter any issues with a different embedding 
-> function, please open an issue (or even better, a pull request :D).
-
 <!-- mtoc-start -->
 
 * [Why VectorCode?](#why-vectorcode)
@@ -37,22 +32,23 @@ releases. Their capabilities on these projects are quite limited. With
 VectorCode, you can easily (and programmatically) inject task-relevant context
 from the project into the prompt. This significantly improves the quality of the
 model output and reduce hallucination.
-![](./images/codecompanion_chat.png)
+
+[![asciicast](https://asciinema.org/a/8WP8QJHNAR9lEllZSSx3poLPD.svg)](https://asciinema.org/a/8WP8QJHNAR9lEllZSSx3poLPD?t=3)
 
 ## Documentation
 
 > [!NOTE]
-> The documentation on the `main` branch reflects the code on the latest commit
-> (apologies if I forget to update the docs, but this will be what I aim for). To
-> check for the documentation for the version you're using, you can [check out
+> The documentation on the `main` branch reflects the code on the latest commit.
+> To check the documentation for the version you're using, you can [check out
 > the corresponding tags](https://github.com/Davidyz/VectorCode/tags).
 
 - For the setup and usage of the command-line tool, see [the CLI documentation](./docs/cli.md);
 - For neovim users, after you've gone through the CLI documentation, please refer to 
   [the neovim plugin documentation](./docs/neovim.md) for further instructions.
 - Additional resources:
   - the [wiki](https://github.com/Davidyz/VectorCode/wiki) for extra tricks and
-    tips that will help you get the most out of VectorCode;
+    tips that will help you get the most out of VectorCode, as well as
+    instructions to set up VectorCode to work with some other neovim plugins;
   - the [discussions](https://github.com/Davidyz/VectorCode/discussions) where
     you can ask general questions and share your cool usages about VectorCode.
 
@@ -98,7 +94,7 @@ This project follows an adapted semantic versioning:
 - [ ] ability to view and delete files in a collection (atm you can only `drop`
   and `vectorise` again);
 - [x] joint search (kinda, using codecompanion.nvim/MCP);
-- [ ] Nix support (#144);
+- [x] Nix support (unofficial packages [here](https://search.nixos.org/packages?channel=unstable&from=0&size=50&sort=relevance&type=packages&query=vectorcode));
 - [ ] Query rewriting (#124).
 
 

diff --git a/doc/VectorCode-cli.txt b/doc/VectorCode-cli.txt
@@ -121,8 +121,7 @@ significantly reduce the IO overhead and avoid potential race condition.
 
 
   If you’re setting up a standalone ChromaDB server, I recommend sticking to
-  v0.6.3. ChromaDB recently released v1.0.0, which may not work with VectorCode.
-  I’m testing with v1.0.0 and will publish a new release when it’s ready.
+  v0.6.3, because VectorCode is not ready for the upgrade to ChromaDB 1.0 yet.
 
 FOR WINDOWS USERS ~
 
@@ -146,6 +145,8 @@ NIX ~
 
 A community-maintained Nix package is available here
 <https://search.nixos.org/packages?channel=unstable&from=0&size=50&sort=relevance&type=packages&query=vectorcode>.
+If you’re using nix to install a standalone Chromadb server, make sure to
+stick to 0.6.3 <https://github.com/NixOS/nixpkgs/pull/412528>.
 
 
 GETTING STARTED  *VectorCode-cli-vectorcode-command-line-tool-getting-started*
@@ -212,7 +213,7 @@ REFRESHING EMBEDDINGS ~
 
 To maintain the accuracy of the vector search, it’s important to keep your
 embeddings up-to-date. You can simply run the `vectorise` subcommand on a file
-to refresh the embedding for a particular file, and the CLI provides a
+to refresh the embedding for that file. Apart from that, the CLI provides a
 `vectorcode update` subcommand, which updates the embeddings for all files that
 are currently indexed by VectorCode for the current project.
 
@@ -241,8 +242,8 @@ For each project, VectorCode creates a collection (similar to tables in
 traditional databases) and puts the code embeddings in the corresponding
 collection. In the root directory of a project, you may run `vectorcode init`.
 This will initialise the repository with a subdirectory
-`project_root/.vectorcode/`. This will mark this directory a _project root_, a
-concept that will later be used to construct the collection. You may put a
+`project_root/.vectorcode/`. This will mark this directory as a _project root_,
+a concept that will later be used to construct the collection. You may put a
 `config.json` file in `project_root/.vectorcode`. This file may be used to
 store project-specific settings such as embedding functions and database entry
 point (more on this later). If you already have a global configuration file at
@@ -272,31 +273,22 @@ hooks. The `init` subcommand provides a `--hooks` flag which helps you manage
 hooks when working with a git repository. You can put some custom hooks in
 `~/.config/vectorcode/hooks/` and the `vectorcode init --hooks` command will
 pick them up and append them to your existing hooks, or create new hook scripts
-if they don’t exist yet. The hook files should be named the same as they
-would be under the `.git/hooks` directory. For example, a pre-commit hook would
-be named `~/.config/vectorcode/hooks/pre-commit`.
+if they don’t exist yet. The custom hook files should be named the same as
+they would be under the `.git/hooks` directory. For example, a pre-commit hook
+would be named `~/.config/vectorcode/hooks/pre-commit`.
 
 By default, there are 2 pre-defined hooks:
 
->bash
-    # pre-commit hook that vectorise changed files before you commit.
-    diff_files=$(git diff --cached --name-only)
-    [ -z "$diff_files" ] || vectorcode vectorise $diff_files
-<
+1. A pre-commit hook that vectorises the modified files.
+2. A post-checkout hook that:
+    - vectorises the full repository if it’s an initial commit/clone and a
+      `vectorcode.include` spec is available (either locally in the project or globally);
+    - vectorises the files changed by the checkout.
+
 
->bash
-    # post-checkout hook that vectorise changed files when you checkout to a
-    # different branch/tag/commit
-    files=$(git diff --name-only "$1" "$2")
-    [ -z "$files" ] || vectorcode vectorise $files
-<
 
-When you run `vectorcode init --hooks` in a git repo, these 2 hooks will be
-added to your `.git/hooks/`. Hooks that are managed by VectorCode will be
-wrapped by `# VECTORCODE_HOOK_START` and `# VECTORCODE_HOOK_END` comment lines.
-They help VectorCode determine whether hooks have been added, so don’t delete
-the markers unless you know what you’re doing. To remove the hooks, simply
-delete the lines wrapped by these 2 comment strings.
+Both hooks will only be triggered on repositories that have a `.vectorcode`
+directory in them.
 
 
 CONFIGURING VECTORCODE ~
@@ -328,31 +320,32 @@ model_name="nomic-embed-text")`. Default: `{}`; - `db_url`string, the url that
 points to the Chromadb server. VectorCode will start an HTTP server for
 Chromadb at a randomly picked free port on `localhost` if your configured
 `http://host:port` is not accessible. Default: `http://127.0.0.1:8000`; -
-`db_path`string, Path to local persistent database. This is where the files for
-your database will be stored. Default: `~/.local/share/vectorcode/chromadb/`; -
-`db_log_path`string, path to the _directory_ where the built-in chromadb server
-will write the log to. Default: `~/.local/share/vectorcode/`; -
-`chunk_size`integer, the maximum number of characters per chunk. A larger value
-reduces the number of items in the database, and hence accelerates the search,
-but at the cost of potentially truncated data and lost information. Default:
-`2500`. To disable chunking, set it to a negative number; -
-`overlap_ratio`float between 0 and 1, the ratio of overlapping/shared content
-between 2 adjacent chunks. A larger ratio improves the coherences of chunks,
-but at the cost of increasing number of entries in the database and hence
-slowing down the search. Default: `0.2`. _Starting from 0.4.11, VectorCode will
-use treesitter to parse languages that it can automatically detect. It uses
-pygments to guess the language from filename, and tree-sitter-language-pack to
-fetch the correct parser. overlap_ratio has no effects when treesitter works.
-If VectorCode fails to find an appropriate parser, it’ll fallback to the
-legacy naive parser, in which case overlap_ratio works exactly in the same way
-as before;_ - `query_multiplier`integer, when you use the `query` command to
-retrieve `n` documents, VectorCode will check `n * query_multiplier` chunks and
-return at most `n` documents. A larger value of `query_multiplier` guarantees
-the return of `n` documents, but with the risk of including too many
-less-relevant chunks that may affect the document selection. Default: `-1` (any
-negative value means selecting documents based on all indexed chunks); -
-`reranker`string, the reranking method to use. Currently supports
-`CrossEncoderReranker` (default, using sentence-transformers cross-encoder
+`db_path`string, Path to local persistent database. If you didn’t set up a
+standalone Chromadb server, this is where the files for your database will be
+stored. Default: `~/.local/share/vectorcode/chromadb/`; - `db_log_path`string,
+path to the _directory_ where the built-in chromadb server will write the log
+to. Default: `~/.local/share/vectorcode/`; - `chunk_size`integer, the maximum
+number of characters per chunk. A larger value reduces the number of items in
+the database, and hence accelerates the search, but at the cost of potentially
+truncated data and lost information. Default: `2500`. To disable chunking, set
+it to a negative number; - `overlap_ratio`float between 0 and 1, the ratio of
+overlapping/shared content between 2 adjacent chunks. A larger ratio improves
+the coherence of chunks, but at the cost of increasing the number of entries in
+the database and hence slowing down the search. Default: `0.2`. _Starting from
+0.4.11, VectorCode will use treesitter to parse languages that it can
+automatically detect. It uses pygments to guess the language from filename, and
+tree-sitter-language-pack to fetch the correct parser. overlap_ratio has no
+effect when treesitter works. If VectorCode fails to find an appropriate
+parser, it’ll fall back to the legacy naive parser, in which case
+overlap_ratio works exactly the same way as before;_ -
+`query_multiplier`integer, when you use the `query` command to retrieve `n`
+documents, VectorCode will check `n * query_multiplier` chunks and return at
+most `n` documents. A larger value of `query_multiplier` guarantees the return
+of `n` documents, but with the risk of including too many less-relevant chunks
+that may affect the document selection. Default: `-1` (any negative value means
+selecting documents based on all indexed chunks); - `reranker`string, the
+reranking method to use. Currently supports `CrossEncoderReranker` (default,
+using sentence-transformers cross-encoder
 <https://sbert.net/docs/package_reference/cross_encoder/cross_encoder.html> )
 and `NaiveReranker` (sort chunks by the "distance" between the embedding
 vectors); - `reranker_params`dictionary, similar to `embedding_params`. The
@@ -361,17 +354,16 @@ these are the options passed to the `CrossEncoder`
 <https://sbert.net/docs/package_reference/cross_encoder/cross_encoder.html#id1>
 class. For example, if you want to use a non-default model, you can use the
 following: `json { "reranker_params": { "model_name_or_path": "your_model_here"
-} }` ; - `db_settings`dictionary, works in a similar way to `embedding_params`,
+} }` - `db_settings`dictionary, works in a similar way to `embedding_params`,
 but for Chromadb client settings so that you can configure authentication for
 remote Chromadb <https://docs.trychroma.com/production/administration/auth>; -
 `hnsw`a dictionary of hnsw settings
 <https://cookbook.chromadb.dev/core/configuration/#hnsw-configuration> that may
 improve the query performances or avoid runtime errors during queries. **It’s
 recommended to re-vectorise the collection after modifying these options,
 because some of the options can only be set during collection creation.**
-Example: `json5 // the following is the default value. "hnsw": { "hnsw:M": 64,
-}` - `filetype_map``dict[str, list[str]]`, a dictionary where keys are language
-name
+Example (and default): `json5 "hnsw": { "hnsw:M": 64, }` -
+`filetype_map``dict[str, list[str]]`, a dictionary where keys are language names
 <https://github.com/Goldziher/tree-sitter-language-pack?tab=readme-ov-file#available-languages>
 and values are lists of Python regex patterns
 <https://docs.python.org/3/library/re.html> that will match file extensions.
@@ -566,7 +558,7 @@ the `VECTORCODE_LOG_LEVEL` variable to one of `ERROR`, `WARN` (`WARNING`),
 `INFO` or `DEBUG`. For the CLI that you interact with in your shell, this will
 output logs to `STDERR` and write a log file to
 `~/.local/share/vectorcode/logs/`. For LSP and MCP servers, because `STDIO` is
-used for the RPC, only the log file will be written.
+used for the RPC, the logs will only be written to the log file, not `STDERR`.
 
 For example:
 
@@ -575,6 +567,9 @@ For example:
 <
 
 
+  Depending on the MCP/LSP client implementation, you may need to take extra
+  steps to make sure the environment variables are captured by VectorCode.
+
 SHELL COMPLETION*VectorCode-cli-vectorcode-command-line-tool-shell-completion*
 
 VectorCode supports shell completion for bash/zsh/tcsh. You can use `vectorcode
@@ -602,9 +597,9 @@ following options in the JSON config file:
 For Intel users, sentence transformer <https://www.sbert.net/index.html>
 supports OpenVINO
 <https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html>
-backend for supported GPU. Run `pipx install vectorcode[intel]` which will
-bundle the relevant libraries when you install VectorCode. After that, you will
-need to configure `SentenceTransformer` to use `openvino` backend. In your
+backend for supported GPU. Run `uv tool install vectorcode[intel]` which will
+bundle the relevant libraries when you install VectorCode. After that, you will
+need to configure `SentenceTransformer` to use the `openvino` backend. In your
 `config.json`, set `backend` key in `embedding_params` to `"openvino"`
 
 >json