Adding better queries #2
Merged
Release v0.3.0: Enhanced Chunking API, Extension Fields, and Filterable Vector Search
🎯 Overview
This release introduces a major refactor of the chunking API, adds support for extension fields, and enables powerful filtering capabilities in vector search. The plugin now uses Drizzle ORM throughout, eliminating raw SQL and providing a more maintainable, type-safe codebase.
1. Field-Based Chunking Replaced with `toKnowledgePool` Functions
Before (v0.2.x):
After (v0.3.0):
Why: This change gives you full control over the chunking logic.
2. `fieldPath` Removed from Search Results
The `fieldPath` property has been removed from `VectorSearchResult`. If you were using this to identify which field a chunk came from, you'll need to track this via extension fields or your chunking logic.
Before:
After:
✨ New Features
1. Extension Fields
Add custom fields to the embeddings collection schema and persist values per chunk:
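As a rough sketch of the idea, not the plugin's actual configuration API: the `ExtensionField` and `Chunk` types, the `category` and `priority` field names, and the `invalidFields` helper below are all hypothetical. They only illustrate declaring extra fields and supplying a value for each one on every chunk.

```typescript
// Hypothetical shapes for illustration; the plugin's real option names may differ.
type ExtensionField = { name: string; type: 'text' | 'number' };

// A chunk carries its text plus one value per declared extension field.
type Chunk = { chunk: string } & Record<string, string | number>;

// Assumed example fields added to the embeddings collection schema.
const extensionFields: ExtensionField[] = [
  { name: 'category', type: 'text' },
  { name: 'priority', type: 'number' },
];

// Returns the names of declared fields whose value is missing
// or has the wrong primitive type on the given chunk.
function invalidFields(chunk: Chunk, fields: ExtensionField[]): string[] {
  return fields
    .filter((field) => {
      const value = chunk[field.name];
      return field.type === 'text'
        ? typeof value !== 'string'
        : typeof value !== 'number';
    })
    .map((field) => field.name);
}

// One chunk with a value persisted for each declared field.
const exampleChunk: Chunk = { chunk: 'Hello world', category: 'guides', priority: 2 };
```

A check like `invalidFields(exampleChunk, extensionFields)` could run before chunks are written, so that a missing value is caught early rather than stored as null.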
Extension fields can be filtered on with a `where` clause (see below).
2. Filterable Vector Search
The vector search endpoint now accepts Payload-style `where` clauses and a `limit` parameter.
Supported operators:
- `equals`, `not_equals` / `notEquals`
- `in`, `not_in` / `notIn`
- `like`, `contains`
- `greater_than` / `greaterThan`, `greater_than_equal` / `greaterThanEqual`
- `less_than` / `lessThan`, `less_than_equal` / `lessThanEqual`
- `exists` (null checks)
- `and` / `or` conditions
You can filter on:
- `sourceCollection`, `docId`, `chunkIndex`, `chunkText`, `embeddingVersion`
3. Improved Chunking Control
The `toKnowledgePool` function gives you complete control over chunking.
Example: Chunk a blog post's title separately from its content, and attach different metadata to each:
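A minimal sketch of such a function, under stated assumptions: the `Post` and `KnowledgeChunk` shapes, the `section` and `position` metadata fields, and the `splitText` helper are hypothetical, and the real `toKnowledgePool` signature (for example, whether it is async or receives extra arguments) may differ.

```typescript
// Hypothetical document and chunk shapes; the real plugin types may differ.
type Post = { id: string; title: string; content: string };
type KnowledgeChunk = { chunk: string; section: 'title' | 'content'; position: number };

// Splits text into pieces of roughly `maxLen` characters, breaking on whitespace.
// A single word longer than `maxLen` is kept whole rather than cut.
function splitText(text: string, maxLen: number): string[] {
  const pieces: string[] = [];
  let current = '';
  for (const word of text.split(/\s+/).filter(Boolean)) {
    const candidate = current ? `${current} ${word}` : word;
    if (candidate.length > maxLen && current) {
      pieces.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current) pieces.push(current);
  return pieces;
}

// Sketch of a toKnowledgePool-style function: the title becomes one chunk,
// the content is split into several, and each chunk carries its own metadata.
function toKnowledgePool(doc: Post): KnowledgeChunk[] {
  const titleChunk: KnowledgeChunk = { chunk: doc.title, section: 'title', position: 0 };
  const contentChunks = splitText(doc.content, 200).map(
    (chunk, i): KnowledgeChunk => ({ chunk, section: 'content', position: i + 1 }),
  );
  return [titleChunk, ...contentChunks];
}
```

Because the title and content are emitted as separate chunks with a `section` value, a search can later be narrowed to titles only, which was not possible with a single field-level chunker.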
🔧 Technical Improvements
Drizzle ORM Integration
_properties, ensuring forward compatibility
Custom WHERE Clause Converter
Implemented a custom `convertWhereToDrizzle` function that converts `Where` objects to Drizzle conditions, including `and`/`or` logic.
Dynamic Table Registration
The plugin now dynamically generates Drizzle table definitions during schema initialization and stores them in a registry.
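The registry pattern can be sketched roughly as follows. This is not the plugin's code: `TableDefinition` is a plain stand-in for a real Drizzle table object, and `registerTable`, `getTable`, and the pool name used as the key are hypothetical names chosen for illustration.

```typescript
// Hypothetical stand-in for a Drizzle table definition.
type TableDefinition = { tableName: string; columns: string[] };

// Registry keyed by an assumed knowledge pool name.
const tableRegistry = new Map<string, TableDefinition>();

// Assumed to be called once per pool during schema initialization.
function registerTable(poolName: string, extensionColumns: string[]): TableDefinition {
  const table: TableDefinition = {
    tableName: poolName,
    columns: [
      'sourceCollection',
      'docId',
      'chunkIndex',
      'chunkText',
      'embeddingVersion',
      ...extensionColumns,
    ],
  };
  tableRegistry.set(poolName, table);
  return table;
}

// Looked up later, for example when a search request needs to build a query.
function getTable(poolName: string): TableDefinition {
  const table = tableRegistry.get(poolName);
  if (!table) {
    throw new Error(`No table registered for knowledge pool "${poolName}"`);
  }
  return table;
}
```

Keeping the generated definitions in one place means code that runs after initialization can reach the table, including any extension columns, without rebuilding it.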
📝 Migration Guide
Step 1: Update Your Collection Configuration
Replace field-based chunking with `toKnowledgePool` functions:
Step 2: Update Search Result Handling
Remove any code that references `fieldPath`:
Step 3: (Optional) Add Extension Fields
If you want to store and query custom metadata:
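To illustrate the query side, the sketch below builds a search request that filters on two assumed extension fields. The `SearchRequest` and `Where` types, the `query` property, the `category` and `priority` fields, and the `referencedFields` helper are all hypothetical; only the operator names, the `and`/`or` nesting, the `sourceCollection` column, and the `limit` parameter come from this release.

```typescript
// Hypothetical types modelled on Payload-style where clauses; not the plugin's exported types.
type Operator =
  | 'equals' | 'not_equals' | 'in' | 'not_in' | 'like' | 'contains'
  | 'greater_than' | 'greater_than_equal' | 'less_than' | 'less_than_equal' | 'exists';
type FieldCondition = Partial<Record<Operator, unknown>>;
type Where = {
  and?: Where[];
  or?: Where[];
  [field: string]: FieldCondition | Where[] | undefined;
};
type SearchRequest = { query: string; where?: Where; limit?: number };

// Posts only, and either in one of two categories or with priority >= 3.
const request: SearchRequest = {
  query: 'how do I configure chunking?',
  where: {
    and: [
      { sourceCollection: { equals: 'posts' } },
      {
        or: [
          { category: { in: ['guides', 'tutorials'] } },
          { priority: { greater_than_equal: 3 } },
        ],
      },
    ],
  },
  limit: 5,
};

// Collects every field name referenced in a where clause, walking and/or branches.
function referencedFields(where: Where): string[] {
  const fields: string[] = [];
  Object.keys(where).forEach((key) => {
    const value = where[key];
    if ((key === 'and' || key === 'or') && Array.isArray(value)) {
      (value as Where[]).forEach((branch) => {
        referencedFields(branch).forEach((name) => fields.push(name));
      });
    } else {
      fields.push(key);
    }
  });
  return fields;
}
```

A helper like `referencedFields` is one way to confirm that a filter only touches the built-in columns and the extension fields you have declared before sending the request.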
Then update your `toKnowledgePool` function to return these values:
Step 4: Re-vectorize Your Content
After updating your configuration, you'll need to re-vectorize existing documents. The plugin will automatically:
🧪 Testing
Extension field tests: `dev/specs/extensionFields.spec.ts`
📚 Documentation
Full Changelog: See CHANGELOG.md for complete details.