Filters
Filters are applied as post-filters on vector search results and as pre-filters on BM25 search. When bitmap indexes are enabled (default), eligible filters are evaluated against pre-built RoaringBitmap indexes for sub-millisecond filtering.
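As a rough illustration of the two modes named above, the following Python sketch contrasts post-filtering with pre-filtering in general terms. It is not Zeppelin's code: the function names, the row shape, and the choice to truncate to `top_k` before filtering are all assumptions made for the example.

```python
from typing import Callable, Iterable

Row = dict  # hypothetical record: {"id": ..., "score": ..., "attrs": {...}}


def post_filter(ranked: Iterable[Row], keep: Callable[[dict], bool], top_k: int) -> list[Row]:
    """Take the top_k best-ranked rows first, then drop the ones that do not match."""
    nearest = list(ranked)[:top_k]
    return [row for row in nearest if keep(row["attrs"])]


def pre_filter(rows: Iterable[Row], keep: Callable[[dict], bool], top_k: int) -> list[Row]:
    """Drop non-matching rows first, then rank the survivors by score."""
    survivors = [row for row in rows if keep(row["attrs"])]
    return sorted(survivors, key=lambda row: row["score"], reverse=True)[:top_k]


# Rows are listed best-first; higher score means a better match in this sketch.
rows = [
    {"id": 1, "score": 0.9, "attrs": {"color": "blue"}},
    {"id": 2, "score": 0.8, "attrs": {"color": "red"}},
    {"id": 3, "score": 0.7, "attrs": {"color": "red"}},
]


def is_red(attrs: dict) -> bool:
    return attrs.get("color") == "red"


print([r["id"] for r in post_filter(rows, is_red, top_k=2)])  # [2]
print([r["id"] for r in pre_filter(rows, is_red, top_k=2)])   # [2, 3]
```

In this toy setup, post-filtering can return fewer than `top_k` rows because matches outside the initial cut are never considered, while pre-filtering ranks only rows that already match.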
Filters use a tagged JSON format with an "op" discriminator field:
{
"op": "<operator>",
...fields
}
Match vectors where a field exactly equals a value.
{
"op": "eq",
"field": "color",
"value": "red"
}
Match vectors where a field does not equal a value.
{
"op": "not_eq",
"field": "status",
"value": "deleted"
}
Match vectors where a numeric field falls within the given bounds. Every bound is optional; use any combination of gte, lte, gt, and lt.
{
"op": "range",
"field": "price",
"gte": 10.0,
"lte": 100.0
}
{
"op": "range",
"field": "score",
"gt": 0.5
}
Match vectors where a field's value is in a given set.
{
"op": "in",
"field": "category",
"values": ["electronics", "books", "toys"]
}
Match vectors where a field's value is NOT in a given set.
{
"op": "not_in",
"field": "category",
"values": ["spam", "junk"]
}
Match vectors where a list-type field contains a specific value.
{
"op": "contains",
"field": "tags",
"value": "rust"
}
Match vectors where a text field contains all of the specified tokens, in any order. Useful for pre-filtering full-text search (FTS).
{
"op": "contains_all_tokens",
"field": "content",
"tokens": ["rust", "programming"]
}
Match vectors where a text field contains the tokens as an exact phrase: adjacent and in the given order.
{
"op": "contains_token_sequence",
"field": "content",
"tokens": ["vector", "search", "engine"]
}
Match vectors that satisfy every nested filter.
{
"op": "and",
"filters": [
{"op": "eq", "field": "color", "value": "red"},
{"op": "range", "field": "price", "lte": 50.0}
]
}
Match vectors that satisfy at least one nested filter.
{
"op": "or",
"filters": [
{"op": "eq", "field": "color", "value": "red"},
{"op": "eq", "field": "color", "value": "blue"}
]
}
Match vectors that do not satisfy the nested filter.
{
"op": "not",
"filter": {
"op": "eq",
"field": "archived",
"value": true
}
}
{
"op": "and",
"filters": [
{"op": "eq", "field": "color", "value": "red"},
{"op": "range", "field": "price", "lt": 50.0},
{"op": "eq", "field": "in_stock", "value": true}
]
}
{
"op": "and",
"filters": [
{
"op": "or",
"filters": [
{"op": "eq", "field": "category", "value": "engineering"},
{"op": "contains", "field": "tags", "value": "technical"}
]
},
{
"op": "not",
"filter": {"op": "eq", "field": "archived", "value": true}
}
]
}
Filters match against attribute values stored during upsert. Supported types:
| Type | JSON Example | Description |
|---|---|---|
| String | "hello" | Text value |
| Integer | 42 | 64-bit integer |
| Float | 3.14 | 64-bit float |
| Bool | true | Boolean |
| StringList | ["a", "b"] | List of strings (use contains) |
| IntegerList | [1, 2, 3] | List of integers |
| FloatList | [1.5, 2.5] | List of floats |
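To make the operator semantics concrete, here is a small reference evaluator in Python that checks a filter against one record's attributes. It is a sketch of the behavior described on this page, not Zeppelin's implementation. The names `matches` and `tokenize` are hypothetical, and several details are assumptions: a missing field is treated as `None`, text is tokenized by lowercasing and splitting on whitespace, and `range` applies only to numeric values.

```python
from typing import Any


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase + whitespace split; the real tokenizer is not specified here.
    return text.lower().split()


def matches(flt: dict, attrs: dict) -> bool:
    """Return True if the attribute dict satisfies the filter."""
    op = flt["op"]
    # Boolean combinators recurse into their nested filters.
    if op == "and":
        return all(matches(f, attrs) for f in flt["filters"])
    if op == "or":
        return any(matches(f, attrs) for f in flt["filters"])
    if op == "not":
        return not matches(flt["filter"], attrs)

    value: Any = attrs.get(flt["field"])  # None when the field is absent (assumed)
    if op == "eq":
        return value == flt["value"]
    if op == "not_eq":
        return value != flt["value"]
    if op == "in":
        return value in flt["values"]
    if op == "not_in":
        return value not in flt["values"]
    if op == "contains":
        return isinstance(value, list) and flt["value"] in value
    if op == "range":
        # Only numeric values can fall within bounds; bools are excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        return (
            ("gt" not in flt or value > flt["gt"])
            and ("gte" not in flt or value >= flt["gte"])
            and ("lt" not in flt or value < flt["lt"])
            and ("lte" not in flt or value <= flt["lte"])
        )
    if op == "contains_all_tokens":
        return isinstance(value, str) and set(flt["tokens"]) <= set(tokenize(value))
    if op == "contains_token_sequence":
        if not isinstance(value, str):
            return False
        have, want = tokenize(value), flt["tokens"]
        # Slide a window over the text tokens looking for the exact phrase.
        return any(have[i:i + len(want)] == want for i in range(len(have) - len(want) + 1))
    raise ValueError(f"unknown op: {op}")


nested = {
    "op": "and",
    "filters": [
        {"op": "or", "filters": [
            {"op": "eq", "field": "category", "value": "engineering"},
            {"op": "contains", "field": "tags", "value": "technical"},
        ]},
        {"op": "not", "filter": {"op": "eq", "field": "archived", "value": True}},
    ],
}
print(matches(nested, {"category": "engineering", "tags": ["rust"], "archived": False}))  # True
```

Under the missing-field assumption, `not_eq` and `not_in` match records that lack the field entirely; whether Zeppelin behaves this way is not stated here.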
When bitmap_index is enabled (default: true), Zeppelin builds RoaringBitmap indexes during compaction for each attribute field. At query time:
- The filter is evaluated against bitmap indexes to produce a candidate set
- Only candidate vectors are scanned during distance computation
- If a filter cannot be fully resolved by bitmaps, it falls back to post-filtering
This is automatic and transparent: no query changes are needed. Bitmap evaluation typically takes under 1 ms, even for millions of vectors.
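The query-time steps above can be sketched in Python, using plain `set` objects as a stand-in for RoaringBitmaps. This is an illustration of the general technique under stated assumptions, not Zeppelin's index layout: `build_index` and `candidates` are hypothetical names, only scalar attribute values are indexed, and any operator the sketch cannot resolve returns `None` to signal a fallback to post-filtering.

```python
from collections import defaultdict
from typing import Optional


def build_index(rows: dict[int, dict]) -> dict[tuple, set[int]]:
    """Map (field, value) to the set of row ids holding that scalar value."""
    index: dict[tuple, set[int]] = defaultdict(set)
    for row_id, attrs in rows.items():
        for field, value in attrs.items():
            if not isinstance(value, list):  # list fields are skipped in this sketch
                index[(field, value)].add(row_id)
    return index


def candidates(flt: dict, index: dict, all_ids: set[int]) -> Optional[set[int]]:
    """Resolve a filter to a candidate id set, or None if the index cannot answer it."""
    op = flt["op"]
    if op == "eq":
        return set(index.get((flt["field"], flt["value"]), set()))
    if op == "in":
        out: set[int] = set()
        for v in flt["values"]:
            out |= index.get((flt["field"], v), set())
        return out
    if op in ("and", "or"):
        parts = [candidates(f, index, all_ids) for f in flt["filters"]]
        if any(p is None for p in parts):
            return None  # not fully resolvable: caller falls back to post-filtering
        if op == "and":
            return set.intersection(set(all_ids), *parts)
        return set().union(*parts)
    if op == "not":
        inner = candidates(flt["filter"], index, all_ids)
        return None if inner is None else all_ids - inner
    return None  # e.g. range or token operators are not handled by this sketch


rows = {
    0: {"color": "red", "in_stock": True},
    1: {"color": "blue", "in_stock": True},
    2: {"color": "red", "in_stock": False},
}
index = build_index(rows)
flt = {"op": "and", "filters": [
    {"op": "eq", "field": "color", "value": "red"},
    {"op": "eq", "field": "in_stock", "value": True},
]}
print(candidates(flt, index, set(rows)))  # {0}
```

Only the ids in the returned set would then need distance computation; a `None` result corresponds to the fallback case where the filter is applied after scoring instead.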