generated from amazon-archives/__template_Custom
-
Notifications
You must be signed in to change notification settings - Fork 181
Closed
Labels
bugSomething isn't workingSomething isn't working
Description
What is the bug?
Sibling issue of opensearch-project/OpenSearch#20218: top and other aggregations won't work in cases where a field has "type": "nested". If this is intended behavior, we'll have to patch in a workaround. If it's not intended, we can see about waiting on an upstream fix.
How can one reproduce the bug?
If a field has "type": "nested", any command that compiles to a bucket aggregation will fail to return data.
Data & Mapping:
% cat test_map.json | jq
{
"mappings": {
"properties": {
"rec": {
"type": "nested",
"properties": {
"inner": {
"type": "integer"
}
}
}
}
}
}
% cat test_data.json
{"create": {"_index": "test"}}
{"rec":{"inner": 100}}
{"create": {"_index": "test"}}
{"rec":{"inner": 101}}
{"create": {"_index": "test"}}
{"rec":{"inner": 101}}Queries that depend on composite buckets:
opensearchsql> source=test | top rec.inner;
fetched rows / total rows = 1/1
+-----------+-------+
| rec.inner | count |
|-----------+-------|
| null | 3 |
+-----------+-------+
opensearchsql> source=test | stats count() by rec.inner;
fetched rows / total rows = 1/1
+---------+-----------+
| count() | rec.inner |
|---------+-----------|
| 3 | null |
+---------+-----------+
> source=test | dedup rec.inner;
fetched rows / total rows = 0/0
+-----+
| rec |
|-----|
+-----+Explained:
opensearchsql> explain source=test | top rec.inner;
= Calcite Plan =
== Logical ==
LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT])
LogicalProject(rec.inner=[$0], count=[$1])
LogicalFilter(condition=[<=($2, 10)])
LogicalProject(rec.inner=[$0], count=[$1], _row_number_=[ROW_NUMBER() OVER (ORDER BY $1 DESC)])
LogicalAggregate(group=[{0}], count=[COUNT()])
LogicalProject(rec.inner=[$1])
CalciteLogicalIndexScan(table=[[OpenSearch, test]])
== Physical ==
EnumerableLimit(fetch=[10000])
EnumerableCalc(expr#0..2=[{inputs}], expr#3=[10], expr#4=[<=($t2, $t3)], proj#0..1=[{exprs}], $condition=[$t4])
EnumerableWindow(window#0=[window(order by [1 DESC] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
CalciteEnumerableIndexScan(table=[[OpenSearch, test]], PushDownContext=[[AGGREGATION->rel#597:LogicalAggregate.NONE.[](input=RelSubset#596,group={0},count=COUNT())], OpenSearchRequestBuilder(sourceBuilder={
"from": 0,
"size": 0,
"timeout": "1m",
"aggregations": {
"composite_buckets": {
"composite": {
"size": 1000,
"sources": [
{
"rec.inner": {
"terms": {
"field": "rec.inner",
"missing_bucket": true,
"missing_order": "first",
"order": "asc"
}
}
}
]
}
}
}
}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])
opensearchsql> explain source=test | stats count() by rec.inner;
= Calcite Plan =
== Logical ==
LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT])
LogicalProject(count()=[$1], rec.inner=[$0])
LogicalAggregate(group=[{0}], count()=[COUNT()])
LogicalProject(rec.inner=[$1])
CalciteLogicalIndexScan(table=[[OpenSearch, test]])
== Physical ==
CalciteEnumerableIndexScan(table=[[OpenSearch, test]], PushDownContext=[[AGGREGATION->rel#1164:LogicalAggregate.NONE.[](input=RelSubset#1163,group={0},count()=COUNT()), PROJECT->[count(), rec.inner], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={
"from": 0,
"size": 0,
"timeout": "1m",
"aggregations": {
"composite_buckets": {
"composite": {
"size": 10000,
"sources": [
{
"rec.inner": {
"terms": {
"field": "rec.inner",
"missing_bucket": true,
"missing_order": "first",
"order": "asc"
}
}
}
]
}
}
}
}, requestedTotalSize=10000, pageSize=null, startFrom=0)])
opensearchsql> explain source=test | dedup rec.inner;
= Calcite Plan =
== Logical ==
LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT])
LogicalProject(rec=[$0])
LogicalFilter(condition=[<=($8, 1)])
LogicalProject(rec=[$0], rec.inner=[$1], _id=[$2], _index=[$3], _score=[$4], _maxscore=[$5], _sort=[$6], _routing=[$7], _row_number_dedup_=[ROW_NUMBER() OVER (PARTITION BY $1 ORDER BY $1)])
LogicalFilter(condition=[IS NOT NULL($1)])
CalciteLogicalIndexScan(table=[[OpenSearch, test]])
== Physical ==
CalciteEnumerableIndexScan(table=[[OpenSearch, test]], PushDownContext=[[PROJECT->[rec, rec.inner], AGGREGATION->rel#953:LogicalAggregate.NONE.[](input=LogicalProject#951,group={0},agg#0=LITERAL_AGG(1)), LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={
"from": 0,
"size": 0,
"timeout": "1m",
"_source": {
"includes": [
"rec",
"rec.inner"
],
"excludes": []
},
"aggregations": {
"composite_buckets": {
"composite": {
"size": 10000,
"sources": [
{
"rec.inner": {
"terms": {
"field": "rec.inner",
"missing_bucket": false,
"order": "asc"
}
}
}
]
},
"aggregations": {
"$f1": {
"top_hits": {
"from": 0,
"size": 1,
"version": false,
"seq_no_primary_term": false,
"explain": false,
"_source": {
"includes": [
"rec.inner",
"rec"
],
"excludes": []
},
"script_fields": {}
}
}
}
}
}
}, requestedTotalSize=10000, pageSize=null, startFrom=0)])What is the expected behavior?
opensearchsql> source=test | top rec.inner;
fetched rows / total rows = 2/2
+-----------+-------+
| rec.inner | count |
|-----------+-------|
| 101 | 2 |
| 100 | 1 |
+-----------+-------+What is your host/environment?
- OS: Linux 5.10
- Version: mainline, 3.3.2
- Plugins: Yours truly
Do you have any screenshots?
N/A
Do you have any additional context?
N/A
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working
Type
Projects
Status
Done