Add XContentFieldFilter (#81970) by MitchLewis930 · Pull Request #3 · Signal65/elasticsearch-CodeRabbit

MitchLewis930 · 2026-01-29T05:29:54Z

PR_013

Summary by CodeRabbit

Release Notes

Performance
- Optimized field filtering mechanism in document retrieval to improve performance when applying field inclusion and exclusion options to search results
- Enhanced source content handling to maintain efficient filtering performance across different serialization formats and compression states

This commit introduces XContentFieldFilter, which applies field includes/excludes to XContent without having to realise the XContent itself as a Java map. SourceFieldMapper and ShardGetService are cut over to use this class.

coderabbitai · 2026-01-29T05:30:38Z

Walkthrough

This change introduces a new XContentFieldFilter interface for standardizing field filtering from BytesReference sources, adds a deprecated helper method for content-type inference, and refactors existing field-filtering logic in ShardGetService and SourceFieldMapper to use this new abstraction instead of Map-based approaches.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Field Filter Abstraction `server/src/main/java/org/elasticsearch/common/xcontent/XContentFieldFilter.java` | New public interface defining field filtering contract with `apply()` method and static factory `newFieldFilter()` that creates filters optimized for wildcard-based or direct parsing approaches. |
| Helper Methods `server/src/main/java/org/elasticsearch/common/xcontent/XContentHelper.java` | Added deprecated `xContentTypeMayCompressed()` helper method for inferring content type from potentially compressed `BytesReference` with fallback to decompression and type detection. |
| Filter Integration `server/src/main/java/org/elasticsearch/index/get/ShardGetService.java`, `server/src/main/java/org/elasticsearch/index/mapper/SourceFieldMapper.java` | Refactored source filtering logic to use `XContentFieldFilter` instead of Map-based conversions, eliminating intermediate Map creation and delegating field filtering to the new abstraction. |
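
To make the include/exclude decision mentioned in the summary concrete, here is a minimal sketch of a path predicate in plain Java. It is not the code from this PR and not Elasticsearch's matching logic: the class name `FieldPathFilterSketch`, the helper names, and the assumed rules (an empty include list keeps everything, an exclude match always wins, `*` matches any run of characters) are all illustrative assumptions.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; names and matching rules are assumptions, not the PR's implementation.
class FieldPathFilterSketch {
    // Converts a glob such as "user.*" into a regex. Only '*' is treated as special here.
    static Pattern glob(String pattern) {
        StringBuilder regex = new StringBuilder();
        boolean first = true;
        for (String literal : pattern.split("\\*", -1)) {
            if (first == false) {
                regex.append(".*"); // each '*' in the glob becomes "any characters"
            }
            regex.append(Pattern.quote(literal)); // everything else is matched literally
            first = false;
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matchesAny(List<String> patterns, String path) {
        return patterns.stream().anyMatch(p -> glob(p).matcher(path).matches());
    }

    // Assumed rule: keep a path if it is included (or no includes were given) and not excluded.
    static boolean keep(String path, List<String> includes, List<String> excludes) {
        boolean included = includes.isEmpty() || matchesAny(includes, path);
        return included && matchesAny(excludes, path) == false;
    }

    public static void main(String[] args) {
        List<String> includes = List.of("user.*");
        List<String> excludes = List.of("*.password");
        for (String path : List.of("user.name", "user.password", "message")) {
            System.out.println(path + " kept: " + keep(path, includes, excludes));
        }
    }
}
```

A predicate of this shape can be evaluated per field name while tokens are copied from a parser to a builder, which is the general idea behind avoiding an intermediate map.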

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A filter so neat, with fields to refine,
No maps in between, just bytes so fine!
The interface blooms, abstraction takes flight,
Filtering fields with algorithmic delight! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: introducing a new XContentFieldFilter interface/class. It is specific, directly related to the primary code addition, and provides meaningful context for scanning commit history. |

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In
`@server/src/main/java/org/elasticsearch/common/xcontent/XContentFieldFilter.java`:
- Around line 65-72: The code in XContentFieldFilter uses
xContentTypeMayCompressed(...) but then feeds the original (possibly compressed)
bytes into the parser and leaves parser/builder unclosed; replace the manual
parser creation with XContentHelper.createParser(parserConfig, originalSource)
which handles compressed inputs, and wrap both the XContentParser and
XContentBuilder in a try-with-resources to ensure they are closed; call
builder.copyCurrentStructure(parser) inside the try block and return the bytes
from the builder/stream after closing (e.g., use BytesReference.bytes(builder)
or bytes from the stream output) so resources are released properly.

In `@server/src/main/java/org/elasticsearch/common/xcontent/XContentHelper.java`:
- Around line 521-545: The method xContentTypeMayCompressed leaks the
decompression InputStream created by
Compressor.threadLocalInputStream(bytes.streamInput()); wrap the creation of
compressedStreamInput (and the BufferedInputStream wrapper) in a
try-with-resources block so the stream is closed after calling
XContentFactory.xContentType(compressedStreamInput), preserving the existing
IOException handling and semantics; update xContentTypeMayCompressed to obtain
the XContentType inside the try-with-resources and return it (or rethrow as
UncheckedIOException as now) so compressor buffers are released reliably.

🧹 Nitpick comments (1)

server/src/main/java/org/elasticsearch/common/xcontent/XContentFieldFilter.java (1)

60-63: Reconsider Set.of() for user-supplied arrays that may contain duplicates.

Set.of(includes) and Set.of(excludes) throw IllegalArgumentException if the arrays contain duplicate field names. FetchSourceContext prevents null arrays, but duplicates are still possible if a user specifies the same field more than once. Consider Set.copyOf(Arrays.asList(...)) (Java 10+; Set.copyOf takes a Collection, not an array) to deduplicate silently, or a HashSet for explicit control, depending on whether duplicate field filters should be an error or be silently deduplicated.
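
The difference can be checked with the JDK alone. The sketch below uses a hypothetical class `DuplicateFieldNames` and made-up field names; only `Set.of`, `Set.copyOf`, and `Arrays.asList` are real APIs.

```java
import java.util.Arrays;
import java.util.Set;

// Hypothetical demo class; the field names are illustrative only.
class DuplicateFieldNames {
    // Same shape as the flagged code: the varargs Set.of rejects duplicate elements.
    static Set<String> strict(String[] fields) {
        return Set.of(fields);
    }

    // Set.copyOf takes a Collection and collapses duplicates instead of throwing.
    static Set<String> lenient(String[] fields) {
        return Set.copyOf(Arrays.asList(fields));
    }

    public static void main(String[] args) {
        String[] includes = { "user.name", "user.name", "message" };
        try {
            strict(includes);
            System.out.println("Set.of accepted the array");
        } catch (IllegalArgumentException e) {
            System.out.println("Set.of rejected the array: " + e.getMessage());
        }
        System.out.println("Set.copyOf kept " + lenient(includes).size() + " distinct names");
    }
}
```

Both factories still reject null elements with a NullPointerException, so the choice only changes how duplicates are treated.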

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c3a3f0 and 4560a0c.

📒 Files selected for processing (4)

server/src/main/java/org/elasticsearch/common/xcontent/XContentFieldFilter.java
server/src/main/java/org/elasticsearch/common/xcontent/XContentHelper.java
server/src/main/java/org/elasticsearch/index/get/ShardGetService.java
server/src/main/java/org/elasticsearch/index/mapper/SourceFieldMapper.java

🔇 Additional comments (2)

server/src/main/java/org/elasticsearch/index/mapper/SourceFieldMapper.java (1)

36-43: LGTM: filtering is now centralized via XContentFieldFilter.

This simplifies the apply path and keeps filtering byte-based without local map reconstruction.

Also applies to: 139-143, 173-178

server/src/main/java/org/elasticsearch/index/get/ShardGetService.java (1)

255-256: LGTM: request-level filtering now routes through XContentFieldFilter.

Keeps the fetch path consistent with the new abstraction while preserving error handling.

coderabbitai · 2026-01-29T05:41:48Z

server/src/main/java/org/elasticsearch/common/xcontent/XContentFieldFilter.java

+                if (contentType == null) {
+                    contentType = XContentHelper.xContentTypeMayCompressed(originalSource);
+                }
+                BytesStreamOutput streamOutput = new BytesStreamOutput(Math.min(1024, originalSource.length()));
+                XContentBuilder builder = new XContentBuilder(contentType.xContent(), streamOutput);
+                XContentParser parser = contentType.xContent().createParser(parserConfig, originalSource.streamInput());
+                builder.copyCurrentStructure(parser);
+                return BytesReference.bytes(builder);


⚠️ Potential issue | 🟠 Major

Parse compressed sources and close parser/builder.

xContentTypeMayCompressed(...) only infers type; the parser still reads the compressed bytes, which will fail when source is compressed. Also, the parser/builder aren’t closed. Use XContentHelper.createParser(...) and try-with-resources.

🔧 Proposed fix

-            return (originalSource, contentType) -> {
-                if (contentType == null) {
-                    contentType = XContentHelper.xContentTypeMayCompressed(originalSource);
-                }
-                BytesStreamOutput streamOutput = new BytesStreamOutput(Math.min(1024, originalSource.length()));
-                XContentBuilder builder = new XContentBuilder(contentType.xContent(), streamOutput);
-                XContentParser parser = contentType.xContent().createParser(parserConfig, originalSource.streamInput());
-                builder.copyCurrentStructure(parser);
-                return BytesReference.bytes(builder);
-            };
+            return (originalSource, contentType) -> {
+                if (contentType == null) {
+                    contentType = XContentHelper.xContentTypeMayCompressed(originalSource);
+                }
+                BytesStreamOutput streamOutput = new BytesStreamOutput(Math.min(1024, originalSource.length()));
+                try (
+                    XContentParser parser = XContentHelper.createParser(parserConfig, originalSource, contentType);
+                    XContentBuilder builder = new XContentBuilder(contentType.xContent(), streamOutput)
+                ) {
+                    builder.copyCurrentStructure(parser);
+                    return BytesReference.bytes(builder);
+                }
+            };
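
The underlying concern, that compressed bytes cannot be handed to a parser as-is, can be illustrated without Elasticsearch. The sketch below is an analogy only: it assumes the JDK's zlib streams as a stand-in for Elasticsearch's compressor, and `CompressedSourceSketch` and `looksLikeJsonObject` are hypothetical names standing in for a real parser's first check.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch; JDK zlib stands in for Elasticsearch's compressor.
class CompressedSourceSketch {
    static byte[] deflate(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the DeflaterOutputStream finishes the compressed stream.
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(out)) {
            deflater.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] inflate(byte[] compressed) {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for a JSON parser's first check: an object must start with '{'.
    static boolean looksLikeJsonObject(byte[] bytes) {
        return bytes.length > 0 && bytes[0] == '{';
    }

    public static void main(String[] args) {
        byte[] source = "{\"user\":\"kimchy\"}".getBytes(StandardCharsets.UTF_8);
        byte[] stored = deflate(source);
        System.out.println("compressed bytes look like JSON: " + looksLikeJsonObject(stored));
        System.out.println("inflated bytes look like JSON:   " + looksLikeJsonObject(inflate(stored)));
    }
}
```

Knowing the content type is therefore not enough on its own; the bytes also have to be decompressed before they reach the parser, which is what the proposed fix delegates to XContentHelper.createParser.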

coderabbitai · 2026-01-29T05:41:48Z

server/src/main/java/org/elasticsearch/common/xcontent/XContentHelper.java

+    /**
+     * Guesses the content type based on the provided bytes which may be compressed.
+     *
+     * @deprecated the content type should not be guessed except for few cases where we effectively don't know the content type.
+     * The REST layer should move to reading the Content-Type header instead. There are other places where auto-detection may be needed.
+     * This method is deprecated to prevent usages of it from spreading further without specific reasons.
+     */
+    @Deprecated
+    public static XContentType xContentTypeMayCompressed(BytesReference bytes) {
+        Compressor compressor = CompressorFactory.compressor(bytes);
+        if (compressor != null) {
+            try {
+                InputStream compressedStreamInput = compressor.threadLocalInputStream(bytes.streamInput());
+                if (compressedStreamInput.markSupported() == false) {
+                    compressedStreamInput = new BufferedInputStream(compressedStreamInput);
+                }
+                return XContentFactory.xContentType(compressedStreamInput);
+            } catch (IOException e) {
+                assert false : "Should not happen, we're just reading bytes from memory";
+                throw new UncheckedIOException(e);
+            }
+        } else {
+            return XContentHelper.xContentType(bytes);
+        }
+    }


⚠️ Potential issue | 🟠 Major

Close the decompression stream to avoid leaking buffers.

compressor.threadLocalInputStream(...) is never closed here, unlike other call sites in this class. Wrap it in try-with-resources so compressor buffers are released reliably.

🔧 Proposed fix

-        if (compressor != null) {
-            try {
-                InputStream compressedStreamInput = compressor.threadLocalInputStream(bytes.streamInput());
-                if (compressedStreamInput.markSupported() == false) {
-                    compressedStreamInput = new BufferedInputStream(compressedStreamInput);
-                }
-                return XContentFactory.xContentType(compressedStreamInput);
+        if (compressor != null) {
+            try (
+                InputStream raw = compressor.threadLocalInputStream(bytes.streamInput());
+                InputStream in = raw.markSupported() ? raw : new BufferedInputStream(raw)
+            ) {
+                return XContentFactory.xContentType(in);
             } catch (IOException e) {
                 assert false : "Should not happen, we're just reading bytes from memory";
                 throw new UncheckedIOException(e);
             }
         } else {
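
The two-resource pattern in the proposed fix can be exercised with plain JDK streams. The sketch below is an assumption-laden stand-in: `SniffAndCloseSketch`, `TrackingStream`, and `firstByte` are hypothetical names, and peeking at one byte with mark/reset only approximates what a content-type detector does.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch of the try-with-resources shape used in the proposed fix.
class SniffAndCloseSketch {
    // Test double: reports no mark support and records whether close() was called.
    static final class TrackingStream extends FilterInputStream {
        boolean closed;

        TrackingStream(InputStream in) {
            super(in);
        }

        @Override
        public boolean markSupported() {
            return false; // forces the BufferedInputStream branch below
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Peeks at the first byte without consuming it, then closes the stream(s).
    static int firstByte(InputStream source) {
        try (
            InputStream raw = source;
            InputStream in = raw.markSupported() ? raw : new BufferedInputStream(raw)
        ) {
            in.mark(1);
            int b = in.read();
            in.reset();
            return b;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TrackingStream stream = new TrackingStream(new ByteArrayInputStream(new byte[] { '{', '}' }));
        System.out.println("first byte: " + (char) firstByte(stream));
        System.out.println("stream closed: " + stream.closed);
    }
}
```

Resources are closed in reverse declaration order, so the wrapper is closed before the raw stream. When the raw stream already supports mark, both variables refer to the same object and close() runs twice, which is harmless only if close() is idempotent; that is worth confirming for the compressor's stream before adopting this exact shape.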

Add XContentFieldFilter (elastic#81970)

4560a0c

MitchLewis930 wants to merge 1 commit into pr_013_before from pr_013_after
