Add I/O statistics on Linux #5

Open
MitchLewis930 wants to merge 1 commit into pr_015_before from pr_015_after

Conversation

@MitchLewis930 MitchLewis930 commented Jan 29, 2026

PR_015

Summary by CodeRabbit

  • New Features

    • Added Linux-specific filesystem I/O statistics to the nodes stats API, including per-device and aggregated metrics for operations, reads, writes, and data transferred.
  • Documentation

    • Updated API documentation with details on new I/O statistics fields and metrics available on Linux.

✏️ Tip: You can customize this high-level summary in your review settings.

This commit adds a variety of real disk metrics for the block devices
that back Elasticsearch data paths. A collection of statistics is read
from /proc/diskstats and used to report the raw metrics for
operations and read/write bytes.

Relates elastic#15915
coderabbitai bot commented Jan 29, 2026

Walkthrough

This PR introduces Linux-specific device I/O statistics collection to Elasticsearch's filesystem monitoring. It adds device metadata tracking via /proc/self/mountinfo, collects I/O metrics from /proc/diskstats, and exposes aggregated and per-device statistics through a new IoStats data structure with caching integration.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Device metadata tracking: core/src/main/java/org/elasticsearch/env/ESFileStore.java, core/src/main/java/org/elasticsearch/env/NodeEnvironment.java | Added majorDeviceNumber and minorDeviceNumber fields to ESFileStore via /proc/self/mountinfo parsing and exposed them in NodePath constructor through lucene:major_device_number and lucene:minor_device_number attributes. |
| I/O statistics data structures: core/src/main/java/org/elasticsearch/monitor/fs/FsInfo.java | Introduced public nested classes DeviceStats and IoStats for per-device and aggregated I/O metrics with Writeable/ToXContent support; updated FsInfo constructor and serialization to include an optional IoStats field. |
| I/O statistics collection: core/src/main/java/org/elasticsearch/monitor/fs/FsProbe.java, core/src/main/java/org/elasticsearch/monitor/fs/FsService.java | Added Linux-only I/O stats computation by parsing /proc/diskstats with device filtering; updated FsProbe.stats() signature to accept previous FsInfo and enhanced FsService caching with error handling via new stats() helper method. |
| Service cleanup: core/src/main/java/org/elasticsearch/monitor/MonitorService.java | Removed unused private service field declarations (JvmGcMonitorService, OsService, ProcessService, JvmService, FsService). |
| Security and permissions: core/src/main/resources/org/elasticsearch/bootstrap/security.policy | Added read permission for /proc/diskstats to enable I/O statistics access on Linux. |
| Tests: core/src/test/java/org/elasticsearch/cluster/DiskUsageTests.java, core/src/test/java/org/elasticsearch/monitor/fs/DeviceStatsTests.java, core/src/test/java/org/elasticsearch/monitor/fs/FsProbeTests.java | Updated FsInfo constructor calls to include null IoStats parameter; added DeviceStatsTests unit test; added comprehensive testIoStats() with mocked /proc/diskstats data and conditional assertions for Linux-specific metrics. |
| Documentation: docs/reference/cluster/nodes-stats.asciidoc | Documented new Linux-specific fs.io_stats fields including per-device metrics (device_name, operations, read/write operations and kilobytes) and aggregated totals. |
| Test framework: test/framework/src/main/java/org/elasticsearch/cluster/MockInternalClusterInfoService.java | Updated FsInfo constructor invocation to pass null IoStats parameter as second argument. |

Sequence Diagram

sequenceDiagram
    participant Client as Client / API
    participant FSService as FsService
    participant FSProbe as FsProbe
    participant ESFileStore as ESFileStore
    participant ProcFS as /proc (Linux)
    participant Cache as Stats Cache
    
    Client->>FSService: getStats()
    FSService->>Cache: check cached value
    alt Cache expired
        FSService->>FSProbe: stats(previousFsInfo)
        FSProbe->>ESFileStore: get device numbers
        ESFileStore->>ProcFS: read /proc/self/mountinfo
        ProcFS-->>ESFileStore: major/minor device numbers
        FSProbe->>ProcFS: read /proc/diskstats
        ProcFS-->>FSProbe: disk statistics
        FSProbe->>FSProbe: compute IoStats (delta from previous)
        FSProbe-->>FSService: new FsInfo with IoStats
        FSService->>Cache: store FsInfo
    end
    Cache-->>FSService: FsInfo
    FSService-->>Client: FsInfo with io_stats

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 Hop, skip, and a device number jump!
Counting sectors, reads, and writes in a lump,
From /proc we gather each disk's busy tale,
I/O stats blooming, a metric-filled trail! 📊

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 4.55% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding I/O statistics collection specifically for Linux systems, which is the primary focus of the changeset. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@MitchLewis930
Author

@CodeRabbit full review

coderabbitai bot commented Jan 29, 2026

✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/src/main/java/org/elasticsearch/env/ESFileStore.java (1)

48-78: Initialize device numbers to -1 when mountinfo lookup fails or on non‑Linux.

Right now majorDeviceNumber/minorDeviceNumber remain 0 if the mount point isn’t found (or on non‑Linux), which can be interpreted as a real device. Defaulting to -1 and breaking on the first match avoids false positives.

🛠️ Suggested fix
         if (Constants.LINUX) {
             try {
                 spins = IOUtils.spins(PathUtils.get(getMountPointLinux(in)));
             } catch (Exception e) {
                 spins = null;
             }
+            majorDeviceNumber = -1;
+            minorDeviceNumber = -1;
             try {
-                final List<String> lines = Files.readAllLines(PathUtils.get("/proc/self/mountinfo"));
+                final String mountPoint = getMountPointLinux(in);
+                final List<String> lines = Files.readAllLines(PathUtils.get("/proc/self/mountinfo"));
                 for (final String line : lines) {
                     final String[] fields = line.trim().split("\\s+");
-                    final String mountPoint = fields[4];
-                    if (mountPoint.equals(getMountPointLinux(in))) {
+                    if (fields.length > 4 && mountPoint.equals(fields[4])) {
                         final String[] deviceNumbers = fields[2].split(":");
                         majorDeviceNumber = Integer.parseInt(deviceNumbers[0]);
                         minorDeviceNumber = Integer.parseInt(deviceNumbers[1]);
+                        break;
                     }
                 }
             } catch (Exception e) {
                 majorDeviceNumber = -1;
                 minorDeviceNumber = -1;
             }
         } else {
             spins = null;
+            majorDeviceNumber = -1;
+            minorDeviceNumber = -1;
         }
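
The lookup in the suggested fix can be tried outside Elasticsearch with a small standalone sketch. It assumes the /proc/self/mountinfo layout in which the third field is major:minor and the fifth field is the mount point. The class and method names (MountInfoSketch, deviceNumbersFor) are hypothetical and are not part of the PR.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of the PR: resolves device numbers for a mount point. */
final class MountInfoSketch {

    /** Returns {major, minor}, or {-1, -1} when no usable line matches the mount point. */
    static int[] deviceNumbersFor(String mountPoint, List<String> mountInfoLines) {
        for (String line : mountInfoLines) {
            String[] fields = line.trim().split("\\s+");
            // fields[2] is "major:minor", fields[4] is the mount point
            if (fields.length > 4 && mountPoint.equals(fields[4])) {
                String[] numbers = fields[2].split(":");
                if (numbers.length != 2) {
                    continue; // malformed device field, keep scanning
                }
                try {
                    // first match wins, mirroring the "break" in the suggested fix
                    return new int[] { Integer.parseInt(numbers[0]), Integer.parseInt(numbers[1]) };
                } catch (NumberFormatException e) {
                    // non-numeric device field, keep scanning
                }
            }
        }
        return new int[] { -1, -1 }; // sentinel: no device found
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "22 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw",
            "35 22 8:17 / /data rw,relatime shared:2 - ext4 /dev/sdb1 rw");
        System.out.println(Arrays.toString(deviceNumbersFor("/data", lines)));
        System.out.println(Arrays.toString(deviceNumbersFor("/missing", lines)));
    }
}
```

Returning the -1 sentinel from a single place keeps the "not found" and "parse failed" cases indistinguishable from the non-Linux case, which is what the comment above asks for.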
🤖 Fix all issues with AI agents
In `@core/src/main/java/org/elasticsearch/monitor/fs/FsInfo.java`:
- Around line 389-405: In IoStats.toXContent (inside FsInfo) the
builder.startObject("total") call is not closed: after writing totalOperations,
totalReadOperations, totalWriteOperations, totalReadKilobytes and
totalWriteKilobytes you must call builder.endObject() to close the "total"
object; locate the block that iterates devicesStats and the code that starts the
"total" object (references: devicesStats, DeviceStats, totalOperations,
totalReadOperations, totalWriteOperations, totalReadKilobytes,
totalWriteKilobytes) and insert a builder.endObject() immediately after those
field() calls so the JSON structure is balanced.

In `@core/src/main/java/org/elasticsearch/monitor/fs/FsService.java`:
- Around line 67-80: The refresh() implementation in FsInfoCache is incorrectly
passing the fixed initialValue to stats(...) causing deltas to be computed
against the first sample; change the design so refresh() uses the current cached
FsInfo instead of initialValue: add a protected accessor in SingleObjectCache
(e.g., protected T getNoRefresh() or getCached()) that returns the current
cached value without triggering a refresh, then update FsInfoCache.refresh() to
call stats(probe, getNoRefresh(), logger) (or, alternatively, add a
previousField in FsInfoCache and update it on each successful refresh) and
ensure any added accessor maintains the existing concurrency/visibility
guarantees of SingleObjectCache.

In `@docs/reference/cluster/nodes-stats.asciidoc`:
- Around line 124-171: The docs incorrectly describe fs.io_stats fields as
"averages between probes" or "since starting Elasticsearch" but the
implementation reports deltas between consecutive snapshots; update the wording
for fs.io_stats.devices and its children (fs.io_stats.devices.device_name,
.operations, .read_operations, .write_operations, .read_kilobytes,
.write_kilobytes) and the aggregate fields (fs.io_stats.operations,
fs.io_stats.read_operations, fs.io_stats.write_operations,
fs.io_stats.read_kilobytes, fs.io_stats.write_kilobytes) to state these values
represent deltas over the probe interval (i.e., change since the previous
probe), and clarify that aggregate fields are computed from per-device deltas
rather than being cumulative since node start or averaged.
🧹 Nitpick comments (1)
core/src/main/java/org/elasticsearch/monitor/fs/FsProbe.java (1)

83-108: Add defensive bounds checking for /proc/diskstats parsing to skip malformed lines instead of failing entirely.

The code accesses array indices 0–9 to parse /proc/diskstats fields (major, minor, device name, reads completed, sectors read, writes completed, sectors written). While the /proc/diskstats format defines at least 20 fields on modern Linux kernels, no explicit bounds check prevents ArrayIndexOutOfBoundsException if a line has fewer fields. Currently, such an exception propagates to the generic catch block, causing the entire method to return null and lose IO statistics for all devices.

Skipping malformed lines gracefully is more robust:

♻️ Proposed defensive parsing
             for (String line : lines) {
-                String fields[] = line.trim().split("\\s+");
+                String[] fields = line.trim().split("\\s+");
+                if (fields.length < 10) {
+                    continue; // skip malformed lines
+                }
                 final int majorDeviceNumber = Integer.parseInt(fields[0]);
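
As a rough illustration of the defensive parsing proposed above, the following standalone sketch skips short or non-numeric lines instead of aborting the whole read. It assumes the field positions named in this comment (indices 0 to 9 of a /proc/diskstats line). DiskStatsSketch and Sample are hypothetical names, not classes from the PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser, not part of the PR: tolerant reading of /proc/diskstats lines. */
final class DiskStatsSketch {

    /** Raw counters for one device, as read from a single line. */
    static final class Sample {
        final int major;
        final int minor;
        final String name;
        final long readsCompleted;
        final long sectorsRead;
        final long writesCompleted;
        final long sectorsWritten;

        Sample(int major, int minor, String name, long readsCompleted,
               long sectorsRead, long writesCompleted, long sectorsWritten) {
            this.major = major;
            this.minor = minor;
            this.name = name;
            this.readsCompleted = readsCompleted;
            this.sectorsRead = sectorsRead;
            this.writesCompleted = writesCompleted;
            this.sectorsWritten = sectorsWritten;
        }
    }

    static List<Sample> parse(List<String> lines) {
        List<Sample> samples = new ArrayList<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            if (f.length < 10) {
                continue; // too few fields: skip this line only
            }
            try {
                samples.add(new Sample(
                    Integer.parseInt(f[0]), Integer.parseInt(f[1]), f[2],
                    Long.parseLong(f[3]),   // reads completed
                    Long.parseLong(f[5]),   // sectors read
                    Long.parseLong(f[7]),   // writes completed
                    Long.parseLong(f[9]))); // sectors written
            } catch (NumberFormatException e) {
                // non-numeric counter: skip this line only
            }
        }
        return samples;
    }
}
```

With this shape, one malformed line costs one device's sample rather than the statistics for every device.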
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4003d3f and ecce53f.

📒 Files selected for processing (12)
  • core/src/main/java/org/elasticsearch/env/ESFileStore.java
  • core/src/main/java/org/elasticsearch/env/NodeEnvironment.java
  • core/src/main/java/org/elasticsearch/monitor/MonitorService.java
  • core/src/main/java/org/elasticsearch/monitor/fs/FsInfo.java
  • core/src/main/java/org/elasticsearch/monitor/fs/FsProbe.java
  • core/src/main/java/org/elasticsearch/monitor/fs/FsService.java
  • core/src/main/resources/org/elasticsearch/bootstrap/security.policy
  • core/src/test/java/org/elasticsearch/cluster/DiskUsageTests.java
  • core/src/test/java/org/elasticsearch/monitor/fs/DeviceStatsTests.java
  • core/src/test/java/org/elasticsearch/monitor/fs/FsProbeTests.java
  • docs/reference/cluster/nodes-stats.asciidoc
  • test/framework/src/main/java/org/elasticsearch/cluster/MockInternalClusterInfoService.java
🧰 Additional context used
🧬 Code graph analysis (3)
core/src/main/java/org/elasticsearch/monitor/fs/FsProbe.java (1)
core/src/main/java/org/elasticsearch/env/NodeEnvironment.java (2)
  • NodeEnvironment (78-941)
  • NodePath (79-131)
core/src/test/java/org/elasticsearch/monitor/fs/FsProbeTests.java (1)
core/src/main/java/org/elasticsearch/env/NodeEnvironment.java (1)
  • NodeEnvironment (78-941)
core/src/main/java/org/elasticsearch/monitor/fs/FsService.java (3)
core/src/main/java/org/elasticsearch/common/logging/ESLogger.java (1)
  • ESLogger (30-206)
core/src/main/java/org/elasticsearch/common/settings/Setting.java (1)
  • Setting (76-970)
core/src/main/java/org/elasticsearch/common/unit/TimeValue.java (1)
  • TimeValue (37-345)
🔇 Additional comments (21)
core/src/main/resources/org/elasticsearch/bootstrap/security.policy (1)

127-129: LGTM: scoped /proc/diskstats read permission.

This is the minimal permission needed for Linux I/O stats and is narrowly scoped to the exact proc path.

core/src/test/java/org/elasticsearch/cluster/DiskUsageTests.java (2)

203-208: LGTM: FsInfo constructor updated for IoStats parameter.

Passing null here is consistent with the new signature for tests that don't exercise IO stats.


244-249: LGTM: FsInfo constructor updated for IoStats parameter.

This keeps test setup aligned with the new FsInfo API.

core/src/main/java/org/elasticsearch/env/ESFileStore.java (1)

250-256: LGTM: lucene device attributes exposed.

This cleanly surfaces device metadata to consumers via the lucene attribute view.

core/src/main/java/org/elasticsearch/env/NodeEnvironment.java (1)

90-104: LGTM: NodePath captures device numbers from lucene attributes.

This aligns NodePath metadata with the new ESFileStore attributes.

core/src/main/java/org/elasticsearch/monitor/fs/FsInfo.java (6)

191-323: LGTM: DeviceStats delta computation and serialization.

The structure cleanly carries current/previous counters and exposes derived fields.


411-419: LGTM: FsInfo now carries IoStats.

The new field and constructor wiring are straightforward.


426-439: Verify stream compatibility for the new ioStats field.

FsInfo is serialized over the wire; adding an unconditional optional field can break mixed‑version clusters unless it’s gated by stream version. Please confirm version checks are in place (or that BWC isn’t required for this branch).
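
For readers unfamiliar with version gating, here is a minimal sketch of the idea using plain java.io streams rather than Elasticsearch's stream classes. The version constant, the method names, and the single long standing in for IoStats are all hypothetical; the real check would depend on how the branch identifies the peer's version.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical illustration, not the PR's code: an optional field gated on the peer version. */
final class VersionGatedFieldSketch {

    /** Assumed first wire version that understands the new field. */
    static final int VERSION_WITH_IO_STATS = 2;

    static byte[] write(int peerVersion, long timestamp, Long ioOperations) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeLong(timestamp);
            if (peerVersion >= VERSION_WITH_IO_STATS) {
                // presence flag followed by the optional value
                out.writeBoolean(ioOperations != null);
                if (ioOperations != null) {
                    out.writeLong(ioOperations);
                }
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Long readIoOperations(int peerVersion, byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            in.readLong(); // timestamp
            if (peerVersion >= VERSION_WITH_IO_STATS && in.readBoolean()) {
                return in.readLong();
            }
            return null; // older peer, or field absent
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An older peer never sees the extra bytes, so its reader stays aligned with the stream; without the gate, the presence flag would be read as part of whatever field follows.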


471-473: LGTM: IoStats accessor added.


491-495: LGTM: io_stats included in fs XContent output.


500-506: LGTM: io_stats field key added.

test/framework/src/main/java/org/elasticsearch/cluster/MockInternalClusterInfoService.java (1)

77-77: LGTM!

The null IoStats parameter correctly aligns with the updated FsInfo constructor signature. For mock/test purposes, omitting IO stats is appropriate since this creates fake disk usage data.

core/src/main/java/org/elasticsearch/monitor/fs/FsProbe.java (3)

22-38: LGTM!

Imports are appropriate for the new Linux-specific IO stats functionality.


49-69: LGTM!

The updated stats() method correctly:

  • Accepts a previous FsInfo for delta calculations
  • Guards IO stats collection behind Constants.LINUX
  • Filters out invalid device numbers (-1)

120-123: LGTM!

The method is appropriately:

  • Package-private for test overriding
  • Annotated with @SuppressForbidden for legitimate system file access
  • Simple and focused
core/src/test/java/org/elasticsearch/monitor/fs/DeviceStatsTests.java (1)

28-60: LGTM!

The test correctly validates DeviceStats delta calculations:

  • Operations sum (reads + writes)
  • Kilobyte conversion (sectors / 2, since sectors are 512 bytes)

Consider adding edge case tests in the future for scenarios like counter wrap-around or missing previous stats, but this is adequate for initial coverage.
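
The arithmetic being tested can be sketched in a few lines. This is an illustration only: it assumes 512-byte sectors as stated above, and it assumes that a previous value of -1 means "no earlier sample" and yields -1 for the derived metric. DeviceDeltaSketch and its methods are hypothetical names.

```java
/** Hypothetical illustration of the delta arithmetic; not the PR's DeviceStats class. */
final class DeviceDeltaSketch {

    /** Marker for "no previous sample" (assumed convention). */
    static final long NO_PREVIOUS = -1;

    /** Reads plus writes completed between two samples. */
    static long operations(long readsNow, long readsBefore, long writesNow, long writesBefore) {
        if (readsBefore == NO_PREVIOUS || writesBefore == NO_PREVIOUS) {
            return NO_PREVIOUS;
        }
        return (readsNow - readsBefore) + (writesNow - writesBefore);
    }

    /** Kilobytes transferred between two samples, given cumulative sector counters. */
    static long kilobytes(long sectorsNow, long sectorsBefore) {
        if (sectorsBefore == NO_PREVIOUS) {
            return NO_PREVIOUS;
        }
        return (sectorsNow - sectorsBefore) / 2; // two 512-byte sectors per kilobyte
    }
}
```

For example, counters moving from 100 to 150 reads and from 40 to 60 writes give 70 operations, and 2048 additional sectors correspond to 1024 kilobytes.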

core/src/test/java/org/elasticsearch/monitor/fs/FsProbeTests.java (2)

48-69: LGTM!

The test correctly validates:

  • Linux path: IoStats presence, DeviceStats initialization with -1 for previous values
  • Non-Linux path: IoStats is null

Good platform-conditional test coverage.


89-175: LGTM!

Excellent test coverage that validates:

  • Mock /proc/diskstats injection via subclass override
  • Two-sample delta calculation (first sample with no previous, second with previous)
  • Per-device stats accuracy
  • Aggregate totals correctness

The test thoroughly exercises the ioStats() method's parsing and calculation logic.

core/src/main/java/org/elasticsearch/monitor/fs/FsService.java (2)

46-52: LGTM!

Constructor correctly initializes the cache with an initial probe reading (passing null for first sample).


58-65: LGTM!

The static helper appropriately encapsulates error handling with debug-level logging, preventing probe failures from crashing the service.

core/src/main/java/org/elasticsearch/monitor/MonitorService.java (1)

81-82: LGTM!

No functional changes in this file - only a trailing blank line formatting adjustment.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +389 to +405
    @Override
    public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
        if (devicesStats.length > 0) {
            builder.startArray("devices");
            for (DeviceStats deviceStats : devicesStats) {
                builder.startObject();
                deviceStats.toXContent(builder, params);
                builder.endObject();
            }
            builder.endArray();
            builder.startObject("total");
            builder.field(OPERATIONS, totalOperations);
            builder.field(READ_OPERATIONS, totalReadOperations);
            builder.field(WRITE_OPERATIONS, totalWriteOperations);
            builder.field(READ_KILOBYTES, totalReadKilobytes);
            builder.field(WRITE_KILOBYTES, totalWriteKilobytes);
        }
⚠️ Potential issue | 🟠 Major

Close the total object in IoStats.toXContent.

builder.startObject("total") is never closed, which leaves the JSON structure unbalanced and causes the outer io_stats/fs objects to close incorrectly.

🛠️ Suggested fix
             builder.startObject("total");
             builder.field(OPERATIONS, totalOperations);
             builder.field(READ_OPERATIONS, totalReadOperations);
             builder.field(WRITE_OPERATIONS, totalWriteOperations);
             builder.field(READ_KILOBYTES, totalReadKilobytes);
             builder.field(WRITE_KILOBYTES, totalWriteKilobytes);
+            builder.endObject();
         }
         return builder;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    @Override
    public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
        if (devicesStats.length > 0) {
            builder.startArray("devices");
            for (DeviceStats deviceStats : devicesStats) {
                builder.startObject();
                deviceStats.toXContent(builder, params);
                builder.endObject();
            }
            builder.endArray();
            builder.startObject("total");
            builder.field(OPERATIONS, totalOperations);
            builder.field(READ_OPERATIONS, totalReadOperations);
            builder.field(WRITE_OPERATIONS, totalWriteOperations);
            builder.field(READ_KILOBYTES, totalReadKilobytes);
            builder.field(WRITE_KILOBYTES, totalWriteKilobytes);
            builder.endObject();
        }

Comment on lines 67 to +80
     private class FsInfoCache extends SingleObjectCache<FsInfo> {
-        public FsInfoCache(TimeValue interval, FsInfo initValue) {
-            super(interval, initValue);
+
+        private final FsInfo initialValue;
+
+        public FsInfoCache(TimeValue interval, FsInfo initialValue) {
+            super(interval, initialValue);
+            this.initialValue = initialValue;
         }
 
         @Override
         protected FsInfo refresh() {
-            try {
-                return probe.stats();
-            } catch (IOException ex) {
-                logger.warn("Failed to fetch fs stats - returning empty instance");
-                return new FsInfo(0, null);
-            }
+            return stats(probe, initialValue, logger);
         }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find SingleObjectCache class definition
fd -t f "SingleObjectCache" --exec cat {} \;

Repository: Signal65/elasticsearch-CodeRabbit

Length of output: 6270


🏁 Script executed:

# If not found, search for it by content
rg -t java "class SingleObjectCache" -A 20

Repository: Signal65/elasticsearch-CodeRabbit

Length of output: 4691


🏁 Script executed:

# Also look for getNoRefresh method
rg -t java "getNoRefresh"

Repository: Signal65/elasticsearch-CodeRabbit

Length of output: 59


🏁 Script executed:

# Search for the stats method to understand what the previous parameter does
rg -t java "private.*FsInfo.*stats\(" -A 10 core/src/main/java/org/elasticsearch/monitor/fs/FsService.java

Repository: Signal65/elasticsearch-CodeRabbit

Length of output: 443


🏁 Script executed:

# Find FsProbe class and its stats method
rg -t java "class FsProbe" -A 50 | head -100

Repository: Signal65/elasticsearch-CodeRabbit

Length of output: 11112


🏁 Script executed:

# Search for probe.stats method implementation
rg -t java "public.*stats\(" core/src/main/java/org/elasticsearch/monitor/fs/ -A 10

Repository: Signal65/elasticsearch-CodeRabbit

Length of output: 2317


🏁 Script executed:

# Look for FsInfo class and how deltas are calculated
rg -t java "class FsInfo" -A 30 | head -80

Repository: Signal65/elasticsearch-CodeRabbit

Length of output: 4429


Bug: refresh() always uses initialValue instead of the current cached value for delta calculations.

The refresh() method passes initialValue (the first sample) as the previous parameter on every refresh, causing IO stats deltas to be computed against the initial sample rather than the previous sample. This breaks continuous delta tracking.

To fix this, FsInfoCache needs access to the currently cached FsInfo. However, SingleObjectCache does not expose a method to retrieve the cached value without triggering a refresh, and the cached field is private. The fix requires either:

  • Adding a protected method to SingleObjectCache to access the cached value (e.g., protected T getNoRefresh())
  • Storing the previous FsInfo in FsInfoCache itself and updating it during refresh
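
The intent behind both options can be sketched with a small standalone cache that always hands the currently cached value to the probe. This is not SingleObjectCache and does not reproduce its concurrency design; the class name, the explicit clock parameter, and the use of a simple synchronized method are assumptions made to keep the example testable.

```java
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for a single-value cache; not the PR's SingleObjectCache. */
final class PreviousAwareCacheSketch<T> {
    private final long intervalMillis;
    private final UnaryOperator<T> probe; // receives the previous sample, returns the new one
    private T cached;
    private long lastRefreshMillis;

    PreviousAwareCacheSketch(long intervalMillis, T initialValue, long nowMillis, UnaryOperator<T> probe) {
        this.intervalMillis = intervalMillis;
        this.cached = initialValue;
        this.lastRefreshMillis = nowMillis;
        this.probe = probe;
    }

    /** Returns the cached value, refreshing it first when the interval has elapsed. */
    synchronized T getOrRefresh(long nowMillis) {
        if (nowMillis - lastRefreshMillis >= intervalMillis) {
            // pass the current cached value, never the initial one
            cached = probe.apply(cached);
            lastRefreshMillis = nowMillis;
        }
        return cached;
    }
}
```

With a probe that adds one to its input and an initial value of 100, successive refreshes yield 101 and then 102. A cache that kept passing the initial value would return 101 every time, which is the symptom described above.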

Comment on lines +124 to +171
`fs.io_stats.devices` (Linux only)::
Array of disk metrics for each device that is backing an
Elasticsearch data path. These disk metrics are probed periodically
and averages between the last probe and the current probe are
computed.

`fs.io_stats.devices.device_name` (Linux only)::
The Linux device name.

`fs.io_stats.devices.operations` (Linux only)::
The total number of read and write operations for the device
completed since starting Elasticsearch.

`fs.io_stats.devices.read_operations` (Linux only)::
The total number of read operations for the device completed since
starting Elasticsearch.

`fs.io_stats.devices.write_operations` (Linux only)::
The total number of write operations for the device completed since
starting Elasticsearch.

`fs.io_stats.devices.read_kilobytes` (Linux only)::
The total number of kilobytes read for the device since starting
Elasticsearch.

`fs.io_stats.devices.write_kilobytes` (Linux only)::
The total number of kilobytes written for the device since
starting Elasticsearch.

`fs.io_stats.operations` (Linux only)::
The total number of read and write operations across all devices
used by Elasticsearch completed since starting Elasticsearch.

`fs.io_stats.read_operations` (Linux only)::
The total number of read operations for across all devices used by
Elasticsearch completed since starting Elasticsearch.

`fs.io_stats.write_operations` (Linux only)::
The total number of write operations across all devices used by
Elasticsearch completed since starting Elasticsearch.

`fs.io_stats.read_kilobytes` (Linux only)::
The total number of kilobytes read across all devices used by
Elasticsearch since starting Elasticsearch.

`fs.io_stats.write_kilobytes` (Linux only)::
The total number of kilobytes written across all devices used by
Elasticsearch since starting Elasticsearch.
⚠️ Potential issue | 🟡 Minor

Clarify io_stats semantics (delta vs cumulative/average).

The text currently mentions “averages between probes” and “since starting Elasticsearch,” but the implementation computes deltas between snapshots. Updating the wording avoids misleading users.

✏️ Suggested doc edits
-    Elasticsearch data path. These disk metrics are probed periodically
-    and averages between the last probe and the current probe are
-    computed.
+    Elasticsearch data path. These disk metrics are probed periodically
+    and deltas between the last probe and the current probe are
+    reported.

-    The total number of read and write operations for the device
-    completed since starting Elasticsearch.
+    The total number of read and write operations for the device
+    completed since the previous probe.

-    The total number of read operations for the device completed since
-    starting Elasticsearch.
+    The total number of read operations for the device completed since
+    the previous probe.

-    The total number of write operations for the device completed since
-    starting Elasticsearch.
+    The total number of write operations for the device completed since
+    the previous probe.

-    The total number of kilobytes read for the device since starting
-    Elasticsearch.
+    The total number of kilobytes read for the device since the
+    previous probe.

-    The total number of kilobytes written for the device since
-    starting Elasticsearch.
+    The total number of kilobytes written for the device since the
+    previous probe.

-    The total number of read and write operations across all devices
-    used by Elasticsearch completed since starting Elasticsearch.
+    The total number of read and write operations across all devices
+    used by Elasticsearch completed since the previous probe.

-    The total number of read operations for across all devices used by
-    Elasticsearch completed since starting Elasticsearch.
+    The total number of read operations across all devices used by
+    Elasticsearch completed since the previous probe.

-    The total number of write operations across all devices used by
-    Elasticsearch completed since starting Elasticsearch.
+    The total number of write operations across all devices used by
+    Elasticsearch completed since the previous probe.

-    The total number of kilobytes read across all devices used by
-    Elasticsearch since starting Elasticsearch.
+    The total number of kilobytes read across all devices used by
+    Elasticsearch since the previous probe.

-    The total number of kilobytes written across all devices used by
-    Elasticsearch since starting Elasticsearch.
+    The total number of kilobytes written across all devices used by
+    Elasticsearch since the previous probe.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
`fs.io_stats.devices` (Linux only)::
    Array of disk metrics for each device that is backing an
    Elasticsearch data path. These disk metrics are probed periodically
    and deltas between the last probe and the current probe are
    reported.

`fs.io_stats.devices.device_name` (Linux only)::
    The Linux device name.

`fs.io_stats.devices.operations` (Linux only)::
    The total number of read and write operations for the device
    completed since the previous probe.

`fs.io_stats.devices.read_operations` (Linux only)::
    The total number of read operations for the device completed since
    the previous probe.

`fs.io_stats.devices.write_operations` (Linux only)::
    The total number of write operations for the device completed since
    the previous probe.

`fs.io_stats.devices.read_kilobytes` (Linux only)::
    The total number of kilobytes read for the device since the
    previous probe.

`fs.io_stats.devices.write_kilobytes` (Linux only)::
    The total number of kilobytes written for the device since the
    previous probe.

`fs.io_stats.operations` (Linux only)::
    The total number of read and write operations across all devices
    used by Elasticsearch completed since the previous probe.

`fs.io_stats.read_operations` (Linux only)::
    The total number of read operations across all devices used by
    Elasticsearch completed since the previous probe.

`fs.io_stats.write_operations` (Linux only)::
    The total number of write operations across all devices used by
    Elasticsearch completed since the previous probe.

`fs.io_stats.read_kilobytes` (Linux only)::
    The total number of kilobytes read across all devices used by
    Elasticsearch since the previous probe.

`fs.io_stats.write_kilobytes` (Linux only)::
    The total number of kilobytes written across all devices used by
    Elasticsearch since the previous probe.
