feat(metrics): Evolve PolarisMetricsReporter interface with timestamp parameter and comprehensive documentation #3468

Merged
dimas-b merged 16 commits into apache:main from obelix74:feat/enable-metrics-event-emission
Jan 23, 2026

@obelix74
Contributor

@obelix74 obelix74 commented Jan 17, 2026

This PR enhances the PolarisMetricsReporter SPI interface by adding a timestamp parameter to the reportMetric() method, enabling accurate time-series metrics reporting to external systems.

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

This commit adds the core infrastructure for emitting metrics as events
when reportMetrics() is called on the Iceberg REST catalog API.

Changes:
- Add REPORT_METRICS_REQUEST attribute to EventAttributes.java
- Add BEFORE_REPORT_METRICS and AFTER_REPORT_METRICS to PolarisEventType.java
- Update reportMetrics() in IcebergRestCatalogEventServiceDelegator.java
  to emit BEFORE/AFTER events with catalog name, namespace, table, and request
- Add ReportMetricsEventTest.java with unit tests verifying event emission

This enables event listeners to receive metrics report events, allowing for
use cases like audit logging and metrics persistence.

Added tests and a Feature flag
@obelix74
Contributor Author

@dimas-b Created this PR for the event emission and added a feature flag for backwards compatibility.

@adutra
Contributor

adutra commented Jan 19, 2026

> @dimas-b Created this PR for the event emission and added a feature flag for backwards compatibility.

I am not sure backwards compatibility is important here as currently no listener bundled with Polaris actually processes the metrics-related events.

Anand Kumar Sankaran added 3 commits January 19, 2026 09:31
This commit removes the event-based metrics reporting system and introduces
a new MetricsProcessor interface with CDI support. This is the foundation
for a simpler, more direct metrics processing architecture.

Changes:
- Remove ENABLE_METRICS_EVENT_EMISSION feature flag
- Remove BEFORE_REPORT_METRICS and AFTER_REPORT_METRICS event types
- Remove REPORT_METRICS_REQUEST event attribute
- Remove event emission from IcebergRestCatalogEventServiceDelegator.reportMetrics()
- Remove ReportMetricsEventTest

- Add MetricsProcessor interface for processing metrics reports
- Add MetricsProcessingContext with rich contextual information
  (realm ID, principal, request ID, OpenTelemetry trace context)
- Add MetricsProcessorConfiguration for type-safe configuration
- Add CDI producer in ServiceProducers for MetricsProcessor

The new MetricsProcessor interface provides:
- Simpler, more direct processing (no events)
- Rich context with realm, principal, request ID, OTel trace
- CDI-based extensibility via @Identifier annotations
- Type-safe configuration

Implementations will be added in subsequent PRs. This commit provides
the foundational interfaces and CDI infrastructure.

Backward compatibility: The existing PolarisMetricsReporter interface
and configuration remain unchanged and functional.
- Remove ENABLE_METRICS_EVENT_EMISSION feature flag entry
- Add polaris.metrics.processor.type configuration property
@obelix74
Contributor Author

obelix74 commented Jan 19, 2026

@adutra and @dimas-b thanks for all the feedback. Indeed, @singhpk234 also did not like the event-based approach. I have made the following changes to address your feedback.

  • Introduces a new MetricsProcessor interface with CDI support. This establishes the foundation for a simpler, more direct metrics processing architecture. I have intentionally left out PolarisMetricsReporter.

  • MetricsProcessingContext - Rich context object providing:
    • Catalog and table information
    • Realm ID and principal name
    • Request ID for correlation
    • OpenTelemetry trace ID and span ID
    • Timestamp

  • MetricsProcessorConfiguration - Type-safe configuration using Quarkus @ConfigMapping

  • CDI producer in ServiceProducers for MetricsProcessor selection via @Identifier annotations

The existing PolarisMetricsReporter interface and polaris.iceberg-metrics.reporting.type configuration remain unchanged and functional.

What's Not Included

This PR intentionally excludes implementations to keep the scope focused:
• No processor implementations (noop, logging, persistence)
• No integration with IcebergCatalogAdapter
• No backward compatibility adapters

These will be added in subsequent PRs.

Configuration

   polaris:
     metrics:
       processor:
         type: noop  # Default - implementations will be added in follow-up PRs

Next Steps

Follow-up PRs will add:

  1. Built-in processor implementations (noop, logging, persistence)
  2. Integration with the REST catalog adapter
  3. Backward compatibility layer for existing PolarisMetricsReporter users

I am open to removing PolarisMetricsReporter, but I need to figure out how to migrate existing code and users. In fact, I just upgraded my Polaris installation to 1.3.0 to use the metrics logging output while we sort these out.

@obelix74 obelix74 changed the title from "feat(events): Enable metrics event emission in reportMetrics()" to "feat(events): New metrics reporter interface with multiple strategies" Jan 19, 2026
@obelix74 obelix74 requested review from adutra and dimas-b January 19, 2026 17:56
@obelix74
Contributor Author

Perhaps I should keep the polaris.iceberg-metrics.reporting.type property and not introduce a new property. That way we can delete any existing code and not break anyone else. I think I was trying to be too conservative in introducing a new property, but that adds more work. @adutra what do you think?

@obelix74 obelix74 changed the title from "feat(events): New metrics reporter interface with multiple strategies" to "feat(metrics): New metrics reporter interface with multiple strategies" Jan 19, 2026
@obelix74 obelix74 requested a review from dimas-b January 19, 2026 23:34
@obelix74 obelix74 requested a review from dimas-b January 20, 2026 00:00
@obelix74 obelix74 changed the title from "feat(metrics): New metrics reporter interface with multiple strategies" to "feat(metrics): Evolve PolarisMetricsReporter interface with timestamp parameter and comprehensive documentation" Jan 20, 2026
@obelix74 obelix74 requested a review from adutra January 20, 2026 18:58
Contributor

@dimas-b dimas-b left a comment

LGTM with a couple of remaining minor comments. Thanks for bearing with me, @obelix74 !

@adutra : WDYT?

@obelix74 obelix74 requested a review from dimas-b January 20, 2026 20:48
Contributor

@adutra adutra left a comment

LGTM as well, thank you @obelix74 !

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jan 21, 2026
dimas-b previously approved these changes Jan 21, 2026
Contributor

@dimas-b dimas-b left a comment

LGTM 👍 Thanks again, @obelix74 !

Let's give this PR another day in review in case other people have comments.

CC: @cccs-cat001

Contributor

@singhpk234 singhpk234 left a comment

Thank you @obelix74 for the change, it mostly LGTM too!

* @see MetricsReportingConfiguration
*/
public interface PolarisMetricsReporter {
public void reportMetric(String catalogName, TableIdentifier table, MetricsReport metricsReport);

Contributor

We released this public interface in 1.3, right? I wonder if we should keep this API with a default impl that calls the new method with null or something?
361b7e9

Contributor Author

@singhpk234 I had a deprecated backward compatible method and the consensus was to remove it since this SPI is not meant to be used by developers outside Polaris.

Contributor

Didn't fully get it, wdym by developers outside Polaris? Can this not be used by downstream projects? My recommendation is to make sure version upgrades are seamless, so keep this and add more if we want, and just update the default of this to work under a null-timestamp assumption.

Contributor

Let me also check previous discussions, can you please point me to that ?

Contributor Author

Contributor

@dimas-b dimas-b Jan 21, 2026

Per our standing evolution guidelines, public classes / methods can change in any release. Unlike REST API changes, this is not considered a "major" change for versioning purposes. Essentially, Java methods are not part of the API surface in the SemVer sense.

We should and do try to make Java API changes in a backward-compatible manner when practical.

Regarding this particular PR and Java interfaces that are part of an SPI (defined and called by Polaris, implemented by 3rd party plugins), I do not see a practical way to evolve them in a backward-compatible manner without causing excessive maintenance burden in Polaris code.

In this case, Polaris would have to define a new interface for the new method signature and perform instanceof checks at runtime in order to decide whether to call the old or the new method. I do believe it would be overkill at the current stage of the project (still evolving actively). It is not too hard for downstream implementations to adjust code for the newly added parameter.
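
For illustration only, the runtime-dispatch alternative described above might look roughly like the following sketch. It is not code from this PR: `TableId` and `Report` are simplified stand-ins for Iceberg's `TableIdentifier` and `MetricsReport`, and `LegacyReporter`, `TimestampedReporter` and `InstanceofDispatch` are hypothetical names.

```java
import java.time.Instant;

// Simplified stand-ins for Iceberg's TableIdentifier and MetricsReport (hypothetical).
record TableId(String name) {}
record Report(String payload) {}

// Shape of the old method: no timestamp.
interface LegacyReporter {
  void reportMetric(String catalogName, TableId table, Report report);
}

// Hypothetical second interface that would carry the new signature.
interface TimestampedReporter {
  void reportMetric(String catalogName, TableId table, Report report, Instant receivedAt);
}

final class InstanceofDispatch {
  // The caller has to inspect the plugin at runtime to decide which method to invoke.
  static void dispatch(
      Object reporter, String catalogName, TableId table, Report report, Instant now) {
    if (reporter instanceof TimestampedReporter stamped) {
      stamped.reportMetric(catalogName, table, report, now);
    } else if (reporter instanceof LegacyReporter legacy) {
      legacy.reportMetric(catalogName, table, report);
    } else {
      throw new IllegalArgumentException("Unsupported reporter: " + reporter.getClass());
    }
  }
}
```

Every call site would need this branching (or a wrapper around it), which is the maintenance burden the comment refers to.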

That said, I'd like to propose moving the SPI evolution discussion to the dev ML as it is a big and complex topic.

@singhpk234 : please clarify whether you consider this comment thread a blocker for merging (or not).

Contributor

@dimas-b can we not add the API in the same interface and have the default impl of the old API call this with a null clock? I would like to understand more whether this is possible. If it's not possible, I am supportive of the change!

> @singhpk234 : please clarify whether you consider this comment thread a blocker for merging (or not).

Thanks for asking for the clarification. I believe anything that's not marked as 'nit' / 'not blocker' / 'not' is expected to be resolved before merging (per here). I really appreciate you checking in!

Contributor

> can we not add the api in the same interface and add the default impl of old api call this null clock [...]

I do not see a point in adding a default impl for the old method alone. Existing implementations will have overrides for it already (javac will not let them miss that).

We could add a default impl. for the new method and redirect to the old one (without the Instant). This will allow existing implementations to compile without changes. However, this creates uncertainty for the implementor regarding which method should do the real work. Javadoc helps, but adds cognitive load. We could add default to both methods and deprecate the old one, but this will add "cruft" to the interface definition, when, I believe, adapting to the new interface in downstream projects is very easy.

Please note that providing a custom implementation for this interface requires a downstream build. So all upgrades will go through local builds and CI (assuming normal software engineering practices), where the need for adjustments will become apparent.
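
As a minimal sketch of the "default impl. for the new method" option discussed in this comment (the option that was not adopted): `ReporterWithDefault`, `OldStylePlugin`, `TableId` and `Report` are hypothetical names, with the last two standing in for Iceberg's `TableIdentifier` and `MetricsReport`.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for Iceberg's TableIdentifier and MetricsReport (hypothetical).
record TableId(String name) {}
record Report(String payload) {}

// Sketch of the "default method" option; NOT the interface that was merged.
interface ReporterWithDefault {
  // The old method stays abstract, so existing implementations keep compiling.
  void reportMetric(String catalogName, TableId table, Report report);

  // The new method redirects to the old one and drops the timestamp.
  default void reportMetric(
      String catalogName, TableId table, Report report, Instant receivedAt) {
    reportMetric(catalogName, table, report);
  }
}

// A plugin written against the old signature; it compiles without changes.
final class OldStylePlugin implements ReporterWithDefault {
  final List<String> seen = new ArrayList<>();

  @Override
  public void reportMetric(String catalogName, TableId table, Report report) {
    seen.add(catalogName + "/" + table.name());
  }
}

final class DefaultMethodSketch {
  public static void main(String[] args) {
    OldStylePlugin plugin = new OldStylePlugin();
    // The caller passes a timestamp, but the old-style plugin never sees it.
    plugin.reportMetric("cat", new TableId("ns.tbl"), new Report("scan"), Instant.now());
    System.out.println(plugin.seen);
  }
}
```

In this sketch the old plugin keeps compiling, but the timestamp is silently discarded and an implementor has two candidate methods to override, which is the uncertainty mentioned above.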

Contributor Author

When I upgraded to 1.3.0, I was glad that my compilation failed: had the old method merely been deprecated, I would have been confused about why I needed to adopt the new approach and how to do it.

Having said that, this PR started as adding events for metrics, and we removed that. It only contains this extra attribute. It is not a blocker for me. If we don't agree on how to proceed, I would rather abandon this PR and focus on the simplified metrics persistence code here: #3385. That allows Polaris to persist table metrics to the database.

Contributor

That's fair, though in this case we are forcing upgrades to implement this new interface when having the old API was not hurting? Taking the case of LOGGERs, they can be configured to implicitly log the timestamp, and the way we have reporting wired, if I want a custom one I can just take the timestamp internally rather than requesting it through the signature?

   metricsReporter.reportMetric(
        catalogName, tableIdentifier, reportMetricsRequest.report(), clock.instant());

It's not the time when the request hit the server, but the time when reportMetric is called, so does it matter if I do this vs

reportMetric(catalogName, tableIdentifier, reportMetricsRequest.report()) {
  timestamp = clock.instant()
}

With that being said, I think it's fine if we want to move forward :), but the above is my thought process. Please move forward @dimas-b, I trust your judgement here!
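
The two placements compared in this comment can be sketched side by side as below. This is an illustrative assumption rather than Polaris code: plain `String` parameters stand in for the Iceberg types, and `TimestampedReporter`, `SelfTimingReporter` and `ClockPlacementSketch` are hypothetical names. Only `java.time.Clock` and `Instant` are real APIs.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

// Variant A: the caller reads the clock and passes the Instant through the signature.
interface TimestampedReporter {
  void reportMetric(String catalogName, String table, String report, Instant receivedAt);
}

// Variant B: the signature has no timestamp; the implementation reads its own clock.
final class SelfTimingReporter {
  private final Clock clock;
  Instant lastSeen;

  SelfTimingReporter(Clock clock) {
    this.clock = clock;
  }

  void reportMetric(String catalogName, String table, String report) {
    lastSeen = clock.instant();
  }
}

final class ClockPlacementSketch {
  public static void main(String[] args) {
    // A fixed clock makes the comparison deterministic.
    Clock fixed = Clock.fixed(Instant.parse("2026-01-21T12:00:00Z"), ZoneOffset.UTC);

    Instant[] seenByA = new Instant[1];
    TimestampedReporter a = (c, t, r, at) -> seenByA[0] = at;
    a.reportMetric("cat", "ns.tbl", "report", fixed.instant());

    SelfTimingReporter b = new SelfTimingReporter(fixed);
    b.reportMetric("cat", "ns.tbl", "report");

    System.out.println(seenByA[0].equals(b.lastSeen)); // prints "true"
  }
}
```

With a synchronous call and a shared clock, both variants record the same instant; the difference is only in who owns the clock.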

CHANGELOG.md Outdated
- The EclipseLink Persistence implementation has been completely removed.
- The default request ID header name has changed from `Polaris-Request-Id` to `X-Request-ID`.
- The (Before/After)CommitTableEvent has been removed.
- The `PolarisMetricsReporter.reportMetric()` method signature has been extended to include a

Contributor

I believe we need to freeze the changelog for 1.3, since it's released, and then add this in an unreleased section?

Contributor Author

I will move it to the unreleased section.

Contributor Author

@singhpk234 I don't see a 1.3.0 section in the CHANGELOG - I synced my fork to upstream.

https://github.com/apache/polaris/blob/main/CHANGELOG.md

Am I missing something?

Contributor Author

Perhaps we need to rename Unreleased to 1.3.0-incubating in CHANGELOG and move this line alone to Unreleased.

Contributor

No, you are not. Ideally there needs to be an update in the main branch to freeze the 1.3 section and create the unreleased section as soon as a version of Polaris is released; seems like we need to do that first?

Contributor

As far as the CHANGELOG "diff" in this PR is concerned, it looks correct to me.

I'd expect @pingtimeout (as the release manager for 1.3.0) to open a reconciliation PR for main and add a dedicated section for 1.3.0 soon 😉 I'm sure @pingtimeout will be able to deal with conflicts if this PR merges first 🙂

Contributor

@obelix74 : please do not add a 1.3.0 section to CHANGELOG in this PR :)

Contributor Author

@dimas-b Noted.

Contributor

SG !

Contributor

Cf. #3503

Contributor

@cccs-cat001 cccs-cat001 left a comment

Great idea! I appreciate the improvement :)

adutra previously approved these changes Jan 22, 2026
@dimas-b
Contributor

dimas-b commented Jan 22, 2026

It looks like we have consensus in all discussions and some approvals. If no new concerns are raised, I propose to merge on Jan 23.

@obelix74
Contributor Author

Thanks all.

@dimas-b
Contributor

dimas-b commented Jan 23, 2026

@obelix74 : unfortunately, it looks like there's a conflict on CHANGELOG.md now... Could you resolve it?

@obelix74 obelix74 dismissed stale reviews from adutra and dimas-b via bd15dcd January 23, 2026 15:48
@obelix74
Contributor Author

> @obelix74 : unfortunately, it looks like there's a conflict on CHANGELOG.md now... Could you resolve it?

Resolved.

@dimas-b dimas-b merged commit d36e88e into apache:main Jan 23, 2026
15 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Jan 23, 2026
evindj pushed a commit to evindj/polaris that referenced this pull request Jan 26, 2026
… parameter and comprehensive documentation (apache#3468)

Enhance the `PolarisMetricsReporter` SPI interface by adding a timestamp parameter to the `reportMetric()` method, enabling accurate time-series metrics reporting to external systems.