(feat: persistence) Add schema-metrics-v1.sql for metrics tables (#3337)#3523
(feat: persistence) Add schema-metrics-v1.sql for metrics tables (#3337)#3523obelix74 wants to merge 7 commits intoapache:mainfrom
Conversation
persistence/relational-jdbc/src/main/resources/postgres/schema-v4.sql
Outdated
Show resolved
Hide resolved
persistence/relational-jdbc/src/main/resources/postgres/schema-v4.sql
Outdated
Show resolved
Hide resolved
|
@obelix74 @singhpk234 : WDYT about starting an RFC doc + |
|
@dimas-b there is a dev thread already please ref : https://lists.apache.org/thread/c83jnkvlwc2k3swm65cmvl4t0mt7p799 |
I am trying to solve two sets of asks from my product folks with this.
From the metrics perspective, today, with Track table scan operations:
For commit report queries:
Also many operational dashboards, and filtering by user, realm, engine name, version etc. I have not thought about roles in this flow at all, perhaps it will be useful. @singhpk234 recommended adding roles and I added them. I normalized the roles tables from a RDBMS perspective, but I didn't realize there are other similar fields stored as JSON already. |
persistence/relational-jdbc/src/main/resources/h2/schema-v4.sql
Outdated
Show resolved
Hide resolved
|
@obelix74 : please rebase to fix CI |
|
Let's hold final review until #3616 is resolved... Intermediate comments are welcome, of course :) |
5238123 to
472f6f3
Compare
…tion - Create schema-metrics-v1.sql for H2 and PostgreSQL with independent version tracking - Add --include-metrics CLI option to BootstrapCommand - Add openMetricsSchemaResource() to DatabaseType for loading metrics schema - Update JdbcMetaStoreManagerFactory to optionally load metrics schema during bootstrap This allows metrics schema to evolve independently from the entity schema.
7771396 to
65e8cb4
Compare
persistence/relational-jdbc/src/main/resources/postgres/schema-metrics-v1.sql
Outdated
Show resolved
Hide resolved
...rc/main/java/org/apache/polaris/persistence/relational/jdbc/JdbcMetaStoreManagerFactory.java
Show resolved
Hide resolved
dimas-b
left a comment
There was a problem hiding this comment.
LGTM. Let's give this PR a couple more days in review for other people to comment if they want.
persistence/relational-jdbc/src/main/resources/postgres/schema-v4.sql
Outdated
Show resolved
Hide resolved
...rc/main/java/org/apache/polaris/persistence/relational/jdbc/JdbcMetaStoreManagerFactory.java
Show resolved
Hide resolved
Replace junction tables (scan_metrics_report_roles, commit_metrics_report_roles) with a denormalized principal_role_ids JSON array column in both scan_metrics_report and commit_metrics_report tables. This simplifies the schema and reduces the number of tables from 5 to 3.
...lational-jdbc/src/main/java/org/apache/polaris/persistence/relational/jdbc/DatabaseType.java
Outdated
Show resolved
Hide resolved
persistence/relational-jdbc/src/main/resources/postgres/schema-metrics-v1.sql
Outdated
Show resolved
Hide resolved
persistence/relational-jdbc/src/main/resources/h2/schema-metrics-v1.sql
Outdated
Show resolved
Hide resolved
runtime/admin/src/main/java/org/apache/polaris/admintool/BootstrapCommand.java
Show resolved
Hide resolved
- Added unit tests for JdbcBootstrapUtils.shouldIncludeMetrics() method: - Test when schemaOptions is null (returns false) - Test when includeMetrics is true (returns true) - Test when includeMetrics is false (returns false) - Added integration tests in RelationalJdbcBootstrapCommandTest: - testBootstrapWithIncludeMetrics: verifies --include-metrics flag works - testBootstrapWithoutIncludeMetrics: verifies default behavior Per PR review comment from singhpk234
The schema-v4 version number fixes have been extracted to a separate branch (fix-schema-v4-version-number) for an independent PR. This reverts the version changes back to match main branch.
Per PR review feedback from dimas-b: Polaris supports external IdP and PDP (e.g. Keycloak and OPA), and the roles stored in metrics tables may not be aligned with AuthZ decisions. Removed principal_role_ids column from both scan_metrics_report and commit_metrics_report tables in H2 and PostgreSQL schemas.
Address PR review comment - remove namespace since tableId uniquely identifies the table.
| report_id TEXT NOT NULL, | ||
| realm_id TEXT NOT NULL, | ||
| catalog_id BIGINT NOT NULL, | ||
| namespace TEXT NOT NULL, |
There was a problem hiding this comment.
• Removed namespace TEXT NOT NULL column from both H2 and Postgres schema files
dimas-b
left a comment
There was a problem hiding this comment.
LGTM 👍
@singhpk234 : WDYT?
Add new schema version 4 with tables for storing scan and commit metrics reports as first-class entities.
New tables:
Key design decisions:
Checklist
CHANGELOG.md(if needed)site/content/in-dev/unreleased(if needed)