
relational-jdbc: add DB-agnostic idempotency store + model #3584

Open
huaxingao wants to merge 8 commits into apache:main from huaxingao:idempotency_1_followup
Conversation

Contributor

@huaxingao huaxingao commented Jan 27, 2026

#3205 followup:

  • Introduce JDBC model for idempotency records via ModelIdempotencyRecord
  • Add database-agnostic implementation RelationalJdbcIdempotencyStore implementing IdempotencyStore.
  • Standardize core domain type location by moving IdempotencyRecord to polaris-core under org.apache.polaris.core.entity and updating IdempotencyStore imports accordingly.
  • Update tests by replacing PostgresIdempotencyStoreIT with RelationalJdbcIdempotencyStorePostgresIT

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

@huaxingao
Contributor Author

cc @singhpk234 @flyrain Could you please take a look when you have a moment?

QueryGenerator.PreparedQuery update =
    new QueryGenerator.PreparedQuery(
        sql,
        List.of(Timestamp.from(now), Timestamp.from(now), realmId, idempotencyKey, executorId));
Contributor

Instead of using Timestamp.now(), can we use the injected Clock?

Contributor Author

We’re not using Timestamp.now() here: the store takes Instant now as a parameter and only converts it via Timestamp.from(now).

if (datasourceOperations.isConstraintViolation(e)) {
  return new ReserveResult(ReserveResultType.DUPLICATE, load(realmId, idempotencyKey));
}
throw new RuntimeException("Failed to reserve idempotency key", e);
Contributor

Can we have a defined exception? What error code do we want to return for the request in this case?

Contributor Author

Added IdempotencyPersistenceException
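
For readers following this thread, a minimal sketch of the error handling under discussion: a constraint violation is reported as a duplicate, and any other SQL failure is wrapped in a dedicated unchecked exception rather than a bare RuntimeException. The exception's shape, the Outcome enum (including OWNED), the Insert callback, and the SQLSTATE-based check are all assumptions made for illustration; the classes actually added in the PR may differ.

```java
import java.sql.SQLException;

final class ReserveErrorHandlingSketch {
  /** Hypothetical shape; the IdempotencyPersistenceException added in the PR may differ. */
  static final class IdempotencyPersistenceException extends RuntimeException {
    IdempotencyPersistenceException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Hypothetical outcome type; only DUPLICATE appears in the quoted code. */
  enum Outcome {
    OWNED,
    DUPLICATE
  }

  /** Hypothetical callback standing in for the INSERT executed by the store. */
  @FunctionalInterface
  interface Insert {
    void run() throws SQLException;
  }

  // Assumption: SQLSTATE class "23" (integrity constraint violation) signals a duplicate key.
  static boolean isConstraintViolation(SQLException e) {
    String state = e.getSQLState();
    return state != null && state.startsWith("23");
  }

  static Outcome reserve(Insert insert) {
    try {
      insert.run();
      return Outcome.OWNED;
    } catch (SQLException e) {
      if (isConstraintViolation(e)) {
        // Another request already holds this key.
        return Outcome.DUPLICATE;
      }
      // Anything else is a persistence failure the caller can catch by type.
      throw new IdempotencyPersistenceException("Failed to reserve idempotency key", e);
    }
  }
}
```

With a typed exception, the request layer can map persistence failures to a specific HTTP status in one place, which is the second half of the question above.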

* Normalized identifier of the resource affected by the operation.
*
* <p>This should be derived from the request (for example, a canonicalized path like {@code
* "tables/ns.tbl"}), not from a generated internal entity id. This ensures the binding is
Contributor

How do we protect against drop + create and renames?

Contributor

Do we define the canonicalized path somewhere? Should we avoid duplicate canonicalized paths? A path like tables/ns.tbl is very likely to cause conflicts, as different catalogs can have the same namespace and table names.

- new PostgreSQLContainer<>("postgres:17.5-alpine");
+ new PostgreSQLContainer<>(
+     containerSpecHelper("postgres", PostgresRelationalJdbcLifeCycleManagement.class)
+         .dockerImageName(null)
Contributor

Why null? Can we use the latest tag?

Contributor Author

null is intentional here: containerSpecHelper(...).dockerImageName(null) means “use the pinned default from Dockerfile-postgres-version and allow overrides via system props/env.”

Comment on lines 58 to 60
public static String fullyQualifiedTableName(@Nonnull String tableName) {
  return getFullyQualifiedTableName(tableName);
}
Contributor

@singhpk234 singhpk234 Feb 1, 2026

I am confused: why create a new public function? Can't we just make the private function itself public?

Contributor Author

Right, we don't need this. Removed.

Comment on lines 106 to 115
ModelIdempotencyRecord CONVERTER =
    ImmutableModelIdempotencyRecord.builder()
        .realmId("")
        .idempotencyKey("")
        .operationType("")
        .resourceId("")
        .createdAt(Instant.EPOCH)
        .updatedAt(Instant.EPOCH)
        .expiresAt(Instant.EPOCH)
        .build();
Contributor

I didn't fully get the rationale for this.
Orthogonally, if it's absolutely required, let's make it static?

Contributor Author

I’ve refactored this to avoid the dummy instance by introducing a static fromRow(ResultSet) and a stateless CONVERTER singleton that delegates to it.
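
The refactor described here might look roughly like the sketch below: a static factory reads the current row, and a stateless singleton converter simply delegates to it, so no placeholder instance with empty strings and epoch timestamps is needed. RowConverter and the two-column IdempotencyRow are hypothetical stand-ins for illustration, not the module's actual Converter interface or ModelIdempotencyRecord.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

final class ConverterSketch {
  /** Hypothetical converter contract, standing in for the module's own interface. */
  @FunctionalInterface
  interface RowConverter<T> {
    T fromResultSet(ResultSet rs) throws SQLException;
  }

  /** Trimmed-down model carrying only two of the columns discussed in this PR. */
  record IdempotencyRow(String realmId, String idempotencyKey) {
    // Static factory: builds a value from the current row without any existing instance.
    static IdempotencyRow fromRow(ResultSet rs) throws SQLException {
      return new IdempotencyRow(rs.getString("realm_id"), rs.getString("idempotency_key"));
    }

    // Stateless singleton; it holds no data and only delegates to fromRow.
    static final RowConverter<IdempotencyRow> CONVERTER = IdempotencyRow::fromRow;
  }
}
```

Because the converter carries no field values of its own, sharing one instance across threads and queries is safe.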

import org.apache.polaris.persistence.relational.jdbc.models.ImmutableModelIdempotencyRecord;
import org.apache.polaris.persistence.relational.jdbc.models.ModelIdempotencyRecord;

public class RelationalJdbcIdempotencyStore implements IdempotencyStore {
Contributor

How is this object created? How is core going to use it?

Contributor Author

RelationalJdbcIdempotencyStore is just the relational-jdbc backend for the IdempotencyStore interface. It’s constructed with a DataSource + RelationalJdbcConfiguration and then the request/idempotency layer calls reserve/load/updateHeartbeat/finalize/purgeExpired through the IdempotencyStore SPI.

Comment on lines 44 to 59
String REALM_ID = "realm_id";
String IDEMPOTENCY_KEY = "idempotency_key";
String OPERATION_TYPE = "operation_type";
String RESOURCE_ID = "resource_id";

String HTTP_STATUS = "http_status";
String ERROR_SUBTYPE = "error_subtype";
String RESPONSE_SUMMARY = "response_summary";
String RESPONSE_HEADERS = "response_headers";
String FINALIZED_AT = "finalized_at";

String CREATED_AT = "created_at";
String UPDATED_AT = "updated_at";
String HEARTBEAT_AT = "heartbeat_at";
String EXECUTOR_ID = "executor_id";
String EXPIRES_AT = "expires_at";
Contributor

We should add code comments for each of the fields, like we do in the rest of the models.

Contributor Author

Added.

Comment on lines 122 to 174
  Optional<IdempotencyRecord> existing = load(realmId, idempotencyKey);
  if (existing.isEmpty()) {
    return HeartbeatResult.NOT_FOUND;
  }

  IdempotencyRecord record = existing.get();
  if (record.getHttpStatus() != null) {
    return HeartbeatResult.FINALIZED;
  }
  if (record.getExecutorId() == null || !record.getExecutorId().equals(executorId)) {
    return HeartbeatResult.LOST_OWNERSHIP;
  }

  QueryGenerator.PreparedQuery update =
      QueryGenerator.generateUpdateQuery(
          ModelIdempotencyRecord.SELECT_COLUMNS,
          ModelIdempotencyRecord.TABLE_NAME,
          Map.of(
              ModelIdempotencyRecord.HEARTBEAT_AT,
              Timestamp.from(now),
              ModelIdempotencyRecord.UPDATED_AT,
              Timestamp.from(now)),
          Map.of(
              ModelIdempotencyRecord.REALM_ID,
              realmId,
              ModelIdempotencyRecord.IDEMPOTENCY_KEY,
              idempotencyKey,
              ModelIdempotencyRecord.EXECUTOR_ID,
              executorId),
          Map.of(),
          Map.of(),
          Set.of(ModelIdempotencyRecord.HTTP_STATUS),
          Set.of());

  try {
    int updated = datasourceOperations.executeUpdate(update);
    if (updated > 0) {
      return HeartbeatResult.UPDATED;
    }
  } catch (SQLException e) {
    throw new RuntimeException("Failed to update idempotency heartbeat", e);
  }

  // Raced with finalize/ownership loss; re-check to return a meaningful result.
  Optional<IdempotencyRecord> after = load(realmId, idempotencyKey);
  if (after.isEmpty()) {
    return HeartbeatResult.NOT_FOUND;
  }
  if (after.get().getHttpStatus() != null) {
    return HeartbeatResult.FINALIZED;
  }
  return HeartbeatResult.LOST_OWNERSHIP;
}
Contributor

Wouldn't this need to be in a transaction?

Contributor Author

The actual heartbeat change is done via a single atomic UPDATE ... WHERE realm/key/executor AND http_status IS NULL, so it doesn’t require an explicit transaction for correctness.
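
To illustrate why a single conditional statement is enough, here is an in-memory analogue, not the JDBC code: the ownership and not-finalized checks are evaluated in the same atomic step as the write, just as the WHERE predicate is evaluated together with the UPDATE. The Row type, the string key, and the use of ConcurrentHashMap are assumptions for this sketch only.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

final class HeartbeatSketch {
  /** Hypothetical row; httpStatus stays null until the record is finalized. */
  record Row(String executorId, Integer httpStatus, Instant heartbeatAt) {}

  private final Map<String, Row> rows = new ConcurrentHashMap<>();

  void insert(String key, Row row) {
    rows.put(key, row);
  }

  /**
   * Analogue of: UPDATE ... SET heartbeat_at = ? WHERE key = ? AND executor_id = ? AND
   * http_status IS NULL. Returns true when the row matched, like an update count of 1.
   */
  boolean heartbeat(String key, String executorId, Instant now) {
    AtomicBoolean updated = new AtomicBoolean(false);
    // computeIfPresent runs the predicate and the write atomically for this key.
    rows.computeIfPresent(
        key,
        (k, row) -> {
          if (row.httpStatus() == null && executorId.equals(row.executorId())) {
            updated.set(true);
            return new Row(row.executorId(), null, now);
          }
          return row; // predicate failed: leave the row untouched
        });
    return updated.get();
  }
}
```

The surrounding load calls in the quoted method only pick a descriptive result (NOT_FOUND, FINALIZED, LOST_OWNERSHIP); they do not guard the write itself.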

containerSpecHelper("postgres", PostgresRelationalJdbcLifeCycleManagement.class)
.dockerImageName(null)
.asCompatibleSubstituteFor("postgres"))
.withDatabaseName("polaris_db")
Contributor

We set the following in setup too; why do we need it here again?

Contributor Author

Removed

@singhpk234
Contributor

This was my final set of comments :), I can't think of any more. Thanks @huaxingao for working on it and driving the whole idempotency effort!

Contributor

@flyrain flyrain left a comment

Thanks @huaxingao for the PR. LGTM overall. Left some comments.

* <p>This follows the same pattern as {@link ModelEvent}, separating the storage representation
* from the core domain model while still providing {@link Converter} helpers.
*/
@PolarisImmutable
Contributor

Do we need it to be PolarisImmutable? We didn't do that for ModelEntity, and a few others.

Contributor Author

I removed @PolarisImmutable as suggested. That required a small follow-up refactor in RelationalJdbcIdempotencyStore.reserve(): we no longer build an ImmutableModelIdempotencyRecord just to get the insert bindings; instead we construct the insert map/values directly (same columns/values, still uses QueryGenerator.generateInsertQuery). This keeps behavior the same while avoiding the Immutables-generated type.

}

/** Returns the detected database type for this datasource. */
public DatabaseType databaseType() {
Contributor

Do we need this new method, given that we already have getDatabaseType() (line 82)? We could change its scope to public if needed.

Contributor Author

Removed. Thanks!


// Logical tenant / realm identifier.
String REALM_ID = "realm_id";
// Client-provided idempotency key.
Contributor

Nit: We have so many nice comments here, while there are none on the class IdempotencyRecord. I'd love to have these comments on IdempotencyRecord as well.

Contributor Author

Added more comments. Thanks!

Comment on lines 199 to 200
* <p>Callers should prefer passing an ordered map (e.g. {@link java.util.LinkedHashMap}) for the
* set clause so generated SQL and parameter order are stable.
Contributor

Nit: I feel it's a bit easier to understand with "consistent" or "matched". Just my personal preference; feel free to ignore it.

Suggested change
- * <p>Callers should prefer passing an ordered map (e.g. {@link java.util.LinkedHashMap}) for the
- * set clause so generated SQL and parameter order are stable.
+ * <p>Callers should prefer passing an ordered map (e.g. {@link java.util.LinkedHashMap}) for the
+ * set clause so generated SQL and parameter order are consistent.

Contributor Author

Fixed. Thanks for the suggestion!
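
As background on the Javadoc being edited: when both the placeholders and the bound parameters are derived from one map, the map's iteration order decides the SQL text and the binding order. A small sketch follows, using a hypothetical buildUpdate helper and a made-up table name rather than the real QueryGenerator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class SetClauseSketch {
  /** SQL text plus parameters in the order they must be bound. */
  record Prepared(String sql, List<Object> parameters) {}

  // Hypothetical helper, not QueryGenerator: placeholders and parameters both follow the
  // iteration order of the map that the caller passes in.
  static Prepared buildUpdate(String table, Map<String, Object> setClause) {
    String assignments =
        setClause.keySet().stream().map(c -> c + " = ?").collect(Collectors.joining(", "));
    return new Prepared(
        "UPDATE " + table + " SET " + assignments, new ArrayList<>(setClause.values()));
  }
}
```

With a LinkedHashMap filled in the order heartbeat_at, updated_at, the generated text is always `UPDATE idempotency_records SET heartbeat_at = ?, updated_at = ?` (idempotency_records is a placeholder name). With Map.of or a HashMap the column order is unspecified, so the statement text can differ between runs even though columns and values still line up.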

Comment on lines 284 to 288
validateColumns(columns, whereEquals.keySet());
validateColumns(columns, whereGreater.keySet());
validateColumns(columns, whereLess.keySet());
validateColumns(columns, whereIsNull);
validateColumns(columns, whereIsNotNull);
Contributor

Do we need to validate them here, since generateWhereClauseExtended() will validate them anyway?

Contributor Author

You are right. Removed.

Comment on lines 226 to 231
validateColumns(columns, setClause.keySet());
validateColumns(columns, whereEquals.keySet());
validateColumns(columns, whereGreater.keySet());
validateColumns(columns, whereLess.keySet());
validateColumns(columns, whereIsNull);
validateColumns(columns, whereIsNotNull);
Contributor

Same here

Contributor Author

Removed

Comment on lines 340 to 341
// Preserve the original behavior of rejecting unknown columns. This is used by SELECT query
// generation too, not only by callers of the extended UPDATE/DELETE helpers.
Contributor

Nit: I guess this comment is more like a dev log. Do we need it as a code comment?

Contributor Author

Removed.

    @Nonnull Set<String> tableColumns, @Nonnull Set<String> columns) {
  for (String column : columns) {
    if (!tableColumns.contains(column) && !column.equals("realm_id")) {
      throw new IllegalArgumentException("Invalid query column: " + column);
Contributor

realm_id is treated as a special implicit column for some models, but it is explicitly included in SELECT_COLUMNS for ModelIdempotencyRecord. It might be cleaner to follow the other models, which keep realm_id out by using only ALL_COLUMNS instead of having SELECT_COLUMNS. WDYT?

Contributor Author

Makes sense. Fixed.
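
A short sketch of why dropping realm_id from the model's column list still works with the quoted check: the validator accepts realm_id even when the column set omits it. The ALL_COLUMNS contents below are a hypothetical subset chosen for illustration, not the list from ModelIdempotencyRecord.

```java
import java.util.List;
import java.util.Set;

final class ColumnValidationSketch {
  // Hypothetical subset of model columns; realm_id is deliberately left out.
  static final List<String> ALL_COLUMNS =
      List.of("idempotency_key", "operation_type", "resource_id", "http_status");

  // Same rule as the quoted check: unknown columns are rejected, realm_id is always allowed.
  static void validateColumns(Set<String> tableColumns, Set<String> columns) {
    for (String column : columns) {
      if (!tableColumns.contains(column) && !column.equals("realm_id")) {
        throw new IllegalArgumentException("Invalid query column: " + column);
      }
    }
  }
}
```

A WHERE clause on realm_id and idempotency_key therefore validates against ALL_COLUMNS as written, while a misspelled column name is still rejected.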

Contributor

@singhpk234 singhpk234 left a comment

+1 LGTM overall

3 participants