[DRAFT] feat: add RegisterTable overwrite support; preserve correct auth for overwrites#3682
[DRAFT] feat: add RegisterTable overwrite support; preserve correct auth for overwrites#3682sririshindra wants to merge 1 commit intoapache:mainfrom
Conversation
…overwrites Add support for the new overwrite boolean on RegisterTableRequest (default: false) so clients can register a metadata location that replaces an existing table pointer. Implement register-table overwrite semantics: If overwrite=false (default): preserve existing behavior — attempting to register a table that already exists returns a conflict/error. If overwrite=true: If the table does not exist: create it (normal register). If the table exists: update the table's metadata-location to the provided value (do not throw AlreadyExists). Ensure safety/backward-compatibility: overwrite defaults to false so existing clients are unaffected. Validate provided metadata location where possible before committing to avoid corrupting catalog state. Details / rationale: The change implements the REST Catalog spec extension that adds an overwrite flag to RegisterTableRequest. This enables clients to atomically point an existing table identifier at a new metadata file (useful for moving or restoring table metadata). Because overwrite changes the set of operations allowed on an existing table, we had to be careful with authorization and entity resolution: When performing an overwrite against an existing table, the handler enforces the UPDATE_TABLE privilege (mapped to TABLE_WRITE_PROPERTIES) rather than REGISTER_TABLE. This prevents callers who only have create permissions from silently replacing another principal's table pointer. We fixed a subtle distinction where lack of read privileges could be mistaken for a non-existent table. The handler now distinguishes "truly not found" from "exists but unreadable" and applies the correct required privilege accordingly. Resolution manifest and passthrough-path population were adjusted so downstream overwrite logic can locate the namespace and table entries reliably. Files / components touched (high level): Tests added/updated: - DTO unit tests: overwrite deserialization for `RegisterTableRequest` (true/false/missing). - Integration / behavior tests (integration-tests module): - Register new table with `overwrite=false` → success. - Register existing table with `overwrite=false` → conflict / exception as before. - Register existing table with `overwrite=true` → success and the table's metadata-location is updated atomically. - Authorization tests: updated/added unit tests to assert `UPDATE_TABLE` (TABLE_WRITE_PROPERTIES) is enforced for overwrite against existing tables. - Unit tests: added `CatalogHandlerUtilsTest` to cover handler helper logic.
|
Hi @sririshindra thanks for opening this PR! Unfortunately I don't think this is the right way to bring support for overwrite in register table requests. In fact, support for this parameter seems to be missing from the Iceberg artifact. So, I went ahead and created a PR to bring support for it in Iceberg: apache/iceberg#15248. If/when that PR is accepted, then we will be able to adapt Polaris properly to the new parameter. |
| GrantCatalogRoleRequest.class, new GrantCatalogRoleRequestDeserializer()); | ||
| module.addDeserializer(AddGrantRequest.class, new AddGrantRequestDeserializer()); | ||
| module.addDeserializer(RevokeGrantRequest.class, new RevokeGrantRequestDeserializer()); | ||
| module.addDeserializer(RegisterTableRequest.class, new RegisterTableRequestDeserializer()); |
There was a problem hiding this comment.
This is going to conflict with org.apache.iceberg.rest.RESTSerializers.RegisterTableRequestDeserializer.
|
|
||
| TableIdentifier identifier = TableIdentifier.of(namespace, request.name()); | ||
| Table table = catalog.registerTable(identifier, request.metadataLocation()); | ||
| // Determine whether the client requested overwrite semantics. For catalogs that |
There was a problem hiding this comment.
This class was copied from Iceberg. I'm not sure it's a good idea to modify it in a divergent way. I would place the Polaris-specific logic in IcebergCatalogHandler or IcebergCatalog directly.
| .setMetadataLocation(metadataFileLocation) | ||
| .build(); | ||
|
|
||
| updateTableLike(identifier, updatedEntity); |
There was a problem hiding this comment.
Is this switching the table UUID to the new UUID, or keeping the old UUID? From this ML discussions, I think it should be the new UUID:
https://lists.apache.org/thread/b5k7vdng904zr3n3q8wv83y8l30rnd4c
https://lists.apache.org/thread/k3595bttvohb6c3ms36o16gppdfllqmp
| identifier); | ||
|
|
||
| return catalogHandlerUtils.registerTable(baseCatalog, namespace, request); | ||
| if (!(authorizer instanceof PolarisAuthorizerImpl authorizerImpl)) { |
There was a problem hiding this comment.
This is odd, the exact authorizer being used shouldn't matter. The rest of this method feels too low-level for this class.
| * across requests. | ||
| */ | ||
| public class RegisterTableRequestContext { | ||
| private static final ThreadLocal<Boolean> SHOULD_OVERWRITE = ThreadLocal.withInitial(() -> false); |
There was a problem hiding this comment.
ThreadLocals are not recommended with Quarkus because there is no guarantee that a request will be handled by a single thread for the whole request scope.
|
Thanks @adutra for your review. I have an additional question. My understanding is that once your PR in the iceberg repository gets merged and released as part of the next Iceberg update. We would have to update https://github.com/apache/polaris/blob/main/gradle/libs.versions.toml#L23 to pick up that artifact. Correct? |
You are correct in saying that this feature will have to wait for Iceberg 1.11. But development can happen ahead of schedule! I just created the Once the overwrite parameter is merged in Iceberg you would be able to rebase your current work on top of that branch and open a PR against it. Would that work for you? |
Yes, I will update my PR in the meantime. Thank you! |
Fixes #2896
Add support for the new overwrite boolean on RegisterTableRequest (default: false) so clients can register a metadata location that replaces an existing table pointer. Implement register-table overwrite semantics:
If overwrite=false (default): preserve existing behavior — attempting to register a table that already exists returns a conflict/error. If overwrite=true:
If the table does not exist: create it (normal register). If the table exists: update the table's metadata-location to the provided value (do not throw AlreadyExists). Ensure safety/backward-compatibility:
overwrite defaults to false so existing clients are unaffected. Validate provided metadata location where possible before committing to avoid corrupting catalog state.
Details / rationale:
The change implements the REST Catalog spec extension that adds an overwrite flag to RegisterTableRequest. This enables clients to atomically point an existing table identifier at a new metadata file (useful for moving or restoring table metadata). Because overwrite changes the set of operations allowed on an existing table, we had to be careful with authorization and entity resolution: When performing an overwrite against an existing table, the handler enforces the UPDATE_TABLE privilege (mapped to TABLE_WRITE_PROPERTIES) rather than REGISTER_TABLE. This prevents callers who only have create permissions from silently replacing another principal's table pointer. We fixed a subtle distinction where lack of read privileges could be mistaken for a non-existent table. The handler now distinguishes "truly not found" from "exists but unreadable" and applies the correct required privilege accordingly. Resolution manifest and passthrough-path population were adjusted so downstream overwrite logic can locate the namespace and table entries reliably. Files / components touched (high level):
Tests added/updated:
RegisterTableRequest(true/false/missing).overwrite=false→ success.overwrite=false→ conflict / exception as before.overwrite=true→ success and the table's metadata-location is updated atomically.UPDATE_TABLE(TABLE_WRITE_PROPERTIES) is enforced for overwrite against existing tables.CatalogHandlerUtilsTestto cover handler helper logic.Checklist
CHANGELOG.md(if needed)site/content/in-dev/unreleased(if needed)