[DRAFT] feat: add RegisterTable overwrite support; preserve correct auth for overwrites #3682

Open
sririshindra wants to merge 1 commit into apache:main from sririshindra:main-issues-2896

@sririshindra sririshindra commented Feb 6, 2026

Fixes #2896

Add support for the new overwrite boolean on RegisterTableRequest (default: false) so clients can register a metadata location that replaces an existing table pointer.

Implement register-table overwrite semantics:

  • If overwrite=false (default): preserve existing behavior. Attempting to register a table that already exists returns a conflict/error.
  • If overwrite=true:
    • If the table does not exist: create it (normal register).
    • If the table exists: update the table's metadata-location to the provided value (do not throw AlreadyExists).

Ensure safety/backward-compatibility:

  • overwrite defaults to false so existing clients are unaffected.
  • Validate provided metadata location where possible before committing to avoid corrupting catalog state.
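
The semantics above can be sketched with a small in-memory model. This is an illustration under stated assumptions, not the code in this PR: the class `RegisterOverwriteSketch`, its map-backed store, the `TableAlreadyExistsException` type, and the blank-location check are all hypothetical stand-ins for the real catalog, its conflict error, and whatever validation the handler performs.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: an in-memory stand-in, not Polaris or Iceberg code.
final class RegisterOverwriteSketch {

  /** Hypothetical exception standing in for the catalog's "already exists" conflict. */
  static final class TableAlreadyExistsException extends RuntimeException {
    TableAlreadyExistsException(String name) {
      super("Table already exists: " + name);
    }
  }

  // Maps a table name to its current metadata location.
  private final Map<String, String> metadataLocations = new HashMap<>();

  /**
   * Registers a table pointer.
   * overwrite=false: fail if the table already exists (the existing behavior).
   * overwrite=true: create the table if absent, otherwise replace its metadata location.
   */
  String register(String name, String metadataLocation, boolean overwrite) {
    // Assumed validation: reject an empty location before touching any state.
    if (metadataLocation == null || metadataLocation.isBlank()) {
      throw new IllegalArgumentException("metadata location must not be empty");
    }
    if (!overwrite && metadataLocations.containsKey(name)) {
      throw new TableAlreadyExistsException(name);
    }
    metadataLocations.put(name, metadataLocation);
    return metadataLocation;
  }

  /** Returns the current metadata location, or null if the table is not registered. */
  String currentLocation(String name) {
    return metadataLocations.get(name);
  }

  public static void main(String[] args) {
    RegisterOverwriteSketch catalog = new RegisterOverwriteSketch();
    catalog.register("ns.t", "s3://bucket/t/v1.metadata.json", false); // new table: created
    try {
      catalog.register("ns.t", "s3://bucket/t/v2.metadata.json", false); // exists, no overwrite
    } catch (TableAlreadyExistsException e) {
      System.out.println("conflict: " + e.getMessage());
    }
    catalog.register("ns.t", "s3://bucket/t/v2.metadata.json", true); // exists, overwrite
    System.out.println(catalog.currentLocation("ns.t"));
  }
}
```

A real implementation would also need the pointer swap to be atomic with respect to concurrent commits, which this single-threaded model does not attempt to show.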

Details / rationale:

The change implements the REST Catalog spec extension that adds an overwrite flag to RegisterTableRequest. This enables clients to atomically point an existing table identifier at a new metadata file (useful for moving or restoring table metadata).

Because overwrite changes the set of operations allowed on an existing table, we had to be careful with authorization and entity resolution:

  • When performing an overwrite against an existing table, the handler enforces the UPDATE_TABLE privilege (mapped to TABLE_WRITE_PROPERTIES) rather than REGISTER_TABLE. This prevents callers who only have create permissions from silently replacing another principal's table pointer.
  • We fixed a subtle distinction where lack of read privileges could be mistaken for a non-existent table. The handler now distinguishes "truly not found" from "exists but unreadable" and applies the correct required privilege accordingly.
  • Resolution manifest and passthrough-path population were adjusted so downstream overwrite logic can locate the namespace and table entries reliably.

Files / components touched (high level):

Tests added/updated:

  • DTO unit tests: overwrite deserialization for RegisterTableRequest (true/false/missing).
  • Integration / behavior tests (integration-tests module):
    • Register new table with overwrite=false → success.
    • Register existing table with overwrite=false → conflict / exception as before.
    • Register existing table with overwrite=true → success and the table's metadata-location is updated atomically.
  • Authorization tests: updated/added unit tests to assert UPDATE_TABLE (TABLE_WRITE_PROPERTIES) is enforced for overwrite against existing tables.
  • Unit tests: added CatalogHandlerUtilsTest to cover handler helper logic.
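
The privilege selection described in the rationale, and asserted by the authorization tests above, might be sketched as follows. Everything here is hypothetical: the `Operation` and `Resolution` enums and the `requiredOperation` method are illustrative names, not the Polaris authorizer API. The sketch also assumes that a non-overwrite request against an existing table is still authorized as REGISTER_TABLE and then fails later with the usual conflict.

```java
// Illustrative sketch only; the enums and method are hypothetical, not the Polaris API.
final class RegisterPrivilegeSketch {

  /** The two operations named in the description. */
  enum Operation { REGISTER_TABLE, UPDATE_TABLE }

  /** Outcome of resolving the target identifier before authorization. */
  enum Resolution { NOT_FOUND, EXISTS_READABLE, EXISTS_UNREADABLE }

  /**
   * Picks the operation to authorize. A table the caller cannot read still
   * counts as existing, so an overwrite against it requires UPDATE_TABLE
   * instead of being treated as a fresh registration.
   */
  static Operation requiredOperation(boolean overwrite, Resolution resolution) {
    boolean exists = resolution != Resolution.NOT_FOUND;
    if (overwrite && exists) {
      return Operation.UPDATE_TABLE;
    }
    // Assumption: every other case is authorized as a plain register.
    return Operation.REGISTER_TABLE;
  }

  public static void main(String[] args) {
    for (Resolution resolution : Resolution.values()) {
      System.out.println(
          "overwrite=true, " + resolution + " -> " + requiredOperation(true, resolution));
    }
  }
}
```

Treating "exists but unreadable" the same as "exists" is the point of the sketch: collapsing it into "not found" would let a caller holding only create permissions replace a table pointer they cannot see.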

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)


adutra commented Feb 6, 2026

Hi @sririshindra thanks for opening this PR!

Unfortunately I don't think this is the right way to bring support for overwrite in register table requests. In fact, support for this parameter seems to be missing from the Iceberg artifact. So, I went ahead and created a PR to bring support for it in Iceberg: apache/iceberg#15248.

If/when that PR is accepted, then we will be able to adapt Polaris properly to the new parameter.

GrantCatalogRoleRequest.class, new GrantCatalogRoleRequestDeserializer());
module.addDeserializer(AddGrantRequest.class, new AddGrantRequestDeserializer());
module.addDeserializer(RevokeGrantRequest.class, new RevokeGrantRequestDeserializer());
module.addDeserializer(RegisterTableRequest.class, new RegisterTableRequestDeserializer());

This is going to conflict with org.apache.iceberg.rest.RESTSerializers.RegisterTableRequestDeserializer.


TableIdentifier identifier = TableIdentifier.of(namespace, request.name());
Table table = catalog.registerTable(identifier, request.metadataLocation());
// Determine whether the client requested overwrite semantics. For catalogs that

This class was copied from Iceberg. I'm not sure it's a good idea to modify it in a divergent way. I would place the Polaris-specific logic in IcebergCatalogHandler or IcebergCatalog directly.

.setMetadataLocation(metadataFileLocation)
.build();

updateTableLike(identifier, updatedEntity);

Is this switching the table UUID to the new UUID, or keeping the old UUID? From these ML discussions, I think it should be the new UUID:

https://lists.apache.org/thread/b5k7vdng904zr3n3q8wv83y8l30rnd4c
https://lists.apache.org/thread/k3595bttvohb6c3ms36o16gppdfllqmp

identifier);

return catalogHandlerUtils.registerTable(baseCatalog, namespace, request);
if (!(authorizer instanceof PolarisAuthorizerImpl authorizerImpl)) {

This is odd, the exact authorizer being used shouldn't matter. The rest of this method feels too low-level for this class.

* across requests.
*/
public class RegisterTableRequestContext {
private static final ThreadLocal<Boolean> SHOULD_OVERWRITE = ThreadLocal.withInitial(() -> false);

ThreadLocals are not recommended with Quarkus because there is no guarantee that a request will be handled by a single thread for the whole request scope.
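
A minimal, framework-free sketch of the concern: a value set in a `ThreadLocal` on one thread is not visible to work that continues on another thread. The class and method names are hypothetical, and a plain `ExecutorService` stands in for whatever thread hand-off the runtime performs; this does not reproduce Quarkus itself.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch only: plain JDK executors, not Quarkus request handling.
final class ThreadLocalHopSketch {

  private static final ThreadLocal<Boolean> SHOULD_OVERWRITE =
      ThreadLocal.withInitial(() -> false);

  /**
   * Sets the flag on the calling thread, then reads it from a separate worker
   * thread, the way a later stage of a request might run on another thread.
   */
  static boolean flagSeenOnOtherThread() {
    ExecutorService worker = Executors.newSingleThreadExecutor();
    try {
      SHOULD_OVERWRITE.set(true);
      // The worker thread has its own copy, initialized to false.
      Callable<Boolean> read = SHOULD_OVERWRITE::get;
      return worker.submit(read).get();
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      SHOULD_OVERWRITE.remove();
      worker.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("flag seen on other thread: " + flagSeenOnOtherThread()); // prints false
  }
}
```

Passing the flag explicitly, for example as a method argument, avoids depending on which thread runs each stage of the request.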

@sririshindra
Author

Thanks @adutra for your review.
I noticed that this parameter is missing from the Iceberg artifact, so I implemented this PR in such a way that Polaris still supports it even though it is missing from Iceberg. But I now realize that is the wrong approach to take. Thanks for opening the PR in the Iceberg repository adding support for overwrite. Once your PR gets merged, I will update this PR based on your review comments.

I have an additional question. My understanding is that once your PR in the Iceberg repository gets merged and released as part of the next Iceberg update, we would have to update https://github.com/apache/polaris/blob/main/gradle/libs.versions.toml#L23 to pick up that artifact. Correct?
So we can only merge this feature in Polaris after the next Iceberg release and after Polaris picks up the updated artifact. Is my understanding correct?


adutra commented Feb 6, 2026

> I have an additional question. My understanding is that once your PR in the Iceberg repository gets merged and released as part of the next Iceberg update, we would have to update https://github.com/apache/polaris/blob/main/gradle/libs.versions.toml#L23 to pick up that artifact. Correct?

You are correct in saying that this feature will have to wait for Iceberg 1.11.

But development can happen ahead of schedule! I just created the feature/iceberg-1.11 branch.

Once the overwrite parameter is merged in Iceberg you would be able to rebase your current work on top of that branch and open a PR against it.

Would that work for you?

@sririshindra
Author

> > I have an additional question. My understanding is that once your PR in the Iceberg repository gets merged and released as part of the next Iceberg update, we would have to update https://github.com/apache/polaris/blob/main/gradle/libs.versions.toml#L23 to pick up that artifact. Correct?
>
> You are correct in saying that this feature will have to wait for Iceberg 1.11.
>
> But development can happen ahead of schedule! I just created the feature/iceberg-1.11 branch.
>
> Once the overwrite parameter is merged in Iceberg you would be able to rebase your current work on top of that branch and open a PR against it.
>
> Would that work for you?

Yes, I will update my PR in the meantime. Thank you!


Development

Successfully merging this pull request may close these issues.

Support overwrite option in RegisterTable
