-
Notifications
You must be signed in to change notification settings - Fork 3k
Core: do not cleanup 503s for CREATE transaction #15051
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Core: do not cleanup 503s for CREATE transaction #15051
Conversation
| .build(); | ||
| requirements = UpdateRequirements.forCreateTable(updates); | ||
| errorHandler = ErrorHandlers.tableErrorHandler(); // throws NoSuchTableException | ||
| errorHandler = ErrorHandlers.tableCommitHandler(); // throws NoSuchTableException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please add some tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on a first glance this seems correct to me but you'll most likely also need to update
diff --git a/core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java b/core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java
index c2fd24856f..057a889bec 100644
--- a/core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java
+++ b/core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java
@@ -2182,8 +2182,10 @@ public abstract class CatalogTests<C extends Catalog & SupportsNamespaces> {
supportsServerSideRetry()
? "Requirement failed: table already exists"
: "Table already exists";
+ Class<? extends RuntimeException> expectedException =
+ supportsServerSideRetry() ? AlreadyExistsException.class : CommitFailedException.class;
assertThatThrownBy(create::commitTransaction)
- .isInstanceOf(AlreadyExistsException.class)
+ .isInstanceOf(expectedException)
.hasMessageStartingWith(expectedMessage);
// validate the concurrently created table is unmodified
@@ -2434,8 +2436,10 @@ public abstract class CatalogTests<C extends Catalog & SupportsNamespaces> {
supportsServerSideRetry()
? "Requirement failed: table already exists"
: "Table already exists";
+ Class<? extends RuntimeException> expectedException =
+ supportsServerSideRetry() ? AlreadyExistsException.class : CommitFailedException.class;
assertThatThrownBy(createOrReplace::commitTransaction)
- .isInstanceOf(AlreadyExistsException.class)
+ .isInstanceOf(expectedException)
.hasMessageStartingWith(expectedMessage);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think about 409?
I was not sure if in this case it was ok to return a CommitFailedException instead of a TableAlreadyExists exception
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think we may just want to add a createTableErrorHandler() which extends the commit error handler, but we override the 409 behavior to be a TableAlreadyExists exception, and then in any other case fallback to super.accept(error).
ca80b55 to
0b83a63
Compare
amogh-jahagirdar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @alessandro-nori , just had some comments on the test, but fundamentally looks great to me. Thank you for fixing this!
| RESTCatalogAdapter adapter = | ||
| Mockito.spy( | ||
| new RESTCatalogAdapter(backendCatalog) { | ||
| @Override | ||
| protected <T extends RESTResponse> T execute( | ||
| HTTPRequest request, | ||
| Class<T> responseType, | ||
| Consumer<ErrorResponse> errorHandler, | ||
| Consumer<Map<String, String>> responseHeaders) { | ||
| if (request.method() == HTTPMethod.POST && request.path().contains("some_table")) { | ||
| // Simulate a 503 Service Unavailable error | ||
| ErrorResponse error = | ||
| ErrorResponse.builder() | ||
| .responseCode(503) | ||
| .withMessage("Service unavailable") | ||
| .build(); | ||
|
|
||
| errorHandler.accept(error); | ||
| throw new IllegalStateException("Error handler should have thrown"); | ||
| } | ||
| return super.execute(request, responseType, errorHandler, responseHeaders); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we should be able to slim this setup down a bit by just doing something like:
RESTCatalogAdapter adapter = Mockito.spy(new RESTCatalogAdapter(backendCatalog));
Mockito.doThrow(new ServiceFailureException("some service failure"))
.when(adapter)
.execute(reqMatcher(HTTPMethod.POST), any(), any(), any());
Think it's OK to mock the commit state unknown here, since all we're trying to test is the client's reaction to it on create? I see the other tests in this class do that as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for your review.
My goal was to test that the new ErrorHandler is correctly mapping 503 errors to CommitStateUnknownException.
I tried debugging what you suggested but the ErrorHandler code is not called in that case. There is already a test to check that not CleanableFailures like ServiceFailureException do not delete files
iceberg/core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java
Lines 2505 to 2544 in 0b83a63
| public void testNoCleanupForNonCleanableCreateTransaction() { | |
| RESTCatalogAdapter adapter = Mockito.spy(new RESTCatalogAdapter(backendCatalog)); | |
| RESTCatalog catalog = catalog(adapter); | |
| if (requiresNamespaceCreate()) { | |
| catalog.createNamespace(TABLE.namespace()); | |
| } | |
| catalog.createTable(TABLE, SCHEMA); | |
| TableIdentifier newTable = TableIdentifier.of(TABLE.namespace(), "some_table"); | |
| Mockito.doThrow(new ServiceFailureException("some service failure")) | |
| .when(adapter) | |
| .execute(reqMatcher(HTTPMethod.POST, RESOURCE_PATHS.table(newTable)), any(), any(), any()); | |
| Transaction createTableTransaction = catalog.newCreateTableTransaction(newTable, SCHEMA); | |
| createTableTransaction.newAppend().appendFile(FILE_A).commit(); | |
| assertThatThrownBy(createTableTransaction::commitTransaction) | |
| .isInstanceOf(ServiceFailureException.class) | |
| .hasMessage("some service failure"); | |
| assertThat(allRequests(adapter)) | |
| .anySatisfy( | |
| req -> { | |
| assertThat(req.method()).isEqualTo(HTTPMethod.POST); | |
| assertThat(req.path()).isEqualTo(RESOURCE_PATHS.table(newTable)); | |
| assertThat(req.body()).isInstanceOf(UpdateTableRequest.class); | |
| UpdateTableRequest body = (UpdateTableRequest) req.body(); | |
| Optional<MetadataUpdate> appendSnapshot = | |
| body.updates().stream() | |
| .filter(update -> update instanceof MetadataUpdate.AddSnapshot) | |
| .findFirst(); | |
| assertThat(appendSnapshot).isPresent(); | |
| MetadataUpdate.AddSnapshot addSnapshot = | |
| (MetadataUpdate.AddSnapshot) appendSnapshot.get(); | |
| String manifestListLocation = addSnapshot.snapshot().manifestListLocation(); | |
| assertThat(catalog.loadTable(TABLE).io().newInputFile(manifestListLocation).exists()) | |
| .isTrue(); | |
| }); | |
| } |
My new test instead fails without the new error handler.
I don't have enough expertise in Java so it's very possible I'm doing something wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah this is my bad, I had to look at the code further. You're right, given how the http client is invoked, to trigger a failure and go through the error handler, we can't just mock execute. The error handler invocation is handled below the execution API. I think this is fine, maybe we want a separate helper method for better simulation of errors that go through the error handlers.
On a separate note, I'm starting to think a bit more about how we look at CleanableFailure. The original intent of the CleanableFailure marker interface was to prevent issues in case arbitrary exceptions happened to be thrown on the commit path. We generally expect Error handlers to map to the commit state unknown, and any other exception we fail to handle goes through as a runtime exception, which we validate is cleanable or not. Anyways this is separate.
Minor point, I'd just shorten the test name a bit: testNoCleanupOnCreate503 or something. The commit state unknown is more of an internal handling detail for this case.
| Optional<MetadataUpdate> appendSnapshot = | ||
| body.updates().stream() | ||
| .filter(update -> update instanceof MetadataUpdate.AddSnapshot) | ||
| .findFirst(); | ||
| assertThat(appendSnapshot).isPresent(); | ||
|
|
||
| MetadataUpdate.AddSnapshot addSnapshot = | ||
| (MetadataUpdate.AddSnapshot) appendSnapshot.get(); | ||
| String manifestListLocation = addSnapshot.snapshot().manifestListLocation(); | ||
| // Files should still exist because we don't know if commit succeeded | ||
| assertThat(catalog.loadTable(TABLE).io().newInputFile(manifestListLocation).exists()) | ||
| .isTrue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a more assertJ "fluent" way:
assertThat(body.updates().stream()
.filter(MetadataUpdate.AddSnapshot.class::isInstance)
.map(MetadataUpdate.AddSnapshot.class::cast)
.findFirst())
.hasValueSatisfying(addSnapshot -> {
String manifestListLocation = addSnapshot.snapshot().manifestListLocation();
assertThat(catalog.loadTable(TABLE).io().newInputFile(manifestListLocation).exists())
.isTrue();
});
| } | ||
|
|
||
| /** Table create error handler */ | ||
| private static class CreateTableErrorHandler extends TableErrorHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also fine, just wondering it may be a smaller diff if we just extended CommitErrorHandler, and overrode the 409 case? Unless I'm missing a case in handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we also need to override 404 because it should be NoSuchNamespaceException for create-table instead of NoSuchTableException.
I like your approach better because CreateTableErrorHandler should be used in case of a commit.
I'm also fine with reverting 26d4dda in case we change our mind
amogh-jahagirdar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this LGTM, just a comment on the test name. Thank you @alessandro-nori this is an important fix! Let's give @nastra @singhpk234 and others time to review as well.
| RESTCatalogAdapter adapter = | ||
| Mockito.spy( | ||
| new RESTCatalogAdapter(backendCatalog) { | ||
| @Override | ||
| protected <T extends RESTResponse> T execute( | ||
| HTTPRequest request, | ||
| Class<T> responseType, | ||
| Consumer<ErrorResponse> errorHandler, | ||
| Consumer<Map<String, String>> responseHeaders) { | ||
| if (request.method() == HTTPMethod.POST && request.path().contains("some_table")) { | ||
| // Simulate a 503 Service Unavailable error | ||
| ErrorResponse error = | ||
| ErrorResponse.builder() | ||
| .responseCode(503) | ||
| .withMessage("Service unavailable") | ||
| .build(); | ||
|
|
||
| errorHandler.accept(error); | ||
| throw new IllegalStateException("Error handler should have thrown"); | ||
| } | ||
| return super.execute(request, responseType, errorHandler, responseHeaders); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah this is my bad, I had to look at the code further. You're right, given how the http client is invoked, to trigger a failure and go through the error handler, we can't just mock execute. The error handler invocation is handled below the execution API. I think this is fine, maybe we want a separate helper method for better simulation of errors that go through the error handlers.
On a separate note, I'm starting to think a bit more about how we look at CleanableFailure. The original intent of the CleanableFailure marker interface was to prevent issues in case arbitrary exceptions happened to be thrown on the commit path. We generally expect Error handlers to map to the commit state unknown, and any other exception we fail to handle goes through as a runtime exception, which we validate is cleanable or not. Anyways this is separate.
Minor point, I'd just shorten the test name a bit: testNoCleanupOnCreate503 or something. The commit state unknown is more of an internal handling detail for this case.
Fixes #15050