-
Notifications
You must be signed in to change notification settings - Fork 22
CASSANALYTICS-32 expose number of rows which violated constraints to JobStats #120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
466a7dc to
fedb481
Compare
...-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/SortedSSTableWriter.java
Outdated
Show resolved
Hide resolved
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/JobInfo.java
Show resolved
Hide resolved
frankgh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I left a couple of comments. I was going to suggest adding an integration test, but Analytics currently only supports 4.0, so we won't be able to test it yet
...-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/SortedSSTableWriter.java
Outdated
Show resolved
Hide resolved
...-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/SortedSSTableWriter.java
Outdated
Show resolved
Hide resolved
| COMMIT_THREADS_PER_INSTANCE, | ||
| COMMIT_BATCH_SIZE, | ||
| SKIP_EXTENDED_VERIFY, | ||
| SKIP_ROWS_VIOLATING_CONSTRAINTS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you maybe add a comment here that this feature is only available in Cassandra 5.0 version? Or is it 6?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@frankgh thank you for the review but I am not sure this approach is still viable based on the discussion in the related ticket. Did I interpret that comment correctly when I think that what I did here is itself not enough to consider CASSANALYTICS-32 to be resolved as we need to expose failed rows somehow, but this I am trying to do here is nice to have / the first step towards that direction which is valuable already to have as such?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I saw the comment from Yifan after I reviewed the PR. I think it's a good first approach, you need to explicitly enable this feature in order to write data that does not violate the constraints.
fedb481 to
71c95c6
Compare
No description provided.