Open
Labels: improvement (PR that improves existing functionality)
Description
Feature Request / Improvement
Core: HadoopFileIO to take list of filesystem schemas to enable trash for
#14501 uses the trash policy when the target path resolves to the local filesystem or HDFS, which it determines by looking at the class name of the FS instance.
That approach:
- Breaks distributions which don't have hdfs on the classpath (for example Azure HDInsight)
- Adds the overhead of instantiating Trash policies on every single delete
- Doesn't allow trash to be applied to other filesystems, or disabled for localfs (which doesn't have any remote cleaner running in the background).
Proposed:
- Add option `iceberg.hadoop.trash.schemas` to take a list of filesystem schemes (defaults: "hdfs" and "viewfs"), and only apply trash if there's a match.
- Add a test which makes "file" the scheme and verifies that trash can be enabled/disabled.
- Document the option (where?). Maybe also update the SupportsBulkOperations javadocs to mention that deletion may be replaced by a move to trash.
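A minimal sketch of the scheme check proposed above, to make the intended behaviour concrete. It is not the HadoopFileIO implementation: the class name `TrashSchemes`, the method `useTrash`, and the comma-separated value format are assumptions made for illustration. Only the option name and the "hdfs"/"viewfs" defaults come from this issue. The check relies on the URI scheme alone, so no HDFS class needs to be on the classpath and no trash policy has to be instantiated to decide.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper: decides from the path's URI scheme whether trash applies.
final class TrashSchemes {
  // Option name and defaults as proposed in this issue.
  static final String TRASH_SCHEMAS = "iceberg.hadoop.trash.schemas";
  // Assumption: the list is stored as a comma-separated string.
  static final String TRASH_SCHEMAS_DEFAULT = "hdfs,viewfs";

  private final Set<String> schemes;

  TrashSchemes(String configValue) {
    // Parse once, so each delete only costs a set lookup.
    this.schemes = Arrays.stream(configValue.split(","))
        .map(s -> s.trim().toLowerCase(Locale.ROOT))
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toSet());
  }

  // True only when the location has a scheme and that scheme is in the configured list.
  boolean useTrash(String location) {
    String scheme = URI.create(location).getScheme();
    return scheme != null && schemes.contains(scheme.toLowerCase(Locale.ROOT));
  }
}
```

With the default value, `new TrashSchemes(TrashSchemes.TRASH_SCHEMAS_DEFAULT).useTrash("hdfs://nn:8020/warehouse/t/f.parquet")` would select trash, while `file:///tmp/f.parquet` would not. Constructing it with `"file"` gives the test configuration described in the list above, and an empty value disables trash entirely.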
Also, need to restore semantics "deletePath(missing) doesn't raise an exception".
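The semantics to restore can be illustrated with a small sketch. It uses `java.nio.file` on a local directory as a stand-in for the Hadoop filesystem and trash APIs, so `LenientTrash` and `moveToTrash` are hypothetical names, not anything in Iceberg or Hadoop. The point is only that a missing source path yields a `false` return rather than an exception.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical stand-in: a "move to trash" that tolerates a missing source path.
final class LenientTrash {
  // Returns true if the file was moved into trashDir, false if it did not exist.
  static boolean moveToTrash(Path path, Path trashDir) throws IOException {
    if (Files.notExists(path)) {
      return false; // missing path: nothing to do, and no exception
    }
    Files.createDirectories(trashDir);
    try {
      Files.move(path, trashDir.resolve(path.getFileName()),
          StandardCopyOption.REPLACE_EXISTING);
      return true;
    } catch (NoSuchFileException e) {
      // The file vanished between the existence check and the move; treat as missing.
      return false;
    }
  }
}
```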
Query engine
None
Willingness to contribute
- I can contribute this improvement/feature independently
- I would be willing to contribute this improvement/feature with guidance from the Iceberg community
- I cannot contribute this improvement/feature at this time