Update schema projection to support initial-defaults#1644
Fokko merged 11 commits into apache:main
Since initial-default projection happens after filtering in _task_to_record_batches, I'm wondering whether this will yield the correct results when a pyarrow_filter is given for this field.
Thanks for pointing this out. You're right, it doesn't handle the filtering correctly. Let me work on a fix. Thanks!
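To make the ordering concern concrete, here is a minimal sketch in plain Python. It is not the PyIceberg implementation: lists of dicts stand in for Arrow record batches, and the names `project_defaults`, `filter_then_project`, and `project_then_filter` are hypothetical. It only illustrates why a filter on a field that is missing from the data file needs to see the initial-default value.

```python
from typing import Any, Callable

Row = dict[str, Any]


def project_defaults(rows: list[Row], defaults: Row) -> list[Row]:
    """Fill fields that are absent from the file with their initial-default."""
    # Values already present in a row take precedence over the defaults.
    return [{**defaults, **row} for row in rows]


def filter_then_project(
    rows: list[Row], predicate: Callable[[Row], bool], defaults: Row
) -> list[Row]:
    """Problematic order: the filter runs before the defaults are filled in."""
    kept = [row for row in rows if predicate(row)]
    return project_defaults(kept, defaults)


def project_then_filter(
    rows: list[Row], predicate: Callable[[Row], bool], defaults: Row
) -> list[Row]:
    """Order that lets the filter see the projected default values."""
    return [row for row in project_defaults(rows, defaults) if predicate(row)]


# The data file was written before the (hypothetical) "region" field was added.
file_rows = [{"id": 1}, {"id": 2}]
defaults = {"region": "EU"}


def is_eu(row: Row) -> bool:
    # A missing field reads as None here, so it never matches "EU".
    return row.get("region") == "EU"


wrong = filter_then_project(file_rows, is_eu, defaults)
right = project_then_filter(file_rows, is_eu, defaults)
```

With the filter applied first, every row is dropped even though each one would carry the default value after projection. An alternative to reordering would be to rewrite the filter against the default value before it is applied; which approach fits the actual read path is a separate question.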
No problem! I was trying to build a test case for this by evolving the schema of a table and adding a new field with an initial-default value, but I think that will have to wait for the V3 table spec.
@Fokko could you rebase this when you get a chance?
@kevinjqliu Sure, but I think this relies on #1770 to do some proper testing 👍 |
532b8b6 to 1653c7c
initial-defaults
…ython into fd-support-initial-value
kevinjqliu left a comment
Thanks for adding this feature! The PR generally LGTM. I added a few comments on reading initial-defaults for optional/required fields.
I think it would be great to add tests to cover these scenarios:
- Optional field might have initial-default set. If set, use initial-default. If not set, use null
- Required field will always have initial-default set.
There are tests covering an optional field with initial-default set and a required field with initial-default set. We just need to add a test for an optional field without initial-default set, and perhaps a test that throws when a required field does not have initial-default set.
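The scenarios above could be sketched roughly as follows. This is an illustration only: `FieldSketch` and `fill_value_for_missing` are hypothetical names, not PyIceberg classes or functions, and the sketch assumes a default of `None` means "initial-default not set".

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FieldSketch:
    """Hypothetical stand-in for a schema field; not the PyIceberg class."""

    name: str
    required: bool
    initial_default: Optional[Any] = None  # None is treated as "not set"


def fill_value_for_missing(field: FieldSketch) -> Any:
    """Pick the value to project for a field that is absent from a data file."""
    if field.initial_default is not None:
        # Optional or required: a set initial-default always wins.
        return field.initial_default
    if field.required:
        # A required field without an initial-default cannot be projected.
        raise ValueError(f"Required field {field.name!r} has no initial-default")
    # Optional field without an initial-default reads as null.
    return None
```

The four test cases then map onto the four combinations of required/optional and set/not set, with the last combination expected to raise.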
The spec also mentions V3 data types:
> All columns of unknown, variant, geometry, and geography types must default to null. Non-null values for initial-default or write-default are invalid.
and nested struct types:
> When a field that is a struct type is added, its default may only be null or a non-null struct with no field values. Default values for fields must be stored in field metadata.
Should we address them as part of this PR?
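For reference, the two spec rules quoted above could be checked along these lines. This is a sketch under stated assumptions, not PyIceberg code: `validate_default` is a hypothetical helper, types are represented by plain strings, and an empty dict stands in for "a non-null struct with no field values".

```python
from typing import Any

# Types whose columns must default to null according to the quoted spec text.
NULL_ONLY_TYPES = frozenset({"unknown", "variant", "geometry", "geography"})


def validate_default(type_name: str, default: Any) -> None:
    """Raise ValueError when a default violates the quoted spec rules."""
    if type_name in NULL_ONLY_TYPES and default is not None:
        raise ValueError(f"{type_name} columns must default to null")
    if type_name == "struct" and default is not None and default != {}:
        # Assumption: {} models a non-null struct with no field values.
        raise ValueError("struct default must be null or a struct with no field values")
```

The same check would apply to both initial-default and write-default, since the spec text names both.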
Co-authored-by: Kevin Liu <kevinjqliu@users.noreply.github.com>
kevinjqliu left a comment
LGTM! We can address default values for V3 data types and nested struct types as a follow-up.
Add the projection piece of the initial defaults.
Closes #1836