Compressed inputs and temporal outputs #878
Open
gschivley wants to merge 16 commits into GenXProject:develop
Merge commit for Patch Release v0.4.4
Patch Release v0.4.5
Co-authored-by: gschivley <10373332+gschivley@users.noreply.github.com>
…uet) Co-authored-by: gschivley <10373332+gschivley@users.noreply.github.com>
…utFormat setting Co-authored-by: gschivley <10373332+gschivley@users.noreply.github.com>
…ptions Replace CSV-only file loading with DuckDB to support CSV, CSV.GZ, and Parquet formats for inputs. Add option for gzip and parquet temporal outputs.
To Do:
Description
Allow the use of gzipped CSV and Parquet files for (1) inputs and (2) temporal outputs, with complete backwards compatibility. Any or all of the input files can be binary/compressed, so long as they are in the expected folder location and have the same file stem as the existing inputs. Annual outputs will continue to be written as CSV. The format of hourly outputs is controlled with the new parameter TemporalOutputFormat. All inputs are read using DuckDB. Gzip and Parquet output formats are written using DuckDB.
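To illustrate the "same file stem" rule, here is a minimal Python sketch of how an input could be located regardless of its format. It is not the code in this PR, which reads inputs through DuckDB. The helper names `resolve_input` and `read_rows` and the search order in `SUFFIXES` are assumptions made for this example, and Parquet reading is deliberately left out because it needs a reader such as DuckDB.

```python
import csv
import gzip
from pathlib import Path

# Hypothetical search order. The PR description does not say which format
# wins when several files share the same stem.
SUFFIXES = (".parquet", ".csv.gz", ".csv")


def resolve_input(folder: Path, stem: str) -> Path:
    """Return the first existing file named stem + suffix inside folder."""
    for suffix in SUFFIXES:
        candidate = folder / f"{stem}{suffix}"
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no input with stem {stem!r} in {folder}")


def read_rows(path: Path) -> list[dict[str, str]]:
    """Read a plain or gzipped CSV file into a list of row dictionaries."""
    if path.name.endswith(".parquet"):
        # The standard library cannot read Parquet; a real loader would
        # hand this file to DuckDB or another Parquet reader.
        raise NotImplementedError("Parquet requires an external reader")
    opener = gzip.open if path.name.endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        return list(csv.DictReader(handle))
```

Calling `read_rows(resolve_input(folder, stem))` returns the same rows whether the file on disk is the plain CSV or its gzipped counterpart, which is the backwards-compatibility property the description relies on.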
This new feature is especially useful when inputs contain multiple weather years of hourly data. I've had CSV files growing to multiple GB.
What type of PR is this? (check all applicable)
Related Tickets & Documents
This partially supersedes PR #734, which has languished. I suspect it was trying to do too many things in a single PR, so I'm starting with one discrete task that is backwards compatible.
Checklist
How this can be tested
This is a strictly internal change.
Post-approval checklist for GenX core developers
After the PR is approved