Description
The creation of the adf_dataset script is a great way to centralize how we gather data throughout the ADF; however, it is currently incomplete with respect to the three types of files the ADF can process: time-series, climo, and regridded-climo files.
The script has a nice template from which each of these file types can be handled, but it is not finished. For each of the three file types, I think it would be great to complete this work by extending the load functions to be broken up like this:
- `get_<file_type>_file`: gather/check files for the test case(s)
- `load_<file_type>_dataset`: return a dataset for the test case(s)
- `load_<file_type>_da`: return a data array for a variable for the test case(s)
- `get_ref_<file_type>_file`: gather/check files for the reference/baseline case
- `load_reference_<file_type>_dataset`: return a dataset for the reference/baseline case
- `load_reference_<file_type>_da`: return a data array for a variable for the reference/baseline case
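A minimal sketch of what this hierarchy could look like for one file type (climo, test-case side). All names, the glob pattern, and the `loader` argument are hypothetical illustrations of the pattern, not the actual ADF API; the real functions would presumably call `xarray.open_mfdataset` where `loader` is invoked here:

```python
from pathlib import Path


def get_climo_file(data_dir, case_name, variable):
    """Gather/check climo files for a test case; an empty list means missing.
    The glob pattern is a made-up placeholder, not the real ADF naming scheme."""
    return sorted(Path(data_dir).glob(f"{case_name}_{variable}_climo*.nc"))


def load_climo_dataset(data_dir, case_name, variable, loader):
    """Return a dataset for a test case, or None if no files were found.
    `loader` stands in for something like xarray.open_mfdataset."""
    files = get_climo_file(data_dir, case_name, variable)
    if not files:
        print(f"WARNING: no climo files found for {variable} in {case_name}")
        return None
    return loader(files)


def load_climo_da(data_dir, case_name, variable, loader):
    """Return the data array for one variable; this is the single place
    where scale factors/offsets and unit changes would be applied."""
    ds = load_climo_dataset(data_dir, case_name, variable, loader)
    if ds is None:
        return None
    return ds[variable]
```

The `get_ref_*`/`load_reference_*` variants would follow the same shape, just pointed at the baseline case's paths.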
This would allow the ADF to be consistent wherever datasets/arrays need to be loaded, and would centralize the data cleaning needed for variables, i.e. applying scale factors/offsets, new units, etc. For example, the AMWG tables are currently loaded generically via xarray through load_dataset in lib/adf_utils.py and are not getting the scale factors/new units; see Issue #423
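As a sketch of the kind of centralized cleaning step this would enable, the function below applies a per-variable scale factor, offset, and unit label in one place. The function name and the dictionary keys (`scale_factor`, `add_offset`, `new_unit`) are assumptions for illustration, loosely modeled on the variable defaults the ADF already carries:

```python
def apply_variable_defaults(data, var_name, variable_defaults):
    """Apply scale factor, offset, and new units for one variable in a
    single place, so tables and plots can never drift out of sync.
    The keys used here are illustrative, not the real ADF defaults schema."""
    opts = variable_defaults.get(var_name, {})
    scale = opts.get("scale_factor", 1.0)
    offset = opts.get("add_offset", 0.0)
    cleaned = data * scale + offset
    new_unit = opts.get("new_unit")  # None means "keep original units"
    return cleaned, new_unit
```

For example, a precipitation rate in m/s could be converted to mm/day by registering a scale factor of 8.64e7 for that variable; any caller going through this one function (tables or plots) would then see identical values and units.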
Question: would we ever want similar infrastructure to be able to work with raw history files too?