
@@ -23,7 +23,7 @@
#SBATCH --array=2011-2013 # years to process (one array task per year)

# Set the source directory containing year folders
SOURCE_DIR="/caldera/hovenweep/projects/usgs/water/impd/hytest/niwaa_wrfhydro_monthly_huc12_aggregations_sample_data/CHRTOUT"
SOURCE_DIR="/path/to/CHRTOUT"

# Load necessary modules
module load nco
@@ -34,7 +34,7 @@ echo "Job started at $(date)"

#Run the temporal aggregation script

-srun /path/to/repo/hytest/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/nco_process_chrtout.sh $SLURM_ARRAY_TASK_ID
+srun /path/to/shell/script/nco_process_chrtout.sh $SLURM_ARRAY_TASK_ID

# Record the end time
global_end=$(date +%s)
@@ -23,7 +23,7 @@
#SBATCH --array=2011-2013 # years to process (one array task per year)

# Set the source directory containing year folders
SOURCE_DIR="/caldera/hovenweep/projects/usgs/water/impd/hytest/niwaa_wrfhydro_monthly_huc12_aggregations_sample_data/GWOUT"
SOURCE_DIR="/path/to/GWOUT"

# Load necessary modules
module load nco
@@ -34,7 +34,7 @@ echo "Job started at $(date)"

#Run the temporal aggregation

-srun /path/to/repo/hytest/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/nco_process_gwout.sh $SLURM_ARRAY_TASK_ID
+srun /path/to/shell/script/nco_process_gwout.sh $SLURM_ARRAY_TASK_ID

# Record the end time
global_end=$(date +%s)
@@ -23,7 +23,7 @@
#SBATCH --array=2011-2013 # years to process (one array task per year)

# Set the source directory containing year folders
SOURCE_DIR="/caldera/hovenweep/projects/usgs/water/impd/hytest/niwaa_wrfhydro_monthly_huc12_aggregations_sample_data/LDASIN"
SOURCE_DIR="/path/to/LDASIN"

# Load necessary modules
module load nco
@@ -34,7 +34,7 @@ echo "Job started at $(date)"

#Run the temporal aggregation

-srun /path/to/repo/hytest/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/nco_process_ldasin.sh $SLURM_ARRAY_TASK_ID
+srun /path/to/shell/script/nco_process_ldasin.sh $SLURM_ARRAY_TASK_ID


# Record the end time
@@ -23,7 +23,7 @@
#SBATCH --array=2011-2013 # years to process (one array task per year)

# Set the source directory containing year folders
SOURCE_DIR="/caldera/hovenweep/projects/usgs/water/impd/hytest/niwaa_wrfhydro_monthly_huc12_aggregations_sample_data/LDASOUT"
SOURCE_DIR="/path/to/LDASOUT"

# Load necessary modules
module load nco
@@ -34,7 +34,7 @@ echo "Job started at $(date)"

#Run the temporal aggregation

-srun /path/to/repo/hytest/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/nco_process_ldasout.sh $SLURM_ARRAY_TASK_ID
+srun /path/to/shell/script/nco_process_ldasout.sh $SLURM_ARRAY_TASK_ID


# Record the end time
@@ -7,7 +7,7 @@
# year to process
# e.g., ./nco_process_chrtout.sh 2009
# Developed: 06/11/2024, A. Dugger
-# Updated: 4/7/2025, L. Staub
+# Updated: 7/15/2025, L. Staub
# ###########################################################################

# ###########################################################################
@@ -151,7 +151,7 @@
"con.print(f'outDir exists: {outDir.is_dir()}')\n",
"\n",
"# Basename for output files - extension will be applied later\n",
"output_pattern = 'CONUS_HUC12_2D_20111001_20120930'\n",
"output_pattern = 'CONUS_HUC12_2D_WY2011_2013'\n",
"\n",
"# Other variables to help with the file output naming convention\n",
"write_CSV = True\n",
@@ -118,7 +118,7 @@
"outDir = r'/path/to/outputs/agg_out'\n",
"\n",
"# Output filename pattern\n",
"output_pattern = 'CONUS_HUC12_1D_2011001_20120930'\n",
"output_pattern = 'CONUS_HUC12_1D_WY2011_2013'\n",
"\n",
"# Select output formats\n",
"write_NC = True # Output netCDF file\n",
@@ -11,12 +11,12 @@ Tracking computation times for a 3-year subset of WRF-Hydro modeling application

| **Script** | **Description** | **Datasets processed** | **Dask** | **Completion Time** | **Output** |
| ------ | ------ | ------ | ------ | ------ | ------ |
-| 01_2D_spatial_aggregation | Aggregation to HUC12s of 2-Dimensional variables | monthly LDASOUT & LDASIN | Yes | 2 hours | CONUS_HUC12_2D_20111001_20120930.nc |
-| 02_1D_spatial_aggregation | Aggregation to HUC12s of 1-Dimensional variables | monthly GWOUT & CHRTOUT | No | 2.5 hours | CONUS_HUC12_1D_2011001_20120930.nc |
+| 01_2D_spatial_aggregation | Aggregation to HUC12s of 2-Dimensional variables | monthly LDASOUT & LDASIN | Yes | 2 hours | CONUS_HUC12_2D_WY2011_2013.nc |
+| 02_1D_spatial_aggregation | Aggregation to HUC12s of 1-Dimensional variables | monthly GWOUT & CHRTOUT | No | 2.5 hours | CONUS_HUC12_1D_WY2011_2013.nc |
| usgs_common | Python script containing functions used in aggregation | --- | No | --- | --- |

## Compute Environment Needs
-Users will need to create and activate a conda environment using the [wrfhydro_huc12_agg.yml](wrfhydro_huc12_agg.yml) file to run the python script and notebooks. For this environment to work, the latest version of Miniforge should be installed in the user area on Hovenweep. Miniconda may work, but has not been tested with this workflow.
+Users will need to create and activate a conda environment using the [wrfhydro_huc12_agg.yml](https://github.com/hytest-org/hytest/blob/main/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/02_Spatial_Aggregation/wrfhydro_huc12_agg.yml) file to run the Python script and notebooks. For this environment to work, the latest version of Miniforge should be installed in the user area on Hovenweep. Miniconda may work, but has not been tested with this workflow.

#### Ensure Miniforge is installed
```
@@ -58,13 +58,13 @@ Since this portion of the workflow utilizes Dask, it is important that the corre

## Instructions
### 1. Set-up
-Confirm that the [usgs_common.py](wrfhydro_huc12_agg.yml) python script has the correct paths to the WRF-Hydro modeling application output static files under the "Domain Files" section. The paths currently are set up to point to the HyTEST directory on hovenweep where the 3-year subset of the data is stored. This script has multiple functions that are called into the 1-D and 2-D aggregation jupyter notebooks.
+Confirm that the [usgs_common.py](https://github.com/hytest-org/hytest/blob/main/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/02_Spatial_Aggregation/usgs_common.py) Python script has the correct paths to the WRF-Hydro modeling application output static files under the "Domain Files" section. The paths are currently set to point to the HyTEST directory on Hovenweep where the 3-year subset of the data is stored. This script contains multiple functions that are imported by the 1-D and 2-D aggregation Jupyter notebooks.
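
As a quick pre-flight check before launching the notebooks, a minimal sketch of verifying that such paths resolve; the dictionary keys and file names below are hypothetical placeholders, not the actual variables in usgs_common.py:

```python
# A sketch only: fail fast on missing domain files before a long aggregation run.
from pathlib import Path

# Hypothetical names; substitute the actual paths from usgs_common.py
domain_files = {
    'huc12_grid_1000m': Path('/path/to/domain/huc12_grid_1000m.nc'),
    'huc12_crosswalk': Path('/path/to/domain/huc12_crosswalk.csv'),
}

for name, path in domain_files.items():
    print(f'{name}: {path} exists={path.exists()}')
```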

### 2. 2-D Aggregation
-The [2-Dimensional Aggregation jupyter notebook](01_2D_spatial_aggregation.ipynb) aggregates the 2-Dimensional WRF-Hydro modeling application outputs LDASOUT (monthly outputs named water_YYYYMM.nc) and LDASIN (monthly outputs named clim_YYYYMM.nc) to HUC12 basins, using the 1000 m grid file. The file paths for the LDASOUT and LDASIN monthly data, the 1000 m HUC12 grid file, and the location for the 2D aggregated outputs to be stored will need to be specified. This script will spin up a dask cluster to parallelize the aggregation, a link to the dask dashboard is provided to monitor workers during calculations. Once this script has finished processing, the dask cluster will need to be spun down and closed. The product from this script will be 1 netCDF file containing the spatially aggregated outputs of the 2-Dimensional WRF-Hydro monthly modeling application outputs for the years 2011-2013.
+The [2-Dimensional Aggregation Jupyter notebook](https://github.com/hytest-org/hytest/blob/main/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/02_Spatial_Aggregation/01_2D_spatial_aggregation.ipynb) aggregates the 2-Dimensional WRF-Hydro modeling application outputs LDASOUT (monthly outputs named water_YYYYMM.nc) and LDASIN (monthly outputs named clim_YYYYMM.nc) to HUC12 basins, using the 1000 m grid file. The file paths for the LDASOUT and LDASIN monthly data, the 1000 m HUC12 grid file, and the location where the 2-D aggregated outputs will be stored need to be specified. This notebook spins up a Dask cluster to parallelize the aggregation; a link to the Dask dashboard is provided to monitor workers during calculations. Once processing has finished, the Dask cluster will need to be spun down and closed. The product of this notebook is one netCDF file containing the spatially aggregated outputs of the 2-Dimensional WRF-Hydro monthly modeling application outputs for the years 2011-2013.
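
For orientation, a minimal sketch of that Dask cluster lifecycle, assuming a local cluster and the water_YYYYMM.nc naming; worker counts and paths are placeholders:

```python
# A sketch only: spin up a cluster, do the parallel work, then spin it down.
import xarray as xr
from dask.distributed import Client, LocalCluster

cluster = LocalCluster(n_workers=4, threads_per_worker=2)  # size to your allocation
client = Client(cluster)
print(client.dashboard_link)  # open this URL to monitor workers

# Lazily open the monthly LDASOUT files as one dataset
ds = xr.open_mfdataset('/path/to/LDASOUT/water_*.nc',
                       combine='by_coords', parallel=True)

# ... aggregation to HUC12 basins happens here ...

# Spin the cluster down once processing has finished
client.close()
cluster.close()
```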

### 3. 1-D Aggregation
-The [1-Dimensional Aggregation jupyter notebook](02_1D_spatial_aggregation.ipynb) aggregates the 1-Dimensional WRF-Hydro modeling application outputs GWOUT (monthly outputs named gw_YYYYMM.nc) and CHRTOUT (monthly outputs named chrtout_YYYYMM.nc) to HUC12 basins, using the crosswalk csv file. The file paths for the GWOUT and CHRTOUT monthly data, the HUC12 crosswalk file, and the location for the 1D aggregated outputs to be stored will need to be specified. The product from this script will be 1 netCDF file containing the spatially aggregated outputs of the 1-Dimensional WRF-Hydro monthly modeling application outputs for the years 2011-2013.
+The [1-Dimensional Aggregation Jupyter notebook](https://github.com/hytest-org/hytest/blob/main/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/02_Spatial_Aggregation/02_1D_spatial_aggregation.ipynb) aggregates the 1-Dimensional WRF-Hydro modeling application outputs GWOUT (monthly outputs named gw_YYYYMM.nc) and CHRTOUT (monthly outputs named chrtout_YYYYMM.nc) to HUC12 basins, using the crosswalk CSV file. The file paths for the GWOUT and CHRTOUT monthly data, the HUC12 crosswalk file, and the location where the 1-D aggregated outputs will be stored need to be specified. The product of this notebook is one netCDF file containing the spatially aggregated outputs of the 1-Dimensional WRF-Hydro monthly modeling application outputs for the years 2011-2013.
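
A minimal sketch of the crosswalk pattern, assuming a CSV with feature_id and HUC12 columns and a streamflow variable; the notebook's actual column and variable names may differ:

```python
# A sketch only: map 1-D reach values to HUC12 basins via the crosswalk.
import pandas as pd
import xarray as xr

# Keep HUC12 as a string so leading zeros survive
xwalk = pd.read_csv('/path/to/huc12_crosswalk.csv', dtype={'HUC12': str})

# One monthly CHRTOUT file, with values indexed by feature_id
ds = xr.open_dataset('/path/to/CHRTOUT/chrtout_201110.nc')
df = ds['streamflow'].to_dataframe().reset_index()

# Attach each reach's HUC12 and combine reach values within each basin
df = df.merge(xwalk[['feature_id', 'HUC12']], on='feature_id')
huc12_values = df.groupby('HUC12')['streamflow'].sum()
```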

## Variable Table
<table>
@@ -53,6 +53,7 @@ dependencies:
- shapely
- gdal=3.5.3=py311hadb6153_11
- fiona
+- s3fs



@@ -33,11 +33,11 @@
"source": [
"# Input files\n",
"#Paths for 2D and 1D aggregated files\n",
"in_file1 = r'/path/to/outputs/agg_out/CONUS_HUC12_WB_2D_19791001_20220930_2.nc'\n",
"in_file2 = r'/path/to/outputs/CONUS_HUC12_WB_1D_19791001_20220930.nc'\n",
"in_file1 = r'/path/to/outputs/agg_out/CONUS_HUC12_2D_WY2011_2013.nc'\n",
"in_file2 = r'/path/to/outputs/CONUS_HUC12_1D_WY2011_2013.nc'\n",
"\n",
"# Output file\n",
"out_file = r'/path/to/outputs/agg_out/CONUS_HUC12_WB_combined_19791001_20220930.nc'\n",
"out_file = r'/path/to/outputs/agg_out/CONUS_HUC12_WB_combined_WY2011_2013.nc'\n",
"\n",
"# Name the zone coordinate that contains the HUC12 IDs\n",
"zone_name = 'WBDHU12'\n",
@@ -51,7 +51,7 @@
"metadata": {},
"outputs": [],
"source": [
"in_nc = r'/path/to/outputs/agg_out/CONUS_HUC12_WB_combined_19791001_20220930.nc'\n",
"in_nc = r'/path/to/outputs/agg_out/CONUS_HUC12_WB_combined_WY2011_2013.nc'\n",
"\n",
"# Output directory\n",
"outDir = r'/path/to/outputs/agg_out/'\n",
@@ -12,19 +12,19 @@ Tracking computation times for a 3-year subset of WRF-Hydro modeling application

| **Script** | **Description** | **Datasets processed** | **Dask** | **Completion Time** | **Output** |
| ------ | ------ | ------ | ------ | ------ | ------ |
-| 01_Merge_1D_and_2D_files | Combine 1-Dimensional and 2-Dimensional aggregations into one netcdf file | CONUS_HUC12_2D_20111001_20120930.nc & CONUS_HUC12_1D_2011001_20120930.nc | No | 10 min | CONUS_HUC12_WB_combined_19791001_20220930.nc |
-| 02_Format | Formatting | CONUS_HUC12_WB_combined_19791001_20220930.nc | No | 10 min | huc12_monthly_wb_iwaa_wrfhydro_WY2011_2013.nc |
+| 01_Merge_1D_and_2D_files | Combine 1-Dimensional and 2-Dimensional aggregations into one netCDF file | CONUS_HUC12_2D_WY2011_2013.nc & CONUS_HUC12_1D_WY2011_2013.nc | No | 10 min | CONUS_HUC12_WB_combined_WY2011_2013.nc |
+| 02_Format | Formatting | CONUS_HUC12_WB_combined_WY2011_2013.nc | No | 10 min | huc12_monthly_wb_iwaa_wrfhydro_WY2011_2013.nc |

## Compute Environment Needs
-Users will need to create and activate a conda environment using the [wrfhydro_huc12_agg.yml](02_Spatial_Aggregation/wrfhydro_huc12_agg.yml) file to run the python script and notebooks. For this environment to work, the latest version of Miniforge should be installed in the user area on Hovenweep. Miniconda may work, but has not been tested with this workflow. See the README documentation in the [Spatial Aggregation](02_Spatial_Aggregation/) folder for first time environment set up instructions.
+Users will need to create and activate a conda environment using the [wrfhydro_huc12_agg.yml](https://github.com/hytest-org/hytest/blob/main/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/02_Spatial_Aggregation/wrfhydro_huc12_agg.yml) file to run the Python script and notebooks. For this environment to work, the latest version of Miniforge should be installed in the user area on Hovenweep. Miniconda may work, but has not been tested with this workflow. See the README documentation in the [Spatial Aggregation](https://github.com/hytest-org/hytest/tree/main/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/02_Spatial_Aggregation) folder for first-time environment setup instructions.

## Instructions

### 1. Merge
-The [Merge 1-D and 2-D jupyter notebook](01_Merge_1D_and_2D_files.ipynb) combines the spatially aggregated outputs of the monthly 1-Dimensional & 2-Dimensional WRF-Hydro modeling application outputs into 1 netCDF file. This script also contains plots that allow the user to explore the range in values for each variable.
+The [Merge 1-D and 2-D Jupyter notebook](https://github.com/hytest-org/hytest/blob/main/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/03_Finalize/01_Merge_1D_and_2D_files.ipynb) combines the spatially aggregated outputs of the monthly 1-Dimensional and 2-Dimensional WRF-Hydro modeling application outputs into one netCDF file. This notebook also contains plots that allow the user to explore the range of values for each variable.
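
A minimal sketch of the merge, assuming both files share the HUC12 zone and time coordinates; paths follow the placeholder convention used elsewhere in this tutorial:

```python
# A sketch only: xr.merge aligns the two aggregations on their shared coordinates.
import xarray as xr

ds_2d = xr.open_dataset('/path/to/outputs/agg_out/CONUS_HUC12_2D_WY2011_2013.nc')
ds_1d = xr.open_dataset('/path/to/outputs/CONUS_HUC12_1D_WY2011_2013.nc')

merged = xr.merge([ds_2d, ds_1d])
merged.to_netcdf('/path/to/outputs/agg_out/CONUS_HUC12_WB_combined_WY2011_2013.nc')
```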

### 2. Finalize
-The [Format jupyter notebook](02_Format.ipynb) takes the merged output from step 1 and clarifies variable names, adds character HUCID's, and modifies data types. A 'yrmo' variable is added as a place for year/month information to be stored and to provide an efficient way for R users to access the final datasets. The output from this script is 1 netCDF file containing the monthly WRF-Hydro modeling application outputs aggregated to HUC12s for the years 2011-2013 that is comparable to the netCDF stored on this [Science Base](https://www.sciencebase.gov/catalog/item/6411fd40d34eb496d1cdc99d) page where the original outputs of this workflow are stored.
+The [Format Jupyter notebook](https://github.com/hytest-org/hytest/blob/main/dataset_processing/tutorials/niwaa_wrfhydro_monthly_huc12_agg/03_Finalize/02_Format.ipynb) takes the merged output from step 1 and clarifies variable names, adds character HUCIDs, and modifies data types. A 'yrmo' variable is added to store year/month information and to provide an efficient way for R users to access the final datasets. The output of this notebook is one netCDF file containing the monthly WRF-Hydro modeling application outputs aggregated to HUC12s for the years 2011-2013, comparable to the netCDF stored on this [ScienceBase](https://www.sciencebase.gov/catalog/item/6411fd40d34eb496d1cdc99d) page where the original outputs of this workflow are stored.
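
A minimal sketch of those formatting steps, reusing the WBDHU12 zone name from the merge notebook; the 'time', 'huc12_str', and 'yrmo' names here are assumptions:

```python
# A sketch only: add character HUC IDs and a yrmo label, then write the final file.
import xarray as xr

ds = xr.open_dataset('/path/to/outputs/agg_out/CONUS_HUC12_WB_combined_WY2011_2013.nc')

# Character HUC IDs: zero-pad to 12 characters so leading zeros are preserved
ds['huc12_str'] = ds['WBDHU12'].astype(str).str.zfill(12)

# yrmo packs year and month into one label (e.g., 201110) for easy access from R
ds['yrmo'] = ds['time'].dt.strftime('%Y%m')

ds.to_netcdf('/path/to/outputs/agg_out/huc12_monthly_wb_iwaa_wrfhydro_WY2011_2013.nc')
```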

## Variable Table
<table>