Draft
48 commits
a96726d
add global height level dataset
mo-jeff Nov 4, 2025
5588033
add remaining dataset documentation for global and uk collections
mo-jeff Nov 18, 2025
acba9ac
Update datasets/met-office/collection/met-office-global-deterministic…
mo-jeff Nov 19, 2025
21b9fbb
Update datasets/met-office/collection/met-office-uk-deterministic-hei…
mo-jeff Nov 19, 2025
851c4ba
add template docs and description markdown for UKV Near Surface Colle…
mo-jeff Dec 3, 2025
aa54014
Merge branch 'met-office-datasets' of https://github.com/mo-jeff/plan…
mo-jeff Dec 3, 2025
dbb235a
Update datasets/met-office/collection/met-office-uk-deterministic-nea…
mo-jeff Dec 4, 2025
ebe9b91
Update datasets/met-office/collection/met-office-uk-deterministic-nea…
mo-jeff Dec 4, 2025
6de0113
finish UKV and add Global Near Surface
mo-jeff Dec 4, 2025
53300c6
Merge branch 'met-office-datasets' of https://github.com/mo-jeff/plan…
mo-jeff Dec 4, 2025
325513f
refactor UK Near Surface to use datacube STAC extension & update mode…
mo-jeff Dec 8, 2025
20691cf
update Global Near Surface with datacube extension & update description
mo-jeff Dec 8, 2025
c458572
tidy up Update Frequency docs for UK Collection
mo-jeff Dec 8, 2025
fa8446c
add Global collection documentation markdown
mo-jeff Dec 8, 2025
0fd17ad
convert all Global collections to use datacube extension and add Glob…
mo-jeff Dec 8, 2025
39ceaa9
update global height level to refactored STAC spec
mo-jeff Dec 12, 2025
d913837
update keyword for Global Height collection
mo-jeff Dec 12, 2025
814bb2e
feat: add ingest code and workflow
gadomski Dec 15, 2025
23b6c1b
Merge branch 'main' into met-office-datasets
gadomski Dec 15, 2025
cbcb376
refactor: remove `-level` from folders
gadomski Dec 15, 2025
6e705af
updated uk and global height collections
mo-jeff Dec 16, 2025
fc12847
typo in uk height collection
mo-jeff Dec 16, 2025
2040533
add CF standard name to UK Height collection assets
mo-jeff Dec 16, 2025
cbf7d30
refactor global pressure collection
mo-jeff Dec 16, 2025
2176cbd
refactor UK Pressure Collection
mo-jeff Dec 16, 2025
949116d
refactor UK and Global Pressure and Whole Atmosphere collections
mo-jeff Dec 16, 2025
611347a
refactor near surface collections for UK and Global
mo-jeff Dec 17, 2025
aefb3f8
fix: move datacube to the collection level
gadomski Dec 18, 2025
f5b47bb
chore: add uk met office test
gadomski Jan 6, 2026
ef9ce87
Merge branch 'main' into met-office-datasets
gadomski Jan 6, 2026
1168a40
fix: tests now work on real files
gadomski Jan 6, 2026
4ca2177
fix: update citation
gadomski Jan 6, 2026
557b5ac
docs: add comment the cleanup script
gadomski Jan 6, 2026
57a1615
fix: update collection extents
gadomski Jan 6, 2026
fbfb03f
Ingestion fixes for production release
ghidalgo3 Jan 6, 2026
aa91a3d
deps: bump stactools to v0.4.0
gadomski Jan 7, 2026
e772067
Merge branch 'main' into met-office-datasets
gadomski Jan 7, 2026
7bcaf7c
fix: splits
gadomski Jan 7, 2026
492cd03
feat: add data access section
gadomski Jan 13, 2026
fd160fc
fix: urls
gadomski Jan 13, 2026
555ff68
fix: pressure
gadomski Jan 13, 2026
326083c
fix: add accidentally-removed description
gadomski Jan 13, 2026
e524c3b
fix: add note about data availability
gadomski Jan 13, 2026
3cb1fec
fix: remove notice and data access header
gadomski Jan 14, 2026
5c4b93f
script fixes
ghidalgo3 Jan 14, 2026
3e86de9
Upgrade all ingestion images
ghidalgo3 Jan 14, 2026
863f573
Merge branch 'met-office-datasets' of https://github.com/mo-jeff/plan…
ghidalgo3 Jan 14, 2026
b9b8230
update link to MPC Met Office collections in descriptions
mo-jeff Jan 15, 2026
26 changes: 16 additions & 10 deletions datasets/ecmwf-forecast/Dockerfile
@@ -21,10 +21,10 @@ RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 10
  # See https://github.com/mapbox/rasterio/issues/1289
  ENV CURL_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt

- # Install Python 3.10
- RUN curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh" \
- && bash "Mambaforge-$(uname)-$(uname -m).sh" -b -p /opt/conda \
- && rm -rf "Mambaforge-$(uname)-$(uname -m).sh"
+ # Install Python 3.11
+ RUN curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh" \
+ && bash "Miniforge3-$(uname)-$(uname -m).sh" -b -p /opt/conda \
+ && rm -rf "Miniforge3-$(uname)-$(uname -m).sh"

  ENV PATH /opt/conda/bin:$PATH
  ENV LD_LIBRARY_PATH /opt/conda/lib/:$LD_LIBRARY_PATH
@@ -43,27 +43,33 @@ RUN python -m pip install --no-build-isolation -r /tmp/requirements.txt

  COPY pctasks/core /opt/src/pctasks/core
  RUN cd /opt/src/pctasks/core && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/cli /opt/src/pctasks/cli
  RUN cd /opt/src/pctasks/cli && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/task /opt/src/pctasks/task
  RUN cd /opt/src/pctasks/task && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/client /opt/src/pctasks/client
  RUN cd /opt/src/pctasks/client && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/ingest /opt/src/pctasks/ingest
  RUN cd /opt/src/pctasks/ingest && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/dataset /opt/src/pctasks/dataset
  RUN cd /opt/src/pctasks/dataset && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY ./datasets/ecmwf-forecast/requirements.txt /opt/src/datasets/ecmwf-forecast/requirements.txt
  RUN python3 -m pip install -r /opt/src/datasets/ecmwf-forecast/requirements.txt
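The installer lines in this Dockerfile build the download file name from `uname` and `uname -m`. As a rough illustration of how that name resolves on a given machine, here is a minimal Python sketch. The helper `installer_url` and its parameters are hypothetical and not part of this PR; it only mirrors the URL pattern visible in the diff, and it assumes `platform.system()` and `platform.machine()` return the same strings as the two `uname` calls.

```python
import platform

# Base URL copied from the Dockerfile in this diff.
BASE = "https://github.com/conda-forge/miniforge/releases/latest/download"


def installer_url(system=None, machine=None, flavor="Miniforge3"):
    """Build the installer URL the way the shell substitution does.

    `system` stands in for `$(uname)` and `machine` for `$(uname -m)`;
    both default to the values reported for the current host.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    return f"{BASE}/{flavor}-{system}-{machine}.sh"


if __name__ == "__main__":
    # On a typical x86-64 Linux build host this names Miniforge3-Linux-x86_64.sh.
    print(installer_url("Linux", "x86_64"))
```

Passing `flavor="Mambaforge"` reproduces the file name used by the removed lines.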
4 changes: 2 additions & 2 deletions datasets/ecmwf-forecast/dataset.yaml
@@ -1,8 +1,8 @@
  id: ecmwf_forecast
- image: ${{ args.registry }}/pctasks-ecmwf-forecast:2024.6.13.0
+ image: ${{ args.registry }}/pctasks-ecmwf-forecast:2026.01.12

  args:
- - registry
+ - registry

  code:
  src: ${{ local.path(./ecmwf_forecast.py) }}
26 changes: 16 additions & 10 deletions datasets/gbif/Dockerfile
@@ -22,14 +22,14 @@ RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 10
  ENV CURL_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt

  # Install Python 3.8
- RUN curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh" \
- && bash "Mambaforge-$(uname)-$(uname -m).sh" -b -p /opt/conda \
- && rm -rf "Mambaforge-$(uname)-$(uname -m).sh"
+ RUN curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh" \
+ && bash "Miniforge3-$(uname)-$(uname -m).sh" -b -p /opt/conda \
+ && rm -rf "Miniforge3-$(uname)-$(uname -m).sh"

  ENV PATH /opt/conda/bin:$PATH
  ENV LD_LIBRARY_PATH /opt/conda/lib/:$LD_LIBRARY_PATH

- RUN mamba install -y -c conda-forge python=3.8 gdal=3.3.3 pip setuptools cython numpy==1.21.5
+ RUN mamba install -y -c conda-forge python=3.11 gdal pip setuptools cython numpy

  RUN python -m pip install --upgrade pip

@@ -43,27 +43,33 @@ RUN python -m pip install --no-build-isolation -r /tmp/requirements.txt

  COPY pctasks/core /opt/src/pctasks/core
  RUN cd /opt/src/pctasks/core && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/cli /opt/src/pctasks/cli
  RUN cd /opt/src/pctasks/cli && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/task /opt/src/pctasks/task
  RUN cd /opt/src/pctasks/task && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/client /opt/src/pctasks/client
  RUN cd /opt/src/pctasks/client && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/ingest /opt/src/pctasks/ingest
  RUN cd /opt/src/pctasks/ingest && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/dataset /opt/src/pctasks/dataset
  RUN cd /opt/src/pctasks/dataset && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY ./datasets/gbif/requirements.txt /opt/src/datasets/gbif/requirements.txt
  RUN python3 -m pip install -r /opt/src/datasets/gbif/requirements.txt
6 changes: 3 additions & 3 deletions datasets/gbif/dataset.yaml
@@ -1,8 +1,8 @@
  id: gbif
- image: ${{ args.registry }}/pctasks-gbif:20230607.1
+ image: ${{ args.registry }}/pctasks-gbif:2026.01.12

  args:
- - registry
+ - registry

  code:
  src: ${{ local.path(./gbif.py) }}
@@ -24,4 +24,4 @@ collections:
  min_depth: 2
  max_depth: 2
  chunk_storage:
- uri: blob://ai4edataeuwest/gbif-etl-data/pctasks-chunks
+ uri: blob://ai4edataeuwest/gbif-etl-data/pctasks-chunks
18 changes: 12 additions & 6 deletions datasets/goes/goes-cmi/Dockerfile
@@ -43,27 +43,33 @@ RUN python -m pip install --no-build-isolation -r /tmp/requirements.txt

  COPY pctasks/core /opt/src/pctasks/core
  RUN cd /opt/src/pctasks/core && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/cli /opt/src/pctasks/cli
  RUN cd /opt/src/pctasks/cli && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/task /opt/src/pctasks/task
  RUN cd /opt/src/pctasks/task && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/client /opt/src/pctasks/client
  RUN cd /opt/src/pctasks/client && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/ingest /opt/src/pctasks/ingest
  RUN cd /opt/src/pctasks/ingest && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/dataset /opt/src/pctasks/dataset
  RUN cd /opt/src/pctasks/dataset && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY datasets/goes/goes-cmi/requirements.txt /opt/src/datasets/goes/goes-cmi/requirements.txt
  RUN python3 -m pip install -r /opt/src/datasets/goes/goes-cmi/requirements.txt
46 changes: 23 additions & 23 deletions datasets/goes/goes-cmi/dataset.yaml
@@ -1,9 +1,9 @@
  id: goes_cmi
- image: ${{ args.registry }}/pctasks-goes-cmi:2025.04.15.1
+ image: ${{ args.registry }}/pctasks-goes-cmi:2026.01.12

  args:
- - registry
- - year-prefix
+ - registry
+ - year-prefix

  code:
  src: ${{ local.path(./goes_cmi) }}
@@ -19,20 +19,20 @@ collections:
  asset_storage:
  # The blob storage pattern is
  # | noaa-goes16
- # | ABI-L2-MCMIPC
+ # | ABI-L2-MCMIPC
  # | 2023
  # | 170 # day of year
  # | 23 # hour
  # | OR_ABI-L2-MCMIPC-M6_G16_s20231702301183_e20231702303555_c20231702304136.nc
  - uri: blob://goeseuwest/noaa-goes16/
  chunks:
  splits:
- - prefix: ABI-L2-MCMIPC/${{ args.year-prefix }} # CONUS
- depth: 3 # daily
- - prefix: ABI-L2-MCMIPM/${{ args.year-prefix }} # Mesoscale
- depth: 3 # daily
- - prefix: ABI-L2-MCMIPF/${{ args.year-prefix }} # Full Disk
- depth: 3 # daily
+ - prefix: ABI-L2-MCMIPC/${{ args.year-prefix }} # CONUS
+ depth: 3 # daily
+ - prefix: ABI-L2-MCMIPM/${{ args.year-prefix }} # Mesoscale
+ depth: 3 # daily
+ - prefix: ABI-L2-MCMIPF/${{ args.year-prefix }} # Full Disk
+ depth: 3 # daily
  options:
  ends_with: .nc
  # # GOES-17 is parked
@@ -50,25 +50,25 @@ collections:
  - uri: blob://goeseuwest/noaa-goes18/
  chunks:
  splits:
- - prefix: ABI-L2-MCMIPC/${{ args.year-prefix }} # CONUS
- depth: 3 # daily
- - prefix: ABI-L2-MCMIPM/${{ args.year-prefix }} # Mesoscale
- depth: 3 # daily
- - prefix: ABI-L2-MCMIPF/${{ args.year-prefix }} # Full Disk
- depth: 3 # daily
+ - prefix: ABI-L2-MCMIPC/${{ args.year-prefix }} # CONUS
+ depth: 3 # daily
+ - prefix: ABI-L2-MCMIPM/${{ args.year-prefix }} # Mesoscale
+ depth: 3 # daily
+ - prefix: ABI-L2-MCMIPF/${{ args.year-prefix }} # Full Disk
+ depth: 3 # daily
  options:
  ends_with: .nc

  - uri: blob://goeseuwest/noaa-goes19/
  chunks:
  splits:
- - prefix: ABI-L2-MCMIPC/${{ args.year-prefix }} # CONUS
- depth: 3 # daily
- - prefix: ABI-L2-MCMIPM/${{ args.year-prefix }} # Mesoscale
- depth: 3 # daily
- - prefix: ABI-L2-MCMIPF/${{ args.year-prefix }} # Full Disk
- depth: 3 # daily
+ - prefix: ABI-L2-MCMIPC/${{ args.year-prefix }} # CONUS
+ depth: 3 # daily
+ - prefix: ABI-L2-MCMIPM/${{ args.year-prefix }} # Mesoscale
+ depth: 3 # daily
+ - prefix: ABI-L2-MCMIPF/${{ args.year-prefix }} # Full Disk
+ depth: 3 # daily
  options:
  ends_with: .nc
  chunk_storage:
- uri: blob://goeseuwest/noaa-goes-etl-data/pctasks/cmi/
+ uri: blob://goeseuwest/noaa-goes-etl-data/pctasks/cmi/
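The blob storage pattern documented in the `asset_storage` comment of this file (product, year, day of year, hour, file name) can be derived from the start-time field of the file name. The sketch below is illustrative only: `goes_blob_path` and `start_datetime` are hypothetical helpers, not code from this PR, and the layout of the `s...` field (year, day of year, hour, minute, second, tenths of a second) is inferred from the single example file name in the comment.

```python
import re
from datetime import datetime, timedelta

# Start-time field: "_s" + YYYY + DDD (day of year) + HH + MM + SS + one tenths digit.
# Assumption: this layout is read off the example file name in dataset.yaml.
_START = re.compile(r"_s(\d{4})(\d{3})(\d{2})(\d{2})(\d{2})\d_")


def _start_fields(filename: str) -> tuple:
    match = _START.search(filename)
    if match is None:
        raise ValueError(f"no start-time field in {filename!r}")
    return match.groups()


def goes_blob_path(product: str, filename: str) -> str:
    """Return product/year/day-of-year/hour/filename inside the container."""
    year, doy, hour, _minute, _second = _start_fields(filename)
    return f"{product}/{year}/{doy}/{hour}/{filename}"


def start_datetime(filename: str) -> datetime:
    """Convert the day-of-year start time into a calendar datetime."""
    year, doy, hour, minute, second = (int(v) for v in _start_fields(filename))
    return datetime(year, 1, 1) + timedelta(
        days=doy - 1, hours=hour, minutes=minute, seconds=second
    )


if __name__ == "__main__":
    name = "OR_ABI-L2-MCMIPC-M6_G16_s20231702301183_e20231702303555_c20231702304136.nc"
    print(goes_blob_path("ABI-L2-MCMIPC", name))
    print(start_datetime(name))
```

The product directory is passed in explicitly rather than parsed from the file name, since the diff only shows a CONUS example and the other products may be named differently.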
18 changes: 12 additions & 6 deletions datasets/goes/goes-glm/Dockerfile
@@ -43,27 +43,33 @@ RUN python -m pip install --no-build-isolation -r /tmp/requirements.txt

  COPY pctasks/core /opt/src/pctasks/core
  RUN cd /opt/src/pctasks/core && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/cli /opt/src/pctasks/cli
  RUN cd /opt/src/pctasks/cli && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/task /opt/src/pctasks/task
  RUN cd /opt/src/pctasks/task && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/client /opt/src/pctasks/client
  RUN cd /opt/src/pctasks/client && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/ingest /opt/src/pctasks/ingest
  RUN cd /opt/src/pctasks/ingest && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY pctasks/dataset /opt/src/pctasks/dataset
  RUN cd /opt/src/pctasks/dataset && \
- pip install .
+ pip install -r requirements.txt && \
+ pip install --no-deps .

  COPY datasets/goes/goes-glm /opt/src/datasets/goes-glm
  RUN python3 -m pip install -r /opt/src/datasets/goes-glm/requirements.txt
8 changes: 4 additions & 4 deletions datasets/goes/goes-glm/dataset.yaml
@@ -1,9 +1,9 @@
  id: goes_glm
- image: ${{ args.registry }}/pctasks-goes-glm:2025.04.16.0
+ image: ${{ args.registry }}/pctasks-goes-glm:2026.01.12

  args:
- - registry
- - year-prefix
+ - registry
+ - year-prefix

  code:
  src: ${{ local.path(./goes_glm.py) }}
@@ -60,4 +60,4 @@ collections:
  depth: 2

  chunk_storage:
- uri: blob://goeseuwest/noaa-goes-etl-data/pctasks/glm/
+ uri: blob://goeseuwest/noaa-goes-etl-data/pctasks/glm/
5 changes: 2 additions & 3 deletions datasets/hls2/dataset.yaml
@@ -1,5 +1,5 @@
  id: hls2
- image: ${{ args.registry }}/pctasks-task-base:2025.4.8.2
+ image: ${{ args.registry }}/pctasks-task-base:2026.01.12

  args:
  - registry
@@ -42,7 +42,6 @@ collections:
  chunk_length: 20000
  chunk_storage:
  uri: blob://hls2euwest/hls2-l30-info/pctasks-chunks/
-
  # The blob storage pattern is:
  # | container
  # | S30/ or L30/ depending on collection
@@ -57,4 +56,4 @@ collections:
  # | HLS.S30.T56PPQ.2024005T001421.v2.0.B01.tif
  # | HLS.S30.T56PPQ.2024005T001421.v2.0.B02.tif
  # | ...
- # | eg S30/56/P/PQ/2024/01/05/HLS.S30.T56PPQ.2024005T001421.v2.0/HLS.S30.T56PPQ.2024005T001421.v2.0.B01.tif
+ # | eg S30/56/P/PQ/2024/01/05/HLS.S30.T56PPQ.2024005T001421.v2.0/HLS.S30.T56PPQ.2024005T001421.v2.0.B01.tif
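The example path in the comment above suggests how an HLS asset's blob path relates to its file name. The following is a minimal sketch under that assumption: `hls_blob_path` is a hypothetical helper, not code from this PR, and the mapping (sensor, then the tile ID split into zone, latitude band and grid square, then calendar date, then item ID) is inferred from that one example, since part of the pattern comment is collapsed in the diff.

```python
from datetime import datetime


def hls_blob_path(filename: str) -> str:
    """Derive the blob path for a file named like
    HLS.S30.T56PPQ.2024005T001421.v2.0.B01.tif (layout inferred from the example).
    """
    parts = filename.split(".")
    sensor, tile, acquired = parts[1], parts[2], parts[3]
    # Tile "T56PPQ" -> zone "56", latitude band "P", grid square "PQ".
    zone, lat_band, square = tile[1:3], tile[3], tile[4:6]
    # "2024005T001421" starts with the year and day of year.
    date = datetime.strptime(acquired[:7], "%Y%j")
    # The item ID is the file name without the band and extension.
    item_id = ".".join(parts[:6])
    return f"{sensor}/{zone}/{lat_band}/{square}/{date:%Y/%m/%d}/{item_id}/{filename}"


if __name__ == "__main__":
    print(hls_blob_path("HLS.S30.T56PPQ.2024005T001421.v2.0.B01.tif"))
```

For the file name in the comment this reproduces the `eg` path shown there.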