Zachary Bowyer
Marie Hoeger
Frances Scott-Weis
Valentina Staneva
Nicoleta Cristea
Benjamin Bright
Capstone poster presentation: https://github.com/DSHydro/Insect_Forest_Infestation/blob/main/MSDS_Capstone_Poster_Final.pptx
This project was sponsored by the UW eScience Institute.
The main goal of this work was to explore the viability of
using Planet Satellite imagery to identify insect infestations in forests.
The purpose of this repository is to explore research avenues proposed by the University of Washington's eScience Institute; specifically, the work is aimed at assessing the viability of identifying insect infestations in forests using satellite imagery from Planet.
Insect infestations are a leading cause of tree mortality. The impacts of increasing tree mortality are widespread and can be extremely harmful for forests and the greater environment. Current identification methods involve aerial and manual surveys and are very time intensive. This restricts the frequency at which these surveys can occur, leading to many unidentified outbreaks.
Specifically, this repository contains code used to pull data from sources such as Google Earth Engine and Planet.
It also contains code for training a random forest model on landcover data, Planet imagery, and hand-labeled red trees.
- Get USFS polygons from 2018-2022 for Western Balsam Bark Beetle damage with a designation of 'moderate', 'severe', or 'very severe'.
- Within those USFS polygons, hand label red trees at the tree level using 2018-2023 Google Earth imagery.
- Using the polygon coordinates of the red tree labels, get Planet RGB basemaps of the areas.
- Using the polygon coordinates of the red tree labels, get a landcover dataset of the areas.
- Using the polygon coordinates of the red tree labels, get Planet RGB/NIR composites of the areas for 2019-2022.
- Calculate spectral indices of composite files (NDVI, GNDVI, RGI).
- Combine the basemaps, annotations, and landcover datasets, and make sure they are aligned.
- Train a random forest model on composite data.
- Train a CNN.
- Once the red tree model is sufficient, backtrack red tree areas to identify other stages of infestation, such as the gray/green shift.
- Move to a time series model instead of individual images.
- Improve modular experiment design
- Add test suite and continuous integration
- Automatic upload/download of data to/from shared Google drive or server
- Tool to create large area over-time visualization/heatmap of our trained model's results
- Shell script to automatically initialize project (install conda and download data from drive/planet/GEE)
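As an illustration of the spectral-index step in the pipeline above, here is a minimal sketch that computes NDVI, GNDVI, and RGI from a 4-band composite. The band ordering (R, G, B, NIR) and the function name are assumptions for this sketch, not the repository's actual code; check the Planet product documentation for the real band order.

```python
import numpy as np

def spectral_indices(composite):
    """Compute NDVI, GNDVI, and RGI from a composite array.

    composite: float array of shape (4, H, W), bands assumed
    ordered (R, G, B, NIR).
    """
    r, g, _, nir = composite.astype(float)
    with np.errstate(divide="ignore", invalid="ignore"):
        ndvi = (nir - r) / (nir + r)    # normalized difference vegetation index
        gndvi = (nir - g) / (nir + g)   # green-band variant of NDVI
        rgi = r / g                     # red-green index; elevated for red trees
    return ndvi, gndvi, rgi
```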
Reproducing this work from scratch will require a few steps. In order, these steps can be defined as:
- Clone repository into local folder
- Request Planet API key
- Create Google Earth Engine account
- Install anaconda
- Activate environment from environment.yml (Windows environment)
- Download data folders from google drive or DSHYDRO server
- Run code
This is a typical setup step; just make sure whatever storage medium you are cloning to has enough space to hold the datasets (~100 GB).
git clone https://github.com/DSHydro/Insect_Forest_Infestation.git (HTTPS)
git clone git@github.com:DSHydro/Insect_Forest_Infestation.git (SSH)
IMPORTANT - If you are planning to use non-commercial Planet data (free plans),
make sure you repeat this process multiple times so that you can get as many
API keys as possible, as the download limit for these keys is quite small.
For our work, we ended up with a total of six API keys, and at times had reduced
efficiency because we had hit the quota limit.
Apply to a data access program for Planet data. Planet data is essentially daily/weekly/monthly/quarterly satellite imagery. Access should take roughly one to two weeks. https://www.planet.com/get-started/ lists ways to get data access; however, our use case seems to apply only to the "Science and Education/Education and Research Program." There are three tiers of Education and Research plans; this section covers only the free plan.
Steps to apply for access to Planet data (free research plan):
- Navigate to https://www.planet.com/markets/education-and-research/
- Click the ‘apply now’ button on the top of the page
- Fill out the information in the ‘Apply for a Basic Account’ section
- Specific information entered in the input fields (we were not denied with these answers across 5+ keys):
A. Please provide a link to online content related to project (e.g. a past manuscript, project or web page), or if you don't have one, simply say "not available"
   - "not available"
B. (question text not captured here)
   - https://www.washington.edu/datasciencemasters/
C. Please provide a link to more on your background (e.g. researchgate profile, LinkedIn profile), or if you don't have one, simply say "not available"
   - https://www.linkedin.com/in/zachary-bowyer-834a80164/
D. (question text not captured here)
   - Graduate student
E. Describe the project you intend to investigate with Planet data. What questions do you hope to answer?
   - Identifying forest insect infestation with temporal geospatial data
F. (question text not captured here)
   - Research
G. Describe the geography you plan to investigate (you'll have download access to up to 5,000 square kilometers of data per month)
   - Forests/mountainous regions
Remaining fields: "Other", "Other"
Once you’ve applied for access, you should get an email from Planet in a few days asking you to activate your account.
Open that email and click the ‘unique profile link’.
From there, fill out the form information.
In our case, we said the imagery usage level was ‘Beginner’ and that we were using the data for ‘Scientific research at a University’.
After you activate your account, you will be able to log in to your Planet account.
Example Timeline:
Applied 10/21/2023 at 8:28PM
Planet activation email received 10/26/2023 at 4:22AM
- After you receive your API key, put it on the first line of Credentials/Credentials.txt. (This is where subsequent code will look for your API key for authentication.)
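Subsequent scripts read the key from that file; a minimal loader looks something like the sketch below (the function name is ours, not necessarily what the repository uses):

```python
from pathlib import Path

def load_planet_api_key(path="Credentials/Credentials.txt"):
    """Return the Planet API key stored on the first line of the credentials file."""
    return Path(path).read_text().splitlines()[0].strip()
```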
Go to https://signup.earthengine.google.com/
Use your university affiliated email
Go through the steps.
- Click register a noncommercial or commercial cloud project
- Click "unpaid usage"
- Select project type as "Academia and research"
- Click "Create a new Google cloud project"
- Select "uw.edu" as organization
- Select project-id as whatever you want
- Set project name as whatever you want
- Click "Continue to summary"
- Click "Confirm"
From here, everything should be good to go. One thing to note is that
you will have to locally authenticate your Google Earth Engine project
with your machine. This is done by calling ee.Authenticate(), which opens a
browser for you to log in with, so there's no need to store security information
locally in a file. You can test this with the first few cells of
https://github.com/DSHydro/Insect_Forest_Infestation/blob/main/Scratch_work/GoogleEarthEngineAPI.ipynb
https://www.anaconda.com/download
Latest version should be fine.
This is going to be the most difficult part of the setup, because the
environment file is hard coded to whatever version of Windows the author used.
Our recommendation: if you are on Windows, attempt the environment recreation and then manually fix issues as they pop up. If you are on Linux, we recommend you go through the list of packages and install them manually into your own environment. It may be worthwhile to make an 'environments' folder containing an environment.yml for each machine.
One common issue we ran into was missing DLLs. We ended up uninstalling packages and reinstalling earlier versions. This is not a well-fleshed-out answer, but we have only had to solve this problem once so far. We'd recommend that whoever runs into this problem next document their steps for fixing it here.
For this code to run, we're going to need three folders in the ../Data folder. The names of the folders are arbitrary, but we recommend they look something like:
- ../Data/Annotations
- ../Data/LabeledData
- ../Data/UnlabeledData
NOTE: Metadata files are made manually; try to automate this in the future using data from the USFS GeoJSON files.
The Annotations folder will hold KML annotation files and their associated metadata.
The LabeledData folder will contain composite and landcover .tif files; it also needs to contain the metadata files.
With that being said, head over to
https://drive.google.com/drive/folders/14xWJfO4k8uwXLkXJqv5xm_j9zA6zQExI
and download
"Final_Annotations", "LabeledData2", and "UnlabeledData1" as your three folders and put them in the ../Data folder
This data will be enough for you to pick up where we left off.
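Before running anything, it may help to verify that the data layout is in place. The helper below is a sketch of ours, not part of the repository; the default folder names are the recommended ones from this README.

```python
from pathlib import Path

def check_data_layout(data_dir="../Data",
                      expected=("Annotations", "LabeledData", "UnlabeledData")):
    """Return the list of expected subfolders missing from data_dir.

    Adjust `expected` if you kept the Google Drive folder names
    (e.g. Final_Annotations, LabeledData2, UnlabeledData1).
    """
    return [name for name in expected if not (Path(data_dir) / name).is_dir()]
```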
If you want to generate your own data, look into the code here:
https://github.com/DSHydro/Insect_Forest_Infestation/tree/main/Scratch_work/composite_scripts
https://github.com/DSHydro/Insect_Forest_Infestation/blob/main/Src/DownloadLandcoverFromTIF.py
https://github.com/DSHydro/Insect_Forest_Infestation/blob/main/Src/Modules/PlanetApiWrapper.py
If you intend to use other Planet products, you will probably need to modify existing code or create new functions for that.
You can now try running from the ../Src directory:
python Train_RF.py
python RunRF.py
If you can get these working then you will be able to reproduce the work.
Planet - https://www.planet.com/
RGB Basemaps:
We use planet.com as our source of satellite imagery/geospatial data.
Planet’s Visual Basemaps are 8-bit, time series mosaic
products which are optimized for visual consistency and
minimize the effects of clouds, haze, and other image
variability. They are ideal for use in visual backdrops
or machine learning to enable an understanding of change over time.
PlanetScope Visual Basemaps (Zoom Level 15 - 4.77 meter, Zoom Level 16 - 2.38
meter cell size at the equator) are generated with a proprietary "best scene on top"
algorithm which selects the highest quality imagery from Planet’s catalog over
specified time intervals, based on cloud cover and image sharpness. PlanetScope
Visual Basemaps can be purchased over custom areas of interest at a quarterly,
monthly, biweekly, or weekly cadence.
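The quoted cell sizes follow from the standard Web Mercator ground-resolution formula (equatorial circumference divided by 256 * 2^zoom pixels). As a quick sanity check (the function name is ours):

```python
import math

def ground_resolution_m(zoom, lat_deg=0.0):
    """Web Mercator ground resolution in meters per pixel at a zoom level.

    At the equator, zoom 15 gives ~4.77 m and zoom 16 gives ~2.39 m,
    matching the basemap cell sizes quoted above.
    """
    return 156543.03392 * math.cos(math.radians(lat_deg)) / (2 ** zoom)
```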
Composites:
Composites are the 4-band (RGB + near-infrared) Planet products we pull over our AOIs for 2019-2022; the NIR band is what enables the vegetation indices (NDVI, GNDVI) described above.
USFS - https://www.fs.usda.gov/detail/r6/forest-grasslandhealth/insects-diseases/?cid=stelprd3791643
The United States Department of Agriculture (USDA) is directed by Congress
to annually report forest conditions for the United States through the
federal lands forest health protection restoration act.
The United States Forest Service (USFS), which is an
organization within the United States Department of Agriculture,
conducts annual aerial surveys to map out forest health.
These surveys record estimates of insect damage, and we
utilize this data to find areas of interest (AOIs).
GEE Landcover - https://developers.google.com/earth-engine/datasets/catalog/USFS_GTAC_LCMS_v2022-8#description
An important aspect of this work is being able to exclude
non-forest pixels in model training and analysis.
Image segmentation, pixel clustering, and landcover datasets
are all ways to do this; for now we are using a landcover dataset.
We chose this specific landcover dataset because its date
range falls within the same date range as our satellite imagery (2018-present).
This data has a resolution of 30 meters per pixel.
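As a sketch of how a landcover raster can be used to exclude non-forest pixels, assuming the imagery has already been aligned and resampled to the landcover grid (the forest class code below is a placeholder; check the LCMS legend for the real value):

```python
import numpy as np

def mask_non_forest(bands, landcover, forest_value=1):
    """Set non-forest pixels to NaN before training/analysis.

    bands: float array of shape (n_bands, H, W), e.g. a Planet composite
    landcover: int array of shape (H, W) from the landcover dataset
    forest_value: class code for forest pixels (hypothetical default)
    """
    masked = bands.astype(float).copy()
    masked[:, landcover != forest_value] = np.nan
    return masked
```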
Our ground truth data. Since we cannot make field observations
and do not have access to very high resolution satellite imagery,
our process was to use the USFS annual aerial surveys to find general
insect infestation AOIs, then hand label red trees on Google Earth's
high-resolution aerial imagery.
Our hand labels are polygons in the CRS84 coordinate reference system.
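KML is plain XML, so the annotation polygons can be read without GIS libraries. As an illustration only (this is not the repository's actual loader):

```python
import xml.etree.ElementTree as ET

KML_NS = "{http://www.opengis.net/kml/2.2}"

def read_kml_polygons(path):
    """Return a list of polygons from a KML annotation file,
    each polygon a list of (lon, lat) tuples in CRS84 order."""
    tree = ET.parse(path)
    polygons = []
    for coords in tree.getroot().iter(f"{KML_NS}coordinates"):
        ring = []
        # KML coordinates are whitespace-separated "lon,lat[,alt]" tokens
        for token in coords.text.split():
            lon, lat, *_ = map(float, token.split(","))
            ring.append((lon, lat))
        polygons.append(ring)
    return polygons
```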
Further documentation on specific files exists as a .md file in each main directory
./Archive/ - Place to put code/data that you don't need but don't want to explicitly delete
./Credentials/ - Contains a .txt file with one line: a Planet API key
./Data/ - Contains all the data used by the project
./Images/ - Images to be used in README.md
./Models/ - Saved model weights
./Scratch_work/ - Contains random work, proof of concepts, etc.
./Src/ - Contains polished work
Environment file: https://github.com/DSHydro/Insect_Forest_Infestation/blob/main/environment.yml (Windows)
Create environment from yml file and activate it
conda env create -f environment.yml
conda activate ForestInfestation
python -m ipykernel install --user --name=ForestInfestation
Create conda environment from scratch
conda create --name=ForestInfestation python=3.9.5
conda env export > environment.yml
Delete current conda environment
conda remove --name ForestInfestation --all




