Human intervention may soon be necessary to counter the decline of butterfly species populations. Automating butterfly identification and classification can support scientific research and biodiversity conservation workflows.
This project trained and analyzed two CNN models. The ResNet50 model (96.69% training accuracy, 91.77% validation accuracy; 0.92 precision, 0.91 recall, 0.91 F1-score, support of 1300) improved on the classification results of the custom SimpleButterflyCNN model (69.34% training accuracy, 72.38% validation accuracy).
Read the Project Report or view Project Presentation I (6-7 min) and/or II (10-15 min) for a comprehensive overview of the project.
- Python 3.13.1, Jupyter Notebooks
- NumPy
- Pandas
- Pillow (PIL)
- PyTorch and Torchvision
- TQDM
- Scikit-learn
- Matplotlib
- Seaborn
pip install numpy pandas pillow torch torchvision tqdm scikit-learn matplotlib seaborn
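As an optional check after installation, the sketch below reports which of the listed packages cannot be found in the current environment. The helper name `missing_packages` is hypothetical and not part of the project. Note that two import names differ from their pip names (`pillow` is imported as `PIL`, `scikit-learn` as `sklearn`).

```python
import importlib.util

# Import names for the packages listed above.
# pip name -> import name differs for: pillow -> PIL, scikit-learn -> sklearn.
REQUIRED_MODULES = [
    "numpy", "pandas", "PIL", "torch", "torchvision",
    "tqdm", "sklearn", "matplotlib", "seaborn",
]


def missing_packages(modules=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Running this in the same environment (or kernel) that will execute the notebook is the most useful place for it, since Jupyter kernels can point at a different Python installation than the shell.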
- Butterfly Image Classification Dataset (Version 2): https://www.kaggle.com/datasets/phucthaiv02/butterfly-image-classification/data
- Final_ML_Butterfly_Classification.ipynb: preprocesses the data, trains the CNN models, and runs the analysis
Expected file structure (after downloading and extracting the dataset):
project
|
|-- butterfly-image-classification.zip
|-- butterfly-image-classification/
|---- test/
|------ Image_1.jpg
|------ ...
|---- train/
|------ Image_1.jpg
|------ ...
|---- Testing_set.csv
|---- Training_set.csv
|-- Final_ML_Butterfly_Classification.ipynb
|
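Before running the notebook, a quick way to confirm this layout is to check for the expected paths. This is a minimal sketch, assuming it is run from the `project` directory; `EXPECTED_PATHS` and `missing_paths` are hypothetical names introduced here for illustration.

```python
from pathlib import Path

# Paths taken from the expected file structure shown above,
# relative to the project directory.
EXPECTED_PATHS = [
    "butterfly-image-classification/train",
    "butterfly-image-classification/test",
    "butterfly-image-classification/Training_set.csv",
    "butterfly-image-classification/Testing_set.csv",
    "Final_ML_Butterfly_Classification.ipynb",
]


def missing_paths(project_root="."):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    print("Layout looks complete." if not missing else f"Missing: {missing}")
```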
Open Final_ML_Butterfly_Classification.ipynb and select Run All.
Expected file structure (after running code):
project
|
|-- butterfly-image-classification.zip
|-- butterfly-image-classification/
|---- test/
|------ Image_1.jpg
|------ ...
|---- train/
|------ ADONIS/
|-------- Image_2.jpg
|-------- ...
|------ ...
|---- val/
|------ ADONIS/
|-------- Image_2.jpg
|-------- ...
|------ ...
|---- Testing_set.csv
|---- Training_set.csv
|-- Final_ML_Butterfly_Classification.ipynb
|-- butterfly_cnn.pth
|
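The notebook itself defines how the per-class `train/` and `val/` folders are created. The sketch below only illustrates one way such a layout could be produced, and it rests on assumptions that are not confirmed by this README: that `Training_set.csv` has `filename` and `label` columns, and that roughly 20% of each class is held out for validation. The function name `split_into_class_folders` and its parameters are hypothetical.

```python
import csv
import random
import shutil
from collections import defaultdict
from pathlib import Path


def split_into_class_folders(data_dir, val_fraction=0.2, seed=0):
    """Move flat train/ images into train/<LABEL>/ and val/<LABEL>/ folders.

    Assumes Training_set.csv has "filename" and "label" columns (unverified).
    Returns the number of images placed in each split.
    """
    data_dir = Path(data_dir)

    # Group image filenames by their class label.
    by_label = defaultdict(list)
    with open(data_dir / "Training_set.csv", newline="") as f:
        for row in csv.DictReader(f):
            by_label[row["label"]].append(row["filename"])

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {"train": 0, "val": 0}
    for label, filenames in sorted(by_label.items()):
        filenames = sorted(filenames)
        rng.shuffle(filenames)
        n_val = int(len(filenames) * val_fraction)  # assumed hold-out size
        for i, name in enumerate(filenames):
            split = "val" if i < n_val else "train"
            dest = data_dir / split / label
            dest.mkdir(parents=True, exist_ok=True)
            shutil.move(str(data_dir / "train" / name), str(dest / name))
            counts[split] += 1
    return counts
```

A call such as `split_into_class_folders("butterfly-image-classification")` would move files in place, so it is worth trying on a copy of the dataset first. A one-folder-per-class layout like this is the structure that `torchvision.datasets.ImageFolder` reads.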