This project delves into the challenging regression task of predicting crude protein weight in chicken carcasses using a variety of physical measurements. The core of this work is a rigorous comparative analysis of six distinct Neural and Evolutionary Learning (NEL) models. By applying a robust 10x10 Nested Cross-Validation methodology, we evaluate the behavior, performance, and applicability of each algorithm on a small but complex dataset, aiming to identify the most effective predictive models.
The primary objectives of this project are to:
- Implement and compare six different neural and evolutionary algorithms for a regression task.
- Develop a robust evaluation pipeline using Nested Cross-Validation to ensure unbiased performance estimation and fair model comparison.
- Perform hyperparameter tuning for each model to identify its optimal configuration.
- Analyze and interpret the results to determine the most suitable models for predicting protein content, considering both accuracy and stability.
This project was developed for the Neural and Evolutionary Learning (NEL) course as part of the Master's in Data Science and Advanced Analytics program at NOVA IMS. The work was completed during the 2nd Semester of the 2024-2025 academic year.
Dataset Note: The `sustavianfeed.xlsx` dataset used for this project is private and cannot be distributed. All analyses and conclusions are presented within the project notebooks and the final report.
This project was implemented entirely in Python, leveraging a powerful stack of libraries for evolutionary computation, deep learning, and statistical analysis.
The project involved a structured approach to model development, hyperparameter tuning, and comparative evaluation.
Figure 1: Median and Interquartile Range (IQR) of Learning vs Test RMSE Across All Outer Folds for the six models.
The following six models were implemented and compared:
- Genetic Programming (GP): Evolves tree-based symbolic expressions.
  - Implementation: `slim_gsgp` library.
  - Tuned Hyperparameters: `max_depth`, `p_xo`, `prob_const`, `tournament_size`.
- Geometric Semantic Genetic Programming (GSGP): A variant of GP using geometric semantic operators.
  - Implementation: `slim_gsgp` library.
  - Tuned Hyperparameters: `init_depth`, `p_xo`, `prob_const`, `tournament_size`.
- Semantic Learning algorithm with Inflate and deflate Mutations (SLIM): Another semantic GP variant focusing on specific mutation operators.
  - Implementation: `slim_gsgp` library.
  - Tuned Hyperparameters: `tournament_size`, `slim_versions`, `copy_parent`.
- Neural Network (NN): A standard feedforward neural network trained with backpropagation.
  - Implementation: PyTorch.
  - Tuned Hyperparameters: `hidden_layers`, `nodes_per_layer`, `learning_rate`, `optimizer`.
- NeuroEvolution of Augmenting Topologies (NEAT): Evolves both network topology and weights from minimal structures.
  - Implementation: `neat-python` library.
  - Tuned Hyperparameters: `compatibility_threshold`, `node_add_prob`, `weight_mutate_rate`.
- Neural Network optimized with a Genetic Algorithm (NN&GA) (Optional Exercise): A hybrid model in which a GA optimizes the weights of a fixed-architecture PyTorch NN.
  - Implementation: Custom GA with a PyTorch NN.
  - Focus: Implementing crossover and mutation operators for NN weights.

Note: The NN&GA model was included as an extra exercise and was not fully tuned due to computational constraints.
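To illustrate the NN&GA idea, the sketch below evolves the flattened weight vector of a tiny fixed-architecture network with arithmetic crossover and Gaussian mutation. It is a minimal NumPy stand-in, not the project's actual PyTorch implementation; the architecture, data, and GA settings are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data standing in for the (private) carcass dataset: 96 samples, 5 features.
X = rng.normal(size=(96, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=96)

# Fixed architecture 5 -> 8 -> 1; all weights live in one flat vector.
SHAPES = [(5, 8), (8,), (8, 1), (1,)]
DIM = sum(int(np.prod(s)) for s in SHAPES)

def unflatten(vec):
    params, i = [], 0
    for s in SHAPES:
        n = int(np.prod(s))
        params.append(vec[i:i + n].reshape(s))
        i += n
    return params

def predict(vec, X):
    W1, b1, W2, b2 = unflatten(vec)
    h = np.tanh(X @ W1 + b1)       # hidden layer
    return (h @ W2 + b2).ravel()   # linear output

def rmse(vec):
    return float(np.sqrt(np.mean((predict(vec, X) - y) ** 2)))

pop = rng.normal(scale=0.5, size=(40, DIM))       # initial population
init_best = min(rmse(ind) for ind in pop)

for gen in range(60):
    fit = np.array([rmse(ind) for ind in pop])
    elites = pop[np.argsort(fit)[:10]]            # elitist truncation selection
    children = []
    while len(children) < len(pop) - len(elites):
        p1, p2 = elites[rng.integers(10, size=2)]
        alpha = rng.random()
        child = alpha * p1 + (1 - alpha) * p2     # arithmetic crossover
        mask = rng.random(DIM) < 0.1              # per-gene Gaussian mutation
        child[mask] += rng.normal(scale=0.1, size=mask.sum())
        children.append(child)
    pop = np.vstack([elites, children])

best_rmse = min(rmse(ind) for ind in pop)
print(init_best, "->", best_rmse)
```

Because the elites survive unchanged each generation, the best fitness can only improve; the crossover and mutation operators shown here are the part the project implemented by hand for NN weights.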
- Nested Cross-Validation (NCV): A 10x10 NCV scheme was used for robust hyperparameter tuning (inner loop) and unbiased generalization performance estimation (outer loop). The same random seed and data splits were used across all models for fair comparison.
- Performance Metric: Root Mean Squared Error (RMSE) was the primary metric, penalizing larger errors more heavily. Median RMSE from inner validation sets guided hyperparameter selection.
- Statistical Analysis: A non-parametric Friedman test followed by the Nemenyi post-hoc test was conducted on the outer fold test RMSE values to determine statistically significant performance differences between models.
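The nested cross-validation scheme above can be sketched with scikit-learn, using a stand-in regressor since the project's GP and NEAT models expose different APIs; the estimator, grid, and synthetic data here are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the 96-sample carcass dataset.
X, y = make_regression(n_samples=96, n_features=5, noise=5.0, random_state=0)

inner = KFold(n_splits=10, shuffle=True, random_state=0)  # hyperparameter tuning
outer = KFold(n_splits=10, shuffle=True, random_state=0)  # generalization estimate

# Inner loop: grid search picks hyperparameters on each outer training fold.
tuner = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={"n_neighbors": [3, 5, 7]},
    scoring="neg_root_mean_squared_error",
    cv=inner,
)

# Outer loop: test RMSE of the tuned model on each held-out fold.
scores = cross_val_score(tuner, X, y, cv=outer,
                         scoring="neg_root_mean_squared_error")
rmse_per_fold = -scores
median_rmse = float(np.median(rmse_per_fold))
print(rmse_per_fold.round(2), median_rmse)
```

Fixing `random_state` on both splitters mirrors the project's use of identical seeds and splits across all six models, so per-fold RMSEs stay directly comparable.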
The project was developed iteratively through weekly deliverables, each focusing on one of the core algorithms:
- GP: Implemented and tuned; analyzed bloat, overfitting, and premature convergence.
- GSGP: Implemented and tuned; analyzed similar characteristics.
- SLIM: Implemented and tuned; analyzed similar characteristics.
- NN: Implemented and tuned; focused on overfitting and premature convergence.
- NEAT: Implemented and tuned; focused on overfitting and premature convergence.
- NN&GA (Extra): Implemented as part of the final submission.
Figure 2: Learning and Test RMSE Across Models (10-Fold Cross-Validation).
Note: Y-axis limited for readability.
Table 1: Median and Interquartile Range (IQR) of Test RMSE for All Models Across 10 Folds.
- Best Performing Models: Genetic Programming (GP) and NEAT demonstrated superior and stable predictive performance.
- GP achieved the lowest median test RMSE (4.545).
- NEAT followed closely with a median test RMSE of 5.690 and a notably small test IQR, indicating stable performance.
- Neural Networks (NN): Proved competitive with a median test RMSE of 7.211.
- Semantic GP Variants (GSGP & SLIM): Showed higher test RMSEs and some underfitting/premature convergence.
- GSGP median test RMSE: ~10.517
- SLIM median test RMSE: ~10.618
- NN&GA: Exhibited the highest median test RMSE (19.967) and clear overfitting, likely due to limited hyperparameter tuning (due to computational cost) and the complexities of optimizing NN weights with a GA on this dataset.
- Statistical Significance: The Friedman test (p-value ≈ 0.00) confirmed significant performance differences. The Nemenyi post-hoc test revealed that GP and NEAT significantly outperformed NN&GA, and SLIM performed significantly worse than GP and NEAT. No significant differences were found between GP, NEAT, and NN, or between GSGP and SLIM.
- Challenges: The small dataset size (96 samples) was a primary challenge, amplifying performance variability and making hyperparameter tuning very sensitive. Computational intensity, especially for NEAT and NN&GA, constrained the extent of hyperparameter exploration. Evolutionary algorithms often faced premature convergence or bloat.
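The Friedman step of this statistical analysis can be reproduced with SciPy; the per-fold RMSE values below are synthetic stand-ins, not the project's actual results.

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(1)

# Illustrative outer-fold test RMSEs for three models over the same 10 folds.
gp   = rng.normal(4.5, 0.5, size=10)
neat = rng.normal(5.7, 0.4, size=10)
nnga = rng.normal(20.0, 3.0, size=10)

# Friedman test: non-parametric, ranks the models within each matched fold.
stat, p = friedmanchisquare(gp, neat, nnga)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
# If p is below the significance level, a Nemenyi post-hoc test
# (e.g. scikit-posthocs' posthoc_nemenyi_friedman) identifies which
# pairs of models actually differ.
```

Using the same folds for every model is what makes the within-fold ranking of the Friedman test valid here.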
- Weekly Notebooks: Submitted throughout the course, detailing the implementation and tuning of each algorithm.
- Final Report: A comprehensive report (max 4 pages + references) focusing on results, discussion, and model comparison (the provided PDF serves as this report).
- Final Code: Aggregation of all weekly codes and final analysis scripts.
This project provided a thorough comparative analysis of six Neural and Evolutionary Learning models for predicting crude protein content. GP and NEAT emerged as the most robust and accurate predictors for this specific regression task and dataset. The study highlighted the critical role of robust evaluation methodologies like NCV, especially with small datasets, and underscored the unique challenges and strengths of different NEL paradigms. Future work could involve applying these models to larger datasets, exploring more extensive hyperparameter tuning, and investigating different architectural or evolutionary strategies to further enhance performance and generalization.
For detailed implementation and results, please refer to the project notebooks and the final report. Good luck with your own NEL projects!
[1] Vanneschi, L., & Silva, S. (2023). Lectures on Intelligent Systems. Springer Nature.
[2] Vanneschi, L. (2024). SLIM_GSGP: The Non-bloating Geometric Semantic Genetic Programming. Lecture Notes in Computer Science, 125–141. https://doi.org/10.1007/978-3-031-56957-9_8
[3] Stanley, K. O., & Miikkulainen, R. (2002). Evolving Neural Networks through Augmenting Topologies. Evolutionary Computation, 10(2), 99–127. https://doi.org/10.1162/106365602320169811
[4] Rainio, O., Teuho, J., & Klén, R. (2024). Evaluation metrics and statistical tests for machine learning. Scientific Reports, 14(1), 1–14. Nature. https://doi.org/10.1038/s41598-024-56706-x
- André Silvestre (20240502)
- Filipa Pereira (20240509)
- Umeima Mahomed (20240543)


