Grade = 18.33/20
| Average | Median | Q75 |
|---|---|---|
| 14.8 | 15.05 | 16.57 |
Best project in the course
This paper gives an insight into how the taxes that residents of a newly found planet should pay can be predicted from their personal information. The biggest challenge of the whole process was handling the size of the dataset in a reasonable way. Models from sklearn such as Decision Trees, Random Forest, Gradient Boosting, and Neural Networks were used to build a solid predictive model. Gradient Boosting proved to be the best model, with 87% accuracy on the training set and 86.6% on the validation set. Different approaches were analyzed for each model to ensure that all options were explored, but the number of categorical values in the dataset may have been a limiting factor, both because of the time cost and because of the models' limited ability to handle them.
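As a rough illustration of the kind of setup described above, here is a minimal sketch of a Gradient Boosting classifier in sklearn that one-hot encodes a categorical column and reports train and validation accuracy. It is not the notebook's actual code: the data is synthetic, the column names (`age`, `hours_per_week`, `sector`) and the binary target are hypothetical, and a classification target is assumed only because the results are reported as accuracy.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data: column names and values are hypothetical.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=n),
    "hours_per_week": rng.integers(10, 60, size=n),
    "sector": rng.choice(["mining", "farming", "trade"], size=n),
})
# Hypothetical binary target derived from the features.
y = ((df["hours_per_week"] > 35) & (df["sector"] != "farming")).astype(int)

# Hold out 30% of the rows as a validation set.
X_train, X_val, y_train, y_val = train_test_split(
    df, y, test_size=0.3, random_state=0, stratify=y
)

# One-hot encode the categorical column; numeric columns pass through unchanged.
preprocess = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), ["sector"])],
    remainder="passthrough",
)
model = Pipeline([
    ("prep", preprocess),
    ("gb", GradientBoostingClassifier(random_state=0)),
])
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
val_acc = model.score(X_val, y_val)
print(f"train accuracy: {train_acc:.3f}, validation accuracy: {val_acc:.3f}")
```

One-hot encoding adds one column per category value, which is one way a large number of categorical values can drive up training time.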
To reproduce the project, you can find the dataset in the data folder and the project description in another folder. Be sure to also download the requirements.txt file. Set the data import paths in the notebook to match the locations on your device.
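One possible way to keep the paths in a single place is sketched below. It assumes the dependencies have already been installed (typically with `pip install -r requirements.txt`). The names `PROJECT_ROOT`, `DATA_DIR`, and `data_path`, as well as the file name `dataset.csv`, are placeholders for illustration and do not come from the notebook.

```python
from pathlib import Path

# Hypothetical layout: point PROJECT_ROOT at the folder where you saved the project.
PROJECT_ROOT = Path(".").resolve()
DATA_DIR = PROJECT_ROOT / "data"


def data_path(filename: str) -> Path:
    """Return the full path of a file inside the data folder."""
    return DATA_DIR / filename


# "dataset.csv" is a placeholder, not the actual dataset file name.
example = data_path("dataset.csv")
status = "found" if example.exists() else "not found, check DATA_DIR"
print(f"{example}: {status}")
```

With this in place, only `PROJECT_ROOT` needs to change when the project is moved to another device.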