Third project for the Udacity Machine Learning Engineer Nanodegree Program.
This project predicts whether an electrocardiogram (ECG) signal shows signs of disease. To do so, it compares the predictions of two algorithms and checks whether a healthy or unhealthy signal can be detected from its measurements alone. It also shows how to use AWS SageMaker features for this task. The repository contains a Python notebook that covers feature engineering, training, testing, and validation, and that shows how to tune XGBoost hyperparameters. It also includes a study that uses a random forest to select the top 5 most relevant features for both algorithms. XGBoost's performance is compared with that of the CatBoost Classifier.
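As a rough illustration of the top 5 feature study, the sketch below ranks features by random forest importance and keeps the best k. It is not taken from the notebook: the function names, the `n_estimators=200` setting, and the seed are assumptions made for this example, and the notebook may select features differently.

```python
from typing import List, Sequence


def top_k_features(names: Sequence[str], importances: Sequence[float], k: int = 5) -> List[str]:
    """Return the k feature names with the largest importance scores."""
    if len(names) != len(importances):
        raise ValueError("names and importances must have the same length")
    # Sort (name, score) pairs from most to least important.
    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def rank_with_random_forest(X, y, names: Sequence[str], k: int = 5, seed: int = 0) -> List[str]:
    """Fit a random forest and return its k most important features.

    Hypothetical helper: hyperparameters here are illustrative, not the notebook's.
    scikit-learn is imported lazily so top_k_features works without it.
    """
    from sklearn.ensemble import RandomForestClassifier

    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    # feature_importances_ holds one impurity-based score per input column.
    return top_k_features(names, forest.feature_importances_, k)
```

With a feature matrix `X`, labels `y`, and the matching column names, `rank_with_random_forest(X, y, names)` would return five column names, which can then be used to train both classifiers on the reduced feature set.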
The whole process, from getting the data file and transforming it to training, testing, creating the endpoint, and using it, is explained in the notebook file ECGClassifier.ipynb.
The notebook also shows two ways to train, test, and deploy a model: XGBoost uses the boto3 library, and the CatBoost Classifier uses the scikit-learn framework.
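For the boto3 route, querying a deployed endpoint might look like the minimal sketch below. It assumes the endpoint accepts `text/csv` input and returns one probability per row, as the SageMaker built-in XGBoost algorithm does with a binary logistic objective. The helper names, the 0.5 threshold, and the endpoint name are hypothetical and not taken from the notebook.

```python
from typing import Iterable, List


def to_csv_payload(rows: Iterable[Iterable[float]]) -> str:
    """Serialize feature rows as CSV, one row per line, without a header."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def parse_scores(body: str) -> List[float]:
    """Parse scores that may be separated by commas or newlines."""
    tokens = body.replace("\n", ",").split(",")
    return [float(token) for token in tokens if token.strip()]


def predict(endpoint_name: str, rows: Iterable[Iterable[float]], threshold: float = 0.5) -> List[int]:
    """Send rows to a SageMaker endpoint and return 0/1 labels.

    Assumes the model returns probabilities; the threshold is illustrative.
    boto3 is imported lazily so the helpers above work without AWS access.
    """
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=to_csv_payload(rows),
    )
    scores = parse_scores(response["Body"].read().decode("utf-8"))
    return [int(score >= threshold) for score in scores]
```

A call such as `predict("my-ecg-endpoint", test_rows)` (with a placeholder endpoint name) needs valid AWS credentials and a running endpoint. Remember to delete the endpoint when it is no longer needed, since it is billed while it runs.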