This project is my attempt at solving the Kaggle Home Credit Default Risk challenge from the Home Credit Group (https://www.kaggle.com/c/home-credit-default-risk/data).
Disclaimer: The data used is not owned by me; it is owned by the Home Credit Group (https://www.kaggle.com/c/home-credit-default-risk/data). In addition, the code and files in /Willkoersen_Guided_Notebooks do not belong to me, but to Will Koehrsen. I used his code for learning, reference, and guidance when writing my own. His work and other related notebooks can be found here (https://www.kaggle.com/willkoehrsen/start-here-a-gentle-introduction).
Overview of My Code:
- credit_risk.ipynb: This is the master notebook for this data science project. It covers EDA, Feature Creation/Selection, Data Cleaning & Preprocessing, ML Training & Testing, Cross-Validation, and Feature Importance. This is the primary notebook to follow.
- utilities.py: This file contains utility functions that I use to reduce the amount of code in credit_risk.ipynb. Some functions are borrowed or reworked from Will Koehrsen's guides; others I wrote on my own.
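
To give a feel for the kind of helper that can be moved out of a notebook and into a module like utilities.py, here is a minimal sketch of a missing-value summary. The function name `missing_values_table` and the toy values are assumptions made for illustration; this is not necessarily a function that exists in utilities.py.

```python
import pandas as pd


def missing_values_table(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize columns that contain missing values.

    Hypothetical helper for illustration; not taken from utilities.py.
    """
    missing_count = df.isnull().sum()
    missing_pct = 100 * missing_count / len(df)
    table = pd.DataFrame(
        {"missing_count": missing_count, "missing_pct": missing_pct}
    )
    # Keep only columns with at least one missing value, worst first.
    table = table[table["missing_count"] > 0]
    return table.sort_values("missing_pct", ascending=False)


# Toy frame: the values are made up and only stand in for the real data.
toy = pd.DataFrame(
    {
        "SK_ID_CURR": [1, 2, 3, 4],
        "AMT_INCOME_TOTAL": [100.0, None, 250.0, 300.0],
        "OWN_CAR_AGE": [None, None, None, 5.0],
    }
)
summary = missing_values_table(toy)
print(summary)
```

Keeping helpers like this in a separate module lets the notebook call one line (`missing_values_table(df)`) instead of repeating the same cell for every table.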
Note*: This project contains a number of large files. It is a good idea to install and enable Git LFS before cloning, pulling, or pushing.
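
If Git LFS was not active when the repository was cloned, large files are checked out as small text pointer stubs instead of the real data. The sketch below is one way to spot that situation; it assumes the standard Git LFS pointer format, whose first line starts with `version https://git-lfs.github.com/spec/v1`. The function names are hypothetical and are not part of this project.

```python
from pathlib import Path

# First bytes of a Git LFS pointer file (standard pointer format).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path) -> bool:
    """Return True if the file is still a Git LFS pointer stub."""
    with open(path, "rb") as fh:
        return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_unfetched_files(root="."):
    """List files under root (relative paths) that are LFS pointer stubs."""
    root_path = Path(root)
    unfetched = []
    for p in root_path.rglob("*"):
        rel = p.relative_to(root_path)
        # Skip git's own metadata directory and anything that is not a file.
        if ".git" in rel.parts or not p.is_file():
            continue
        if is_lfs_pointer(p):
            unfetched.append(str(rel))
    return sorted(unfetched)
```

Calling `find_unfetched_files(".")` from the repository root returns the files that still need to be downloaded; running `git lfs install` followed by `git lfs pull` should replace them with the real content.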