ravi-kumar-chinta/Customer-Churn-Prediction

Customer Churn Prediction using Machine Learning

Python Scikit-learn NumPy Pandas XGBoost Matplotlib Seaborn Imbalanced-learn


🚀 Introduction

This project builds a machine learning model that predicts whether a customer will churn, based on demographic, account, and service usage data.

Project Highlights:

  • Predicts customer churn using Random Forest.
  • Handles class imbalance using SMOTE.
  • Saves trained model, encoders, and scaler for making predictions on new customers.
  • Tools: Python, Scikit-learn, XGBoost, NumPy, Pandas, Imbalanced-learn, Matplotlib, Seaborn.

📊 Dataset

  • Rows: 7,000+ (telecom customer dataset)
  • Features: Demographics, account info, service subscriptions, charges, tenure, contract type, payment method, etc.
  • Label: Churn (1 = Churn, 0 = Not Churn)
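The label encoding and the class imbalance it exposes can be illustrated with a small sketch. The rows below are made up for illustration, and the assumption that the raw column holds "Yes"/"No" strings is not confirmed by this README.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the real dataset.
df = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 8],
    "Churn": ["Yes", "No", "Yes", "No", "No"],  # assumed raw format
})

# Map the label to the convention above: 1 = Churn, 0 = Not Churn.
df["Churn"] = df["Churn"].map({"Yes": 1, "No": 0})

# The mean of a 0/1 label is the churn rate; a value far from 0.5
# signals class imbalance.
churn_rate = df["Churn"].mean()
print(df["Churn"].value_counts())
print(f"Churn rate: {churn_rate:.0%}")
```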

⚙️ Installation & Setup

  1. Clone the repository:
git clone https://github.com/ravi-kumar-chinta/Customer-Churn-Prediction.git
  2. Install required libraries:
pip install numpy pandas scikit-learn imbalanced-learn xgboost matplotlib seaborn
  3. Run the notebook:
  • Open Customer_Churn_Prediction.ipynb in Jupyter Notebook or Google Colab and run all cells.
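Before running the notebook, a quick check like the one below can confirm that the libraries are importable. This is a minimal sketch: the helper name `missing_packages` is hypothetical, and the module list assumes the import names `sklearn`, `imblearn`, and `xgboost` for the libraries this README mentions.

```python
from importlib.util import find_spec

# Import names (not pip names) of the libraries mentioned in this README.
REQUIRED = ["numpy", "pandas", "sklearn", "imblearn", "xgboost",
            "matplotlib", "seaborn"]


def missing_packages(modules):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing:", missing or "none")
```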

🛠️ Project Workflow

  1. Data Loading: Load the dataset into a pandas DataFrame.

  2. Data Preprocessing: Clean data, handle missing values, encode categorical variables, scale numeric features.

  3. Exploratory Data Analysis: Visualize distributions and correlations.

  4. Class Imbalance Handling: Use SMOTE to balance the dataset.

  5. Train-Test Split: Split data into training and testing sets.

  6. Model Training: Train Random Forest and XGBoost models with hyperparameter tuning.

  7. Evaluation: Evaluate using accuracy, ROC-AUC, confusion matrix, and classification report.

  8. Prediction Function: Predict churn for new customers using the saved model.


📈 Results

Metric               Score
Training Accuracy    ~85%
Test Accuracy        ~82%
ROC-AUC Score        ~0.85
  • Test accuracy is within about 3 percentage points of training accuracy, which suggests the model generalizes without heavy overfitting.
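For readers unfamiliar with these metrics, the sketch below shows how they can be computed with scikit-learn. The labels and probabilities are toy values chosen for illustration; they are not outputs of this project's model.

```python
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)

# Toy ground truth and predicted churn probabilities (illustrative only).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.10, 0.40, 0.35, 0.80, 0.20, 0.60, 0.70, 0.90]

# Accuracy and the confusion matrix need hard labels; 0.5 is an assumed cutoff.
y_pred = [int(p >= 0.5) for p in y_prob]

accuracy = accuracy_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_prob)   # uses probabilities, not hard labels
cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted

print(f"Accuracy: {accuracy:.2f}, ROC-AUC: {auc:.4f}")
print(cm)
print(classification_report(y_true, y_pred,
                            target_names=["Not Churn", "Churn"]))
```

ROC-AUC is computed from the ranking of probabilities, so it does not depend on the 0.5 cutoff, whereas accuracy and the confusion matrix do.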

📋 Example Input & Prediction

Input:

example_input = {
    'gender': 'Male',
    'SeniorCitizen': 1,
    'Partner': 'No',
    'Dependents': 'No',
    'tenure': 1,
    'PhoneService': 'Yes',
    'MultipleLines': 'Yes',
    'InternetService': 'Fiber optic',
    'OnlineSecurity': 'No',
    'OnlineBackup': 'No',
    'DeviceProtection': 'No',
    'TechSupport': 'No',
    'StreamingTV': 'Yes',
    'StreamingMovies': 'Yes',
    'Contract': 'Month-to-month',
    'PaperlessBilling': 'Yes',
    'PaymentMethod': 'Electronic check',
    'MonthlyCharges': 99.65,
    'TotalCharges': 99.65
}

Prediction:

prediction, prob = make_prediction(example_input)
print(f"Prediction: {prediction}, Probability: {prob:.2f}")

⚡ Prediction Output

  • Prediction: Churn, Probability: 0.97
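This README does not show the body of `make_prediction`, so the following is a hypothetical sketch of how such a function could reuse a saved model, encoders, and scaler. It assumes `LabelEncoder` for categorical columns, `StandardScaler` for numeric ones, a `pickle` round trip in place of files on disk, and a made-up training frame with only three of the features.

```python
import pickle

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Made-up training rows using three feature names from the example input.
train = pd.DataFrame({
    "Contract": ["Month-to-month", "Two year", "Month-to-month",
                 "One year", "Two year", "Month-to-month"],
    "tenure": [1, 60, 3, 24, 72, 2],
    "MonthlyCharges": [99.65, 20.0, 85.5, 55.0, 25.0, 90.0],
    "Churn": [1, 0, 1, 0, 0, 1],
})
CATEGORICAL = ["Contract"]
NUMERIC = ["tenure", "MonthlyCharges"]


def preprocess(df, encoders, scaler):
    """Apply the fitted encoders and scaler; return the model's features."""
    encoded = pd.DataFrame(
        {col: encoders[col].transform(df[col]) for col in CATEGORICAL},
        index=df.index)
    scaled = pd.DataFrame(scaler.transform(df[NUMERIC]),
                          columns=NUMERIC, index=df.index)
    return pd.concat([encoded, scaled], axis=1)


# Fit the preprocessing objects and the model once, at training time.
encoders = {col: LabelEncoder().fit(train[col]) for col in CATEGORICAL}
scaler = StandardScaler().fit(train[NUMERIC])
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(preprocess(train, encoders, scaler), train["Churn"])

# Round-trip through pickle to mimic saving and reloading the artifacts.
saved = pickle.dumps({"model": model, "encoders": encoders, "scaler": scaler})
artifacts = pickle.loads(saved)


def make_prediction(customer):
    """Return (label, churn probability) for one customer dict."""
    row = pd.DataFrame([customer])
    features = preprocess(row, artifacts["encoders"], artifacts["scaler"])
    prob = float(artifacts["model"].predict_proba(features)[0, 1])
    label = "Churn" if prob >= 0.5 else "Not Churn"
    return label, prob


prediction, prob = make_prediction(
    {"Contract": "Month-to-month", "tenure": 1, "MonthlyCharges": 99.65})
print(f"Prediction: {prediction}, Probability: {prob:.2f}")
```

Reusing the fitted encoders and scaler, rather than refitting them on the new row, keeps new inputs on the same scale and category codes the model saw during training. Note that `LabelEncoder.transform` raises a `ValueError` for a category it did not see when fitted.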

📖 Try it on Colab

You can explore and run this project in Google Colab:
Open Customer Churn Prediction Notebook


✅ Conclusion

  • Built and trained a Random Forest model to predict customer churn.
  • Achieved ~85% training accuracy and ~82% test accuracy, with a ROC-AUC score of ~0.85.
  • Demonstrates a complete machine learning workflow: preprocessing, model training, evaluation, and prediction.
  • Helps businesses proactively identify at-risk customers and take retention actions.

About

Customer Churn Prediction using Random Forest and XGBoost to proactively identify at-risk customers and improve retention.
