A comprehensive machine learning system for predicting heart disease risk using patient data. This project provides multiple interfaces (web, console, and executable) with advanced visualization and explanation capabilities.

    pip install numpy==1.24.3 pandas==2.0.3 scikit-learn==1.3.0 matplotlib==3.7.2 seaborn==0.12.2 joblib==1.3.1 gradio==4.19.2 pyinstaller==6.5.0

- Random Forest Classifier with hyperparameter optimization
- Feature Importance Analysis to identify key risk factors
- Personalized Risk Assessment with detailed explanations
- Real-time Probability Scoring (0-100% risk scale)
- Medical Context Integration with BMI calculations
- Interactive Risk Gauge with color-coded severity levels
- Feature Impact Charts showing positive/negative contributions
- Category-based Risk Analysis (Demographics, Symptoms, Vital Signs, etc.)
- Personalized Recommendations based on individual risk factors
- Professional Medical-style Reports with clean white backgrounds
- Modern, responsive web UI accessible via browser
- Real-time predictions with interactive controls
- Enhanced visualization dashboard
- Mobile-friendly design
- Automatic BMI calculation
- Command-line interface for quick predictions
- Step-by-step input guidance
- Detailed result explanations
- Perfect for automation and scripting
- PyInstaller-built Windows executable
- No Python installation required
- Portable distribution
- Complete self-contained application
- Age (20-100 years)
- Gender (Male/Female)
- Weight and Height (with automatic BMI calculation)
- Chest Pain Type (4 categories: Typical angina, Atypical angina, Non-anginal pain, Asymptomatic)
- Resting Blood Pressure (80-200 mmHg)
- Serum Cholesterol (100-600 mg/dl)
- Fasting Blood Sugar (>120 mg/dl indicator)
- Family History of Heart Disease
- Resting Electrocardiographic Results (Normal, ST-T abnormality, Left ventricular hypertrophy)
- Maximum Heart Rate Achieved (60-220 bpm)
- Exercise-Induced Angina (Yes/No)
- ST Depression (0-10 range)
- Peak Exercise ST Segment Slope (Upsloping, Flat, Downsloping)
- Number of Major Vessels (0-3) colored by fluoroscopy
- Thalassemia Test Results (Normal, Fixed defect, Reversible defect)
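
The numeric ranges listed above can be checked before a record reaches the model. The following is a minimal sketch, assuming the short field names used in this README's example input (`age`, `trestbps`, `chol`, `thalach`, `oldpeak`, `ca`); `NUMERIC_RANGES` and `validate_input` are hypothetical names for illustration and are not taken from the project's source.

```python
# Hypothetical helper: range checks mirroring the limits listed above.
NUMERIC_RANGES = {
    "age": (20, 100),        # years
    "trestbps": (80, 200),   # resting blood pressure, mmHg
    "chol": (100, 600),      # serum cholesterol, mg/dl
    "thalach": (60, 220),    # maximum heart rate achieved, bpm
    "oldpeak": (0, 10),      # ST depression
    "ca": (0, 3),            # major vessels colored by fluoroscopy
}


def validate_input(record):
    """Return a list of readable problems; an empty list means the record passes."""
    problems = []
    for field, (low, high) in NUMERIC_RANGES.items():
        if field not in record:
            problems.append(f"{field}: missing")
            continue
        value = record[field]
        if not low <= value <= high:
            problems.append(f"{field}: {value} outside {low}-{high}")
    return problems


valid = {"age": 45, "trestbps": 130, "chol": 250,
         "thalach": 150, "oldpeak": 1.0, "ca": 0}
invalid = dict(valid, age=15, ca=5)

print(validate_input(valid))
print(validate_input(invalid))
```

Returning a list of problems rather than raising on the first one lets a form report every out-of-range field at once.
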
- Automated data preprocessing with feature scaling
- Cross-validation with GridSearchCV
- Comprehensive evaluation metrics:
  - Accuracy, Precision, Recall, F1-Score
  - ROC-AUC Score
  - Confusion Matrix
  - Classification Report
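
As a reminder of how the threshold-based metrics above relate to the confusion matrix, here is a small sketch in plain Python. The function name `classification_metrics` and the counts are made up for illustration; they are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

ROC-AUC is not derived from a single confusion matrix; it needs the predicted probabilities across all thresholds.
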
- Automatic BMI calculation from height/weight
- Feature importance ranking
- Categorical encoding (One-hot encoding)
- Numerical feature standardization
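
The feature engineering steps above (BMI derivation, one-hot encoding, standardization) could be wired together roughly as follows. This is a sketch under assumptions: the split into numeric and categorical columns, the helper name `add_bmi`, and the units (height in cm, weight in kg) are guesses for illustration, and the project's own `preprocessor.py` may differ.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def add_bmi(df):
    """Return a copy with bmi = weight (kg) / height (m)^2; height is assumed in cm."""
    out = df.copy()
    out["bmi"] = out["weight"] / (out["height"] / 100.0) ** 2
    return out


# Assumed column split, for illustration only.
NUMERIC = ["age", "trestbps", "chol", "thalach", "oldpeak", "bmi"]
CATEGORICAL = ["cp", "restecg", "slope", "thal"]

preprocessor = ColumnTransformer(
    [
        ("num", StandardScaler(), NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Tiny made-up sample, not real patient data.
sample = pd.DataFrame({
    "age": [45, 60, 52], "trestbps": [130, 150, 120], "chol": [250, 300, 210],
    "thalach": [150, 120, 165], "oldpeak": [1.0, 2.5, 0.0],
    "cp": [0, 3, 1], "restecg": [0, 1, 0], "slope": [0, 1, 2], "thal": [1, 2, 1],
    "height": [175, 168, 180], "weight": [80, 90, 72],
})

X = preprocessor.fit_transform(add_bmi(sample))
print(X.shape)
```

Columns not named in either list (here `height` and `weight`) are dropped by `ColumnTransformer` by default, so only the derived `bmi` reaches the model in this sketch.
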
- Joblib-based model serialization
- Automatic model saving/loading
- Preprocessor state preservation
- Cross-session compatibility
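
One common way to preserve preprocessor state together with the model is to serialize a single scikit-learn `Pipeline` with joblib. The sketch below uses toy data and a temporary file; it illustrates the general technique and is not the project's `HeartDiseaseModel` save/load code, whose internals are not shown here.

```python
import os
import tempfile

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Made-up toy data; the real project trains on patient records.
X = [[45, 130], [60, 150], [52, 120], [70, 160]]
y = [0, 1, 0, 1]

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("forest", RandomForestClassifier(n_estimators=10, random_state=0)),
])
pipeline.fit(X, y)

# Saving the whole pipeline keeps the fitted scaler alongside the model.
path = os.path.join(tempfile.mkdtemp(), "demo_model.joblib")
joblib.dump(pipeline, path)

# A later session can load the file and predict without refitting.
restored = joblib.load(path)
same = list(restored.predict(X)) == list(pipeline.predict(X))
print(same)
```

Joblib files are pickles, so they should be loaded with library versions compatible with those used for saving, and only from trusted sources.
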
- Visual feature importance plots
- Top contributing factors identification
- Category-wise risk breakdown
- Medical context explanations
- Risk-specific lifestyle suggestions
- Medical consultation recommendations
- Exercise and diet guidance
- Monitoring suggestions based on risk factors
- Low Risk (<20%): Minimal intervention needed
- Moderate Risk (20-50%): Lifestyle modifications recommended
- High Risk (50-80%): Medical consultation advised
- Very High Risk (>80%): Immediate medical attention suggested
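
The four bands above translate into a simple threshold function. This is a minimal sketch: `risk_level` is a hypothetical name, and because the listed ranges overlap at exactly 50%, the sketch assumes that 50% falls into High Risk while exactly 80% stays in High Risk (Very High is listed as strictly above 80%).

```python
def risk_level(probability):
    """Map a probability in [0, 1] to one of the four risk bands listed above."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.20:
        return "Low Risk"
    if probability < 0.50:
        return "Moderate Risk"
    if probability <= 0.80:  # assumption: the 80% boundary belongs to High Risk
        return "High Risk"
    return "Very High Risk"


for p in (0.05, 0.20, 0.50, 0.80, 0.95):
    print(f"{p:.0%}: {risk_level(p)}")
```
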
- Python 3.8+
- pip (Python package manager)

- Clone the repository

      git clone <repository-url>
      cd heart-disease-prediction

- Install dependencies

      pip install -r requirements.txt

- Run the application

      python heart_disease_app.py

Access the web interface at http://localhost:7860

    python main.py
    python analyze_features.py
    python analyze_features_only.py  # Generate feature importance plots without UI
    python simple_heart_app.py       # Minimal version for basic predictions

Create a portable Windows executable:

    python build_exe.py

Or use the batch file:

    build_exe.bat

The executable will be created in the dist/ folder.

For Windows users, you can also use:

    run_heart_disease_app.bat  # Quick launcher for web interface

- Launch the application: `python heart_disease_app.py`
- Open browser to the provided URL
- Fill in patient information using the intuitive sliders and dropdowns
- Click "Predict Heart Disease Risk"
- View comprehensive results with visualizations and recommendations
    from src.ui.console_interface import ConsoleInterface

    interface = ConsoleInterface('models/heart_disease_model.joblib')
    interface.run()

    from src.models.model import HeartDiseaseModel

    # Load trained model
    model = HeartDiseaseModel.load('models/heart_disease_model.joblib')

    # Make prediction
    input_data = {
        'age': 45, 'sex': 1, 'cp': 0, 'trestbps': 130,
        'chol': 250, 'fbs': 0, 'restecg': 0, 'thalach': 150,
        'exang': 0, 'oldpeak': 1.0, 'slope': 0, 'ca': 0, 'thal': 1,
        'height': 175, 'weight': 80
    }
    explanation = model.explain_prediction(input_data)
    print(f"Risk Level: {explanation['risk_level']}")
    print(f"Probability: {explanation['probability']:.1%}")

    Heart Disease Prediction T/
    ├── src/                              # Source code modules
    │   ├── data/                         # Data-related modules
    │   │   ├── feature_definitions.py    # Feature specifications & descriptions
    │   │   └── __init__.py
    │   ├── models/                       # Machine learning models
    │   │   ├── heart_disease_model.py    # Main ML model implementation
    │   │   ├── model.py                  # Alternative model interface
    │   │   ├── feature_importance.py     # Feature analysis tools
    │   │   └── __init__.py
    │   ├── preprocessing/                # Data preprocessing
    │   │   ├── preprocessor.py           # Data cleaning & transformation
    │   │   └── __init__.py
    │   ├── ui/                           # User interfaces
    │   │   ├── gradio_interface.py       # Web interface implementation
    │   │   ├── console_interface.py      # Command-line interface
    │   │   └── __init__.py
    │   ├── utils/                        # Utility functions
    │   │   ├── data_loader.py            # Data loading & sample generation
    │   │   └── __init__.py
    │   └── __init__.py
    ├── models/                           # Trained model storage
    │   └── heart_disease_model.joblib    # Saved model file
    ├── heart_disease_app.py              # Main web application
    ├── main.py                           # CLI application entry point
    ├── analyze_features.py               # Feature analysis script
    ├── build_exe.py                      # Executable builder
    ├── example.py                        # Usage examples
    ├── analyze_features_only.py          # Feature-only analysis script
    ├── simple_heart_app.py               # Simplified application version
    ├── requirements.txt                  # Python dependencies
    ├── run_heart_disease_app.bat         # Windows launcher
    ├── build_exe.bat                     # Windows build script
    ├── feature_importance.png            # Generated feature importance plot
    ├── *.spec                            # PyInstaller specification files
    └── README.md                         # This file

- numpy==1.24.3 - Numerical computing
- pandas==2.0.3 - Data manipulation
- scikit-learn==1.3.0 - Machine learning algorithms
- joblib==1.3.1 - Model serialization
- matplotlib==3.7.2 - Plotting and visualization
- seaborn==0.12.2 - Statistical visualizations
- gradio==4.19.2 - Web interface framework
- pyinstaller==6.5.0 - Executable creation
- Random Forest Classifier - Ensemble learning for robust predictions
- GridSearchCV - Hyperparameter optimization
- Cross-Validation - Model validation and selection
- Feature Importance - Understanding model decisions
- One-Hot Encoding - Categorical variable handling
- Standard Scaling - Numerical feature normalization
- Missing Value Handling - Data quality assurance
- Feature Engineering - BMI calculation and derived features
- Risk Gauges - Intuitive probability display
- Feature Impact Charts - Contribution analysis
- Category Grouping - Medical domain organization
- Color Coding - Risk level visualization
- Professional Medical Theme with clean white backgrounds
- Interactive Risk Gauge with color-coded severity (Green → Red)
- Feature Impact Visualization showing positive/negative contributions
- Category-based Analysis grouping features by medical domain
- Personalized Recommendations with medical icons and actionable advice
- Real-time BMI Calculator integrated into the interface
- Medical Context Explanations for each feature and risk factor
- Demographics - Age, Gender, BMI
- Symptoms - Chest pain patterns
- Vital Signs - Blood pressure, heart rate
- Blood Tests - Cholesterol, blood sugar
- Heart Tests - ECG results
- Exercise Tests - Stress test results
- Imaging - Vessel blockages, perfusion
- Physical - Height, weight, derived metrics
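
Grouping per-feature contributions into the categories above can be done with a simple lookup table. The sketch below is an illustration only: the mapping from field names to categories is inferred from the list above, and `FEATURE_CATEGORIES`, `group_by_category`, and the contribution values are hypothetical.

```python
from collections import defaultdict

# Assumed mapping from input fields to the categories listed above.
FEATURE_CATEGORIES = {
    "age": "Demographics", "sex": "Demographics", "bmi": "Demographics",
    "cp": "Symptoms",
    "trestbps": "Vital Signs", "thalach": "Vital Signs",
    "chol": "Blood Tests", "fbs": "Blood Tests",
    "restecg": "Heart Tests",
    "exang": "Exercise Tests", "oldpeak": "Exercise Tests",
    "slope": "Exercise Tests",
    "ca": "Imaging", "thal": "Imaging",
    "height": "Physical", "weight": "Physical",
}


def group_by_category(contributions):
    """Sum per-feature contributions within each category."""
    totals = defaultdict(float)
    for feature, value in contributions.items():
        totals[FEATURE_CATEGORIES.get(feature, "Other")] += value
    return dict(totals)


# Made-up contribution scores for illustration.
example = {"age": 0.10, "cp": 0.25, "chol": 0.05, "fbs": -0.02, "ca": 0.15}
grouped = group_by_category(example)
for category, total in grouped.items():
    print(f"{category}: {total:+.2f}")
```
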
- Fixed matplotlib title font conflicts for better compatibility
- Clean white backgrounds throughout all visualizations
- Enhanced medical styling with professional appearance
- Improved error handling and user feedback
The Random Forest model achieves:
- High Accuracy on validation datasets
- Balanced Precision/Recall for both classes
- Robust Feature Importance rankings
- Reliable Probability Estimates for risk assessment
Note: Actual performance metrics depend on the training dataset used.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Medical feature definitions based on established cardiology research
- UCI Heart Disease Dataset for reference
- Scikit-learn community for machine learning tools
- Gradio team for the excellent web interface framework
Matplotlib Font Errors
- Fixed: Title font conflicts resolved in latest version
- All backgrounds now use clean white styling
Model Not Found
- Run `python main.py` to train a new model automatically
- Model will be saved to `models/heart_disease_model.joblib`
Import Errors
- Ensure all dependencies are installed: `pip install -r requirements.txt`
- Check Python version compatibility (3.8+)
Executable Build Issues
- Use `python build_exe.py` instead of direct PyInstaller commands
- Ensure all dependencies are properly installed
- Use the web interface for best user experience
- Console interface is faster for batch predictions
- Feature analysis scripts help understand model behavior
For questions, issues, or contributions:
- Open an issue on GitHub
- Check the documentation in the `src/` modules
- Review the example usage in `example.py`
- Test with different interfaces to find what works best for your use case
- heart_disease_app.py - Full-featured web interface (Recommended)
- main.py - Command-line interface with training options
- analyze_features.py - Feature analysis with web UI
- analyze_features_only.py - Feature analysis without UI
- simple_heart_app.py - Minimal prediction interface
A comprehensive machine learning system for predicting heart disease risk with an interactive and user-friendly interface.
- Advanced Risk Prediction: Uses machine learning to predict heart disease risk with high accuracy
- Interactive Visualizations: Dynamic feature impact graphs showing how each factor affects risk
- Personalized Recommendations: Custom health suggestions based on individual risk factors
- User-Friendly Interface: Large text and intuitive design for easy use by all age groups
- Real-time BMI Calculation: Instantly calculates and categorizes BMI from height and weight
- Shareable Public Link: Access the tool from anywhere via a public URL
- Standalone Application: Can be packaged as an executable (.exe) for offline use
- Educational Information: Provides medical context for each risk factor
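
The BMI calculation and categorization mentioned above follows the standard formula, weight in kilograms divided by the square of height in metres. The sketch below assumes height is entered in centimetres and uses the standard WHO adult cut-offs; the function names `bmi` and `bmi_category` are hypothetical, and the app's own labels may differ.

```python
def bmi(weight_kg, height_cm):
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def bmi_category(value):
    """Classify a BMI value using the standard WHO adult cut-offs."""
    if value < 18.5:
        return "Underweight"
    if value < 25:
        return "Normal weight"
    if value < 30:
        return "Overweight"
    return "Obese"


value = bmi(weight_kg=80, height_cm=175)
print(f"BMI {value:.1f}: {bmi_category(value)}")
```
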
| Factor | Description | Impact |
|---|---|---|
| Age | Risk doubles every decade after 45 | High |
| Gender | Men have 2-3x higher risk before age 55 | Medium |
| Chest Pain | Typical angina strongly indicates coronary artery disease | Very High |
| Blood Pressure | Each 20 mmHg increase doubles risk | High |
| Cholesterol | 23% increased risk per 40 mg/dl above 200 | High |
| Blood Sugar | Diabetes doubles heart disease risk | Medium |
| ECG Results | ST-T abnormalities indicate 5x higher risk | High |
| Max Heart Rate | Lower max heart rate indicates decreased function | Medium |
| Exercise Angina | Presence indicates 3x higher risk | High |
| ST Depression | Values >2 mm indicate severe ischemia | High |
| ST Slope | Downsloping indicates poor prognosis | Medium |
| Major Vessels | Risk increases 2x per vessel affected | High |
| Thalassemia | Reversible defects indicate 3x higher risk | High |
| BMI | BMI >30 increases risk by 50% | Medium |
- Algorithm: Random Forest Classifier
- Features: 13 clinical parameters + BMI
- Metrics:
  - Accuracy: ~85%
  - Precision: ~84%
  - Recall: ~86%
  - F1 Score: ~85%
  - ROC AUC: ~90%

    Heart Disease Prediction/
    ├── models/                  # Trained model files
    ├── src/                     # Source code
    │   ├── data/                # Data definitions and processing
    │   ├── models/              # ML model implementation
    │   ├── preprocessing/       # Data preprocessing
    │   ├── ui/                  # User interfaces
    │   └── utils/               # Utility functions
    ├── heart_disease_app.py     # Main application
    ├── build_exe.py             # Executable builder
    ├── main.py                  # CLI version
    └── requirements.txt         # Dependencies

- Python 3.8+: Core programming language
- Scikit-learn: Machine learning algorithms
- Pandas/NumPy: Data processing
- Matplotlib/Seaborn: Data visualization
- Gradio: Web interface
- PyInstaller: Executable packaging
numpy>=1.19.5
pandas>=1.3.0
scikit-learn>=0.24.2
matplotlib>=3.4.2
seaborn>=0.11.1
gradio>=3.0.0
joblib>=1.0.1
pyinstaller>=5.0.0 # For executable creation
- Clone the repository

      git clone https://github.com/samyak2403/Heart-Disease-Prediction-T.git
      cd Heart-Disease-Prediction-T

- Install dependencies

      pip install -r requirements.txt

- Run the web application

      python heart_disease_app.py

- Build executable (optional)

      python build_exe.py

- Enter patient information in the form
- Click "SUBMIT" to generate prediction
- View results including:
  - Heart disease risk prediction
  - Risk level assessment
  - Contributing factors
  - Interactive feature impact visualization
  - Personalized health recommendations
- Run the generated .exe file
- Follow the same steps as the web interface
The system provides a detailed visualization of how each factor contributes to heart disease risk:
- Color-coded bars: Red for risk-increasing factors, green for risk-decreasing factors
- Numerical impact: Precise quantification of each factor's contribution
- Educational tooltips: Medical information about each risk factor
- Risk gauge: Visual representation of overall heart disease risk
- Personalized recommendations: Tailored health advice based on risk factors
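
The color-coding rule for the impact bars can be illustrated without any plotting library. The following text-only sketch assumes signed contribution scores where positive means risk-increasing; `impact_bars` and the example values are hypothetical, and the actual app renders its charts graphically.

```python
def impact_bars(contributions, width=20):
    """Build rows sorted by absolute impact, each tagged with a display color."""
    largest = max((abs(v) for v in contributions.values()), default=0.0)
    rows = []
    ordered = sorted(contributions.items(),
                     key=lambda item: abs(item[1]), reverse=True)
    for name, value in ordered:
        if value > 0:
            color = "red"      # risk-increasing factor
        elif value < 0:
            color = "green"    # risk-decreasing factor
        else:
            color = "gray"     # no measurable effect (assumption)
        length = round(abs(value) / largest * width) if largest else 0
        rows.append((name, value, color, "#" * length))
    return rows


# Made-up contribution scores for illustration.
rows = impact_bars({"chol": 0.05, "cp": 0.25, "thalach": -0.10})
for name, value, color, bar in rows:
    print(f"{name:<8} {value:+.2f} {color:<5} {bar}")
```

Sorting by absolute value puts the most influential factors first regardless of direction, which matches how a "top contributing factors" chart is usually read.
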
The model is trained on a comprehensive dataset of heart disease cases with the following steps:
- Data preprocessing and normalization
- Feature engineering and selection
- Model training with cross-validation
- Hyperparameter optimization
- Performance evaluation on test data
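
The training steps above might look roughly like this in scikit-learn. It is a sketch under stated assumptions, not the project's training code: the data is synthetic (`make_classification` with 14 columns standing in for the 13 clinical parameters plus BMI), the hyperparameter grid is deliberately tiny, and the feature engineering and selection step is omitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; not the dataset used by this project.
X, y = make_classification(n_samples=300, n_features=14, n_informative=6,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Preprocessing and model in one pipeline, so scaling is refit in every fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("forest", RandomForestClassifier(random_state=42)),
])

# Small illustrative grid; the project's real search space is not documented here.
param_grid = {
    "forest__n_estimators": [25, 50],
    "forest__max_depth": [None, 5],
}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="roc_auc")
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test split.
y_pred = search.predict(X_test)
y_prob = search.predict_proba(X_test)[:, 1]
accuracy = accuracy_score(y_test, y_pred)
auc = roc_auc_score(y_test, y_prob)

print("Best parameters:", search.best_params_)
print(f"Accuracy: {accuracy:.3f}")
print(f"ROC AUC: {auc:.3f}")
```

Keeping the scaler inside the pipeline means cross-validation never sees statistics computed from the validation fold, which avoids a common source of optimistic scores.
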
This project was developed by Samyak Kamble as a comprehensive heart disease risk assessment tool combining medical knowledge with advanced machine learning techniques.
For questions, suggestions, or collaborations, please contact:
- Samyak Kamble
- Email: samyak.kamble@example.com
- LinkedIn: linkedin.com/in/samyak-kamble
⭐ Star this repository if you find it useful! ⭐