Full-stack Machine Learning Startup Success Predictor with 50K+ company dataset, bias-free methodology, XGBoost ensemble, Logistic Regression, SVM w/ RBF kernel, and SHAP interpretability. Built w/ FastAPI backend, React/Next.js frontend, and end to end ML & DS pipeline. Python, TypeScript, Tailwind CSS, numpy, pandas, scikit-learn, and more.

RyanFabrick/ML-Startup-Success-Prediction

Machine Learning Startup Success Predictor

A full-stack machine learning application that predicts startup success using more than 50,000 company data points spanning 1990-2015. Built on a peer-reviewed validation methodology and powered by XGBoost. Prior to full-stack implementation, comprehensive analysis was conducted through five documented notebooks: exploratory data analysis, preprocessing and feature engineering, modeling development, performance evaluation, and production pipeline setup.

Frontend, Backend, Data, & Notebook READMEs (More Detail & Visual Examples)

For more detailed documentation and examples, see the README.md files in the data/, notebooks/, src/, and startup-predictor/ directories.

Overview

This project implements and extends the bias-free startup success prediction methodology from Żbikowski & Antosiuk (2021). This repository provides:

  • Machine Learning Models: XGBoost, Logistic Regression, and SVM with documentation, analysis, and evaluation
  • Interactive Web Application: React/Next.js frontend with FastAPI backend
  • Model Interpretability: SHAP explanations for individual predictions
  • Academic Validation: Reproduces and extends published research methodology

Key Results (XGBoost Model)

  • F1-Score: 29.1%
  • AUC-ROC: 79.0%
  • Recall: 38.8%
  • Precision: 23.4%

Why Did I Build This?

As a Statistics and Data Science student at UCSB, I wanted to create a project that goes beyond coursework. My background and interests lie around machine learning, artificial intelligence, data science, and software engineering. I set out to build something that's academically rigorous, professionally relevant, and personally meaningful.

Startups fascinate me. They combine innovation, data, and uncertainty. This is the perfect space to apply machine learning. I came across an academic paper that used a bias-free ML approach to predict startup success, and I saw an opportunity: What if I could not only replicate that research but extend it with different techniques, real world applications, and a full stack production-ready interface?

This project became my way of learning how to build an end to end machine learning pipeline, from raw data and literature review to model deployment and interactive demo. I performed exploratory data analysis, built reusable preprocessing pipelines, engineered high-value features, trained and evaluated multiple models, and explored the business implications of different success definitions. I also integrated explainable AI using SHAP, conducted temporal validation across decades, and compared academic versus venture capital perspectives on startup success.

While I had previously built full stack web applications and retrieval augmented generation (RAG) systems, this project was an opportunity to go deeper. I challenged myself to learn new tools like FastAPI for backend development, Next.js for a polished frontend, and Tailwind CSS for rapid UI design. It pushed me to improve as a student aiming to work around data, machine learning, software development, and artificial intelligence!

Key Features

Machine Learning Pipeline

  • 22 Engineered Features across geographic, industry, and temporal dimensions
  • Bias Prevention using only founding-time information
  • Cross-Validation with 5-fold stratified approach
  • SHAP Integration for model interpretability

Web Application

  • Real-time Predictions with confidence intervals
  • Interactive UI with searchable dropdowns for 750+ regions/cities
  • Multi-category Selection from 15 industry categories
  • Visual Explanations showing key success factors
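One way to turn SHAP output into a short list of key success factors is to rank features by the absolute size of their contribution. The sketch below is not the application's actual code: the function name top_factors, the feature names, and the contribution values are all made up for illustration.

```python
def top_factors(contributions, n=3):
    """Return the n features with the largest absolute contribution, largest first."""
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:n]


# Made-up SHAP-style values: positive pushes the prediction toward success,
# negative pushes it away.
example_contributions = {
    "region_rank": 0.12,
    "is_usa": 0.03,
    "founded_year_std": -0.20,
    "cat_software": 0.07,
}

for feature, value in top_factors(example_contributions, n=2):
    direction = "raises" if value > 0 else "lowers"
    print(f"{feature} {direction} the predicted success score by {abs(value):.2f}")
```

Ranking by absolute value keeps strongly negative factors visible, which a plain descending sort would push to the bottom.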

System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     Next.js Frontend                            │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                 │
│ │ Main        │ │ About       │ │ Prediction  │                 │
│ │ Page        │ │ Page        │ │ Results     │                 │
│ └─────────────┘ └─────────────┘ └─────────────┘                 │
│                                                                 │
│ ┌─────────────┐                 ┌─────────────┐                 │
│ │User         │                 │SHAP         │                 │
│ │Inputs       │                 │Results      │                 │
│ └─────────────┘                 └─────────────┘                 │
└─────────────────────────┼───────────────────────────────────────┘
                          │
┌─────────────────────────┼───────────────────────────────────────┐
│                   FastAPI Backend                               │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                 │
│ │ Prediction  │ │ Explanation │ │ Health      │                 │
│ │ Endpoint    │ │ Endpoint    │ │ Check       │                 │
│ └─────────────┘ └─────────────┘ └─────────────┘                 │
│                                                                 │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                 │
│ │Data         │ │SHAP         │ │Model        │                 │
│ │Preprocessor │ │Explainers   │ │Loader       │                 │
│ └─────────────┘ └─────────────┘ └─────────────┘                 │
└─────────────────────────┼───────────────────────────────────────┘
                          │
┌─────────────────────────┼───────────────────────────────────────┐
│                  Machine Learning Layer                         │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                 │
│ │ XGBoost     │ │ Logistic    │ │ SVM RBF     │                 │
│ │ Model       │ │ Regression  │ │ Model       │                 │
│ └─────────────┘ └─────────────┘ └─────────────┘                 │
└─────────────────────────┼───────────────────────────────────────┘
                          │
┌─────────────────────────┼───────────────────────────────────────┐
│                     Data Layer                                  │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                 │
│ │Crunchbase   │ │Preprocessed │ │Model        │                 │
│ │Dataset      │ │Features     │ │Artifacts    │                 │
│ │(50k+)       │ │(22 dims)    │ │(.pkl files) │                 │
│ └─────────────┘ └─────────────┘ └─────────────┘                 │
│                                                                 │
│ ┌─────────────┐ ┌─────────────┐                                 │
│ │Categories   │ │SHAP         │                                 │
│ │Reference    │ │Explainer    │                                 │
│ │Data         │ │Objects      │                                 │
│ └─────────────┘ └─────────────┘                                 │
└─────────────────────────────────────────────────────────────────┘

Demo GIFs

demogif1

demogif2

demogifabout

demogif3

Technology Stack

Machine Learning & Data Science

  • Python: High-level programming language for data science and machine learning
  • XGBoost: Optimized gradient boosting framework for high-performance ML models
  • Logistic Regression with Regularization: Linear classification algorithm with penalty terms to prevent overfitting
  • SVM with RBF Kernel: Support Vector Machine using radial basis function for non-linear classification
  • SHAP: Model interpretability library providing unified approach to explain predictions
  • Jupyter: Interactive computing environment for data analysis and model development
  • Pandas: Data manipulation and analysis library for structured data processing
  • NumPy: Fundamental package for numerical computing and array operations
  • Matplotlib: Comprehensive plotting library for creating static visualizations
  • Seaborn: Statistical data visualization library built on matplotlib
  • scikit-learn: Machine learning library with algorithms for classification, regression, and preprocessing

Frontend

  • React: JavaScript library for building interactive user interfaces with component-based architecture
  • Next.js: Full stack React framework with server-side rendering and routing capabilities
  • TypeScript: Typed superset of JavaScript providing static type checking and enhanced development experience
  • Tailwind CSS: Utility first CSS framework for rapid UI development with pre-built styling classes

Backend

  • FastAPI: Modern, fast web framework for building APIs with automatic documentation and type hints
  • Pydantic: Data validation library using Python type annotations for request/response schemas
  • Uvicorn: Lightning fast ASGI server for serving Python web applications in production
  • Data Processing: Pipeline to transform user input into feature vectors for trained model inference

Project Structure

ML_STARTUP_SUCCESS_PREDICTOR
├── app/
│   └── app.py
├── data/
│   ├── processed/
│   ├── raw/
│   └── README.md
├── notebooks/
│   ├── 01_data_exploration.ipynb
│   ├── 02_data_preprocessing.ipynb
│   ├── 03_modeling.ipynb
│   ├── 04_evaluation.ipynb
│   ├── 05_pipeline.ipynb
│   └── README.md
├── results/
│   ├── figures/
│   ├── models/
│   └── reports/
├── src/
│   ├── data_preprocessing.py
│   ├── data_util.py
│   └── README.md
├── startup-predictor/
│   ├── app/
│   │   ├── about/
│   │   │   └── page.tsx
│   │   ├── page.tsx
│   │   ├── favicon.ico
│   │   ├── globals.css
│   │   └── layout.tsx
│   ├── node_modules/
│   ├── public/
│   │   ├── file.svg
│   │   ├── globe.svg
│   │   ├── next.svg
│   │   ├── vercel.svg
│   │   └── window.svg
│   ├── styles/
│   │   └── app.css
│   ├── .gitignore
│   ├── next.config.mjs
│   ├── package.json
│   ├── tailwind.config.json
│   ├── tsconfig.json
│   └── README.md
├── README.md
├── requirements.txt
├── LICENSE
├── .env
├── .gitattributes
└── .gitignore

Quick Start

Prerequisites

  • Python 3.8+
  • Node.js 16+
  • pip and npm

1. Clone Repository

git clone https://github.com/RyanFabrick/ML-Startup-Success-Prediction.git
cd ML-Startup-Success-Prediction

2. Backend Setup

# Install Python dependencies
pip install -r requirements.txt

# Start FastAPI server
cd app
python app.py
# Server runs on http://localhost:8000

3. Frontend Setup

# Install Node dependencies (run from the repository root, in a new terminal)
cd startup-predictor
npm install

# Start development server
npm run dev
# Application runs on http://localhost:3000

4. API Health Check

curl http://localhost:8000/health

Environment Variables

The application requires environment variables to be configured for proper operation.

# Environment (.env)
cp .env.example .env
# Configure settings as needed

Notebooks

The complete data process and analysis is documented across five notebooks:

  1. 01_EDA
  2. 02_Preprocessing_&_Feature_Engineering
  3. 03_Modeling
  4. 04_Evaluation
  5. 05_Pipeline_Setup

Each notebook is self-contained, thoroughly documented at each step, and can be run independently. Go to the Notebooks README for more information.

Methodology & Academic Foundation

Research Validation

Based on "A machine learning, bias-free approach for predicting business success using Crunchbase data" (Żbikowski & Antosiuk, 2021). This implementation attempts to:

  • Reproduce the original bias-free methodology
  • Extend it with enhanced feature engineering (22 vs 8 features)
  • Validate it across multiple economic cycles (1995-2015)
  • Compare academic vs practical success definitions

Feature Engineering

  • Geographic Factors (3): Region/city startup density rankings, US indicator
  • Industry Categories (15): Binary encoding for major startup sectors
  • Temporal Features (4): Standardized founding year, economic era classification
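The feature groups above could be built roughly as follows. This is a minimal sketch, not the project's preprocessing code: the three-entry category vocabulary stands in for the 15 real categories (which this README does not list), the function names are hypothetical, and the mean and standard deviation used for the founding year are placeholder numbers.

```python
# Hypothetical vocabulary: the README mentions 15 industry categories but does not list them.
CATEGORY_VOCABULARY = ["software", "mobile", "biotech"]


def encode_categories(category_list, vocabulary=CATEGORY_VOCABULARY):
    """Return one 0/1 flag per vocabulary entry (multi-hot binary encoding)."""
    tokens = set(category_list.lower().split())
    return [1 if name in tokens else 0 for name in vocabulary]


def is_us(country_code):
    """US indicator: 1 for US companies, 0 otherwise."""
    return 1 if country_code == "USA" else 0


def standardize_year(year, mean, std):
    """Z-score a founding year using statistics computed on the training data."""
    return (year - mean) / std


# Placeholder mean and standard deviation, not values from the project.
features = encode_categories("software mobile") + [
    is_us("USA"),
    standardize_year(2010, mean=2005.0, std=5.0),
]
print(features)
```

Every input here is known on the day a company is founded, which is what keeps the feature set free of look-ahead bias.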

Success Definition

Academic Success: Company acquired OR (still operating AND reached Series B funding)

  • Eliminates look ahead bias by using only founding time features
  • Focuses on observable outcomes rather than subjective metrics
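The success rule above translates into a short labeling function. This is a sketch under assumptions: the status strings ("acquired", "operating", "closed") and the reached_series_b flag are hypothetical names, not confirmed column values from the dataset.

```python
def academic_success(status, reached_series_b):
    """Label a company successful if it was acquired,
    or if it is still operating and reached Series B funding."""
    return status == "acquired" or (status == "operating" and reached_series_b)


# Hypothetical records: (status, reached Series B?)
examples = [
    ("acquired", False),
    ("operating", True),
    ("operating", False),
    ("closed", True),
]
labels = [academic_success(status, series_b) for status, series_b in examples]
print(labels)
```

The label uses outcome information, but only as the prediction target. The model inputs remain restricted to founding-time features.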

Overall Model Performances

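| Model               | Precision | Recall | F1-Score | AUC-ROC |
|---------------------|-----------|--------|----------|---------|
| Logistic Regression | 0.169     | 0.709  | 0.273    | 0.781   |
| SVM (RBF)           | 0.155     | 0.689  | 0.252    | 0.740   |
| XGBoost             | 0.234     | 0.388  | 0.291    | 0.790   |
| Academic Target     | 0.570     | 0.340  | 0.430    | N/A     |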
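As a sanity check on the table above, F1 can be recomputed from precision and recall as their harmonic mean. The helper name f1_score is an illustrative choice; the numbers are copied from the table. Small differences in the third decimal are expected because the published precision and recall are themselves rounded.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "Logistic Regression": (0.169, 0.709, 0.273),
    "SVM (RBF)": (0.155, 0.689, 0.252),
    "XGBoost": (0.234, 0.388, 0.291),
    "Academic Target": (0.570, 0.340, 0.430),
}

for model, (precision, recall, reported_f1) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"{model}: recomputed {recomputed:.3f}, reported {reported_f1:.3f}")
```

The comparison also makes the trade-off visible: Logistic Regression and SVM reach high recall at low precision, while XGBoost gives up recall for better precision and the best F1 of the three.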

Use Cases

For Entrepreneurs

  • Validate business ideas against historical success patterns
  • Identify key risk factors before launching
  • Benchmark against similar companies

For Investors

  • Screen opportunities with data driven insights
  • Supplement due diligence with quantitative analysis
  • Understand geographic and industry trends

For Students and Researchers

  • Academic validation of published methodologies
  • Study startup ecosystem patterns
  • Explore bias-free prediction techniques

API Documentation

Core Endpoints

  • POST /predict - Basic success prediction
  • POST /predict/explain - Prediction with SHAP explanations
  • GET /categories - Available industry categories
  • GET /regions - Searchable region list
  • GET /cities - Searchable city list
  • GET /health - System status

Example Request

import requests

data = {
    "country_code": "USA",
    "region": "SF Bay Area",
    "city": "San Francisco",
    "category_list": "software mobile",
    "founded_year": 2010
}

response = requests.post("http://localhost:8000/predict/explain", json=data)
prediction = response.json()
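Before sending a request like the one above, the payload can be checked on the client side. The backend uses Pydantic for validation; the sketch below swaps in a standard-library dataclass to stay dependency-free and is not the project's schema. The class name StartupInput and the 1990-2015 year check (taken from the dataset range stated in this README) are assumptions.

```python
from dataclasses import asdict, dataclass


@dataclass
class StartupInput:
    country_code: str
    region: str
    city: str
    category_list: str  # space-separated categories, as in the example request
    founded_year: int

    def __post_init__(self):
        # Assumption: reject years outside the 1990-2015 range the dataset covers.
        if not 1990 <= self.founded_year <= 2015:
            raise ValueError(f"founded_year out of range: {self.founded_year}")
        if not self.category_list.strip():
            raise ValueError("category_list must name at least one category")


# asdict() produces a plain dict that can be passed as json=payload to requests.post.
payload = asdict(
    StartupInput(
        country_code="USA",
        region="SF Bay Area",
        city="San Francisco",
        category_list="software mobile",
        founded_year=2010,
    )
)
print(payload)
```

Checking inputs before the call gives a clear local error instead of a failed round trip to the API.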

Academic Context

Literature Foundation

This project validates and extends the methodology from:

Żbikowski, K., & Antosiuk, P. (2021). A machine learning, bias-free approach for predicting business success using Crunchbase data. Information Processing and Management, 58(4), 102555.

This study presents an academically and technically comprehensive machine learning approach to predict startup success while explicitly addressing the look ahead bias problem that plagues most existing research in this domain. The authors analyzed 213,171 companies from the Crunchbase database to develop practically applicable prediction models. While numerous studies have attempted to predict business success using machine learning, they typically suffer from methodological flaws that make their results impractical for actual investment decisions.

This research establishes a new standard for startup success prediction by prioritizing practical applicability over theoretical performance, providing a valuable tool for data-driven investment decisions while advancing our understanding of entrepreneurial success factors. I used it as both context and inspiration for this project!

Key Contributions

  1. Independent validation using separate dataset
  2. Enhanced feature engineering with funding progression metrics
  3. Temporal robustness across multiple economic cycles
  4. Production deployment with interactive explanations

Contributing

This project was developed as a personal learning project. For future questions and/or suggestions:

  1. Open an issue describing the enhancement or bug
  2. Fork the repository and create a feature branch
  3. Follow coding standards
  4. Write tests for new functionality
  5. Update documentation as needed
  6. Submit a pull request with detailed description of changes

License

This project is open source and available under the MIT License.

Author

Ryan Fabrick

Acknowledgments & References

  • Żbikowski, K., & Antosiuk, P. (2021) - "A machine learning, bias-free approach for predicting business success using Crunchbase data." Information Processing and Management, 58(4), 102555
  • Crunchbase - Startup and company database providing the 50,000+ company dataset for model training and validation
  • XGBoost - Optimized distributed gradient boosting library that implements machine learning algorithms under the gradient boosting framework
  • scikit-learn - Machine learning library providing preprocessing, modeling, and evaluation tools including logistic regression and SVM implementations
  • Logistic Regression - Linear classification algorithm using logistic function for binary and multiclass prediction with probabilistic outputs
  • Support Vector Machine (SVM) with RBF Kernel - Non-linear classification algorithm using radial basis function kernel for complex decision boundaries
  • SHAP - (SHapley Additive exPlanations) Model interpretability library enabling prediction explanations
  • Pandas Community - Data manipulation and analysis library
  • NumPy Community - Fundamental package for scientific computing
  • Jupyter Project - Interactive computing environment for data analysis, processing, modeling, evaluation, and documentation
  • FastAPI - Modern, fast web framework for building APIs with Python
  • Uvicorn - Lightning fast ASGI server for Python web applications
  • Pydantic - Data validation library using Python type annotations
  • React Community - JavaScript library for building interactive user interfaces
  • Next.js Community - React framework enabling full stack web applications
  • Tailwind CSS - Utility first CSS framework for rapid UI development

Built with ❤️ for the machine learning community

This personal project demonstrates my machine learning engineering skills, full stack development capabilities, and academic research validation. As a UCSB student, I designed this as an end to end showcase of my technical abilities across the complete ML pipeline - from literature review and data processing & analysis to model deployment and production ready web applications. It highlights my skills in ML algorithms, bias aware methodological design, model interpretability with SHAP, academic research validation, modern web development, and my passion for building data driven solutions and tools for entrepreneurs, investors, researchers, and students.
