This project predicts NBA game winners and totals (over/under) using team stats and sportsbook odds. It pulls team data from 2007-08 through the current season, builds matchup features, and runs trained models to estimate win probabilities and totals outcomes. It also outputs expected value and optional Kelly Criterion stake sizing.
- Moneyline and totals predictions (XGBoost and Neural Net models).
- Expected value calculation and optional Kelly Criterion sizing.
- Odds ingest from supported sportsbooks or manual input.
- Data processing pipeline and model training scripts.
- Flask web app for browsing outputs.
- Collect stats and odds:
Get_Datapulls daily team stats from NBA endpoints and stores them in SQLite.Get_Odds_Datapulls sportsbook odds and scores from SBR and stores them in a separate SQLite DB. - Build game features:
Create_Gamesmerges team stats, odds, scores, and days-rest into a training dataset. - Train models: XGBoost/NN scripts in
src/Train-Modelsfit moneyline and totals models. - Predict today:
main.pyfetches today’s schedule, builds matchup features, loads trained models, and prints predictions, expected value, and optional Kelly Criterion sizing.
- Python 3.11
- Packages: Tensorflow, XGBoost, NumPy, Pandas, Colorama, Tqdm, Requests, Scikit-learn
Install dependencies:
pip3 install -r requirements.txtpython3 main.py -xgb -odds=fanduelOdds will be fetched automatically when -odds is provided. Supported books:
fanduel, draftkings, betmgm, pointsbet, caesars, wynn, bet_rivers_ny
If -odds is omitted, the script will prompt for manual odds and totals.
Optional flags:
-nnrun neural network model-xgbrun XGBoost model-Arun all models-kcshow Kelly Criterion bankroll fraction
cd Flask
flask --debug run# Create/update datasets
cd src/Process-Data
python -m Get_Data
python -m Get_Odds_Data
python -m Create_Games
# Train models
cd ../Train-Models
python -m XGBoost_Model_ML --dataset dataset_2012-26 --trials 100 --splits 5 --calibration sigmoid
python -m XGBoost_Model_UO --dataset dataset_2012-26 --trials 100 --splits 5 --calibration sigmoid
python -m NN_Model_ML
python -m NN_Model_UO
python -m Logistic_Regression_ML --dataset dataset_2012-26_new --trials 50 --splits 5 --calibration sigmoid
python -m Logistic_Regression_UO --dataset dataset_2012-26_new --trials 50 --splits 5 --calibration sigmoid- The current NN training scripts are the original versions with hard-coded dataset and model paths.
- They train on
dataset_2012-24_newand save intoModels/with timestamped names. - If you want configurable flags or feature/scaler sidecars, switch back to the newer NN scripts.
Get_Data normally fetches only new dates in the current season. To fill missing dates:
cd src/Process-Data
python -m Get_Data --backfillTo backfill a single season:
cd src/Process-Data
python -m Get_Data --backfill --season 2025-26Get_Odds_Data normally fetches only new dates in the current season. To fill missing odds dates:
cd src/Process-Data
python -m Get_Odds_Data --backfillTo backfill a single season:
cd src/Process-Data
python -m Get_Odds_Data --backfill --season 2025-26Contributions are welcome. If you change model behavior or data pipelines, add a note in the README and update any related scripts or docs.

