A hybrid framework combining econometric structure with deep learning for volatility prediction.
This project integrates GARCH models for volatility clustering with LSTM networks for learning non-linear temporal patterns. It applies evidence-based testing to ensure statistical validity rather than relying on anecdotal backtests.
A more complete overview of the project is available in Beyond Black Boxes: A Framework for Building Rigorous AI for Volatility Forecasting.
- GARCH captures conditional heteroskedasticity and persistence
- LSTM models complex temporal dependencies
- Monte Carlo permutation tests validate statistical significance
- Technical indicators enrich predictive features
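As a rough sketch of how the two components connect, the example below fits a GARCH(1,1) model and uses its conditional volatility as one column of the LSTM feature matrix. It assumes the arch package and a synthetic return series; the variable names are illustrative rather than taken from the repository scripts.

```python
import numpy as np
from arch import arch_model

# Illustrative log-return series (in practice, computed from real price data).
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 1.0, 2000)

# GARCH(1,1) captures volatility clustering / conditional heteroskedasticity.
garch_fit = arch_model(returns, vol="GARCH", p=1, q=1, mean="Zero").fit(disp="off")

# The conditional volatility becomes one input feature for the LSTM,
# alongside the raw returns (other features omitted here).
features = np.column_stack([returns, garch_fit.conditional_volatility])
print(features.shape)  # (2000, 2)
```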
```bash
python -m venv <env>
source <env>/bin/activate  # Mac/Linux
pip install -r requirements.txt
```

1. Data Processing
```bash
python process_dataset.py
```

Downloads and processes stock data (1990–2020), computing technical indicators, rolling volatility, GARCH predictions, and statistical moments.
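For orientation, here is a minimal sketch of the kind of feature engineering this step performs (log returns, rolling volatility, and a simple RSI), assuming a pandas DataFrame with a Close column; the column names and window lengths are illustrative, not the script's exact choices.

```python
import numpy as np
import pandas as pd

def add_basic_features(df: pd.DataFrame, window: int = 20) -> pd.DataFrame:
    # Log returns from closing prices.
    df["log_ret"] = np.log(df["Close"]).diff()

    # Rolling realized volatility of the log returns.
    df["roll_vol"] = df["log_ret"].rolling(window).std()

    # A simple RSI: average gains vs. average losses over the window.
    delta = df["Close"].diff()
    gain = delta.clip(lower=0).rolling(window).mean()
    loss = (-delta.clip(upper=0)).rolling(window).mean()
    df["rsi"] = 100 - 100 / (1 + gain / loss)
    return df
```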
2. Exploration
```bash
python explore_dataset.py
```

Performs quality checks and distributional analysis.
3. Training
```bash
python train.py
```

Trains probabilistic LSTM models with early stopping and calibration monitoring.
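The early-stopping behaviour could look roughly like the sketch below, which keeps the best validation checkpoint and stops after a fixed number of epochs without improvement; model, train_epoch, and eval_loss are hypothetical placeholders for whatever the training script defines (a PyTorch-style state_dict is assumed).

```python
import copy

def fit_with_early_stopping(model, train_epoch, eval_loss, max_epochs=200, patience=10):
    best_loss, best_state, waited = float("inf"), None, 0
    for _ in range(max_epochs):
        train_epoch(model)            # one pass over the training set
        val_loss = eval_loss(model)   # validation loss (e.g. NLL or MAE)
        if val_loss < best_loss:
            best_loss, waited = val_loss, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            waited += 1
            if waited >= patience:    # stop after `patience` epochs without improvement
                break
    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```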
4. Validation
```bash
python test.py
```

Runs permutation and Brownian surrogate tests to verify significance.
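A minimal version of the permutation-test logic: shuffle the volatility targets many times, recompute the MAE of the fixed predictions against each shuffled target to form a null distribution, and ask how often a random ordering scores at least as well as the real one. Function and variable names are illustrative.

```python
import numpy as np

def permutation_pvalue(y_true, y_pred, n_perm=1000, seed=0):
    rng = np.random.default_rng(seed)
    real_mae = np.mean(np.abs(y_true - y_pred))

    # Null distribution: MAE when the target ordering carries no information.
    null_maes = np.array([
        np.mean(np.abs(rng.permutation(y_true) - y_pred))
        for _ in range(n_perm)
    ])

    # One-sided p-value with the standard +1 correction.
    return (np.sum(null_maes <= real_mae) + 1) / (n_perm + 1)
```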
Inspired by David Aronson’s Evidence-Based Technical Analysis, this framework treats trading research as a scientific hypothesis test.
Key ideas:
- Null Hypothesis — every model is assumed random until statistically rejected.
- Out-of-Sample Validation — only consistent rejection across tests counts.
- Multiple Hypothesis Control — ensures feature importance isn’t a fluke.
This approach guards against overfitting and data mining bias.
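One standard way to implement multiple-hypothesis control over per-feature p-values is a Benjamini-Hochberg (FDR) step-up procedure; the sketch below shows the generic correction, which is not necessarily the exact adjustment used in test.py.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    p = np.asarray(pvals, dtype=float)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, len(p) + 1) / len(p)
    passed = p[order] <= thresholds
    # Reject every hypothesis up to the largest rank that passes its threshold.
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    reject = np.zeros(len(p), dtype=bool)
    reject[order[:k]] = True
    return reject
```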
- Returns: log returns
- GARCH Forecasts: 1-step rolling volatility
- Indicators: RSI, MACD, SMA/EMA
- Moments: skewness, kurtosis, Hurst exponent
- Microstructure: volume & range dynamics
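Skewness and kurtosis are straightforward rolling pandas calls; the Hurst exponent can be estimated several ways, and the lag-scaling sketch below is one common approximation (the processing script may use a different estimator).

```python
import numpy as np
import pandas as pd

def rolling_moments(returns: pd.Series, window: int = 60) -> pd.DataFrame:
    return pd.DataFrame({
        "skew": returns.rolling(window).skew(),
        "kurt": returns.rolling(window).kurt(),
    })

def hurst_exponent(series: np.ndarray, max_lag: int = 20) -> float:
    # For a self-similar process, std(x[t+lag] - x[t]) ~ lag**H,
    # so H is the slope of log(std of lagged differences) vs. log(lag).
    lags = np.arange(2, max_lag)
    tau = [np.std(series[lag:] - series[:-lag]) for lag in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(tau), 1)
    return slope
```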
- Probabilistic LSTM with Gaussian mixture outputs
- 30-lag lookback, 4-step sequence horizon
- Dropout 0.5, early stopping with patience=10
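A compact PyTorch sketch of what an LSTM with a Gaussian-mixture head can look like, trained by negative log-likelihood; the hidden size and number of mixture components are illustrative, not the repository's exact configuration.

```python
import torch
import torch.nn as nn

class MixtureLSTM(nn.Module):
    def __init__(self, n_features, hidden=64, n_components=3, dropout=0.5):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        # One (weight logit, mean, log-std) triple per mixture component.
        self.head = nn.Linear(hidden, 3 * n_components)

    def forward(self, x):                      # x: (batch, 30 lags, n_features)
        out, _ = self.lstm(x)
        h = self.dropout(out[:, -1])           # last time step
        logits, mu, log_sigma = self.head(h).chunk(3, dim=-1)
        mix = torch.distributions.Categorical(logits=logits)
        comp = torch.distributions.Normal(mu, log_sigma.exp())
        return torch.distributions.MixtureSameFamily(mix, comp)

def nll_loss(model, x, y):
    # Negative log-likelihood of the observed volatility under the predicted mixture.
    return -model(x).log_prob(y).mean()
```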
| Test | Purpose | Description |
|---|---|---|
| Permutation | Null rejection | Randomly shuffles volatility targets to create null distribution for MAE |
| Brownian Surrogate | Temporal control | Uses Brownian motion surrogates to test if patterns come from serial dependence |
| Rolling Window | Temporal stability | Evaluates significance across time windows |
| Feature Shuffle | Feature importance | Quantifies predictive contribution of each feature |
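To illustrate the Brownian surrogate idea: generate random-walk paths whose increments match the mean and variance of the real returns, so volatility clustering and any other serial dependence is destroyed, then re-score the model on targets derived from those paths. The exact surrogate construction in test.py may differ.

```python
import numpy as np

def brownian_surrogate(returns: np.ndarray, seed=None) -> np.ndarray:
    # I.i.d. Gaussian increments with the same mean/variance as the real returns;
    # the cumulative sum gives a Brownian-motion-like surrogate log-price path.
    rng = np.random.default_rng(seed)
    increments = rng.normal(returns.mean(), returns.std(), size=returns.shape)
    return np.cumsum(increments)
```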
Significant results (p < 0.05):
- Permutation test: 6/8 stocks
- Brownian surrogate: 5/8 stocks
- Log Returns and GARCH forecasts most critical
- Temporal stability >90% across windows
- Strong out-of-sample calibration (σ/MAE ≈ 1–2)
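For context on the σ/MAE ratio: if forecast errors were Gaussian with the predicted standard deviation, the expected absolute error would be σ·√(2/π) ≈ 0.8σ, so a ratio near 1–2 indicates uncertainty estimates roughly consistent with, or slightly wider than, the observed errors. A minimal way to compute the ratio, assuming σ here denotes the mean predicted standard deviation (names illustrative):

```python
import numpy as np

def sigma_over_mae(pred_sigma, y_true, y_pred):
    # Mean predicted standard deviation divided by the out-of-sample MAE.
    mae = np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))
    return float(np.mean(pred_sigma) / mae)
```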
- dataset/processed/ — processed data
- checkpoints/<ticker>/ — model weights
- test_images/<ticker>/ — plots & diagnostics