ML-Powered Volatility-Targeted Investment Fund

A machine learning-driven investment fund implementing adaptive volatility targeting through regime detection and predictive modeling. This project combines Hidden Markov Models (HMM) and Long Short-Term Memory (LSTM) networks to dynamically adjust portfolio volatility targets based on predicted market regimes.

Course: BMF5360 - Applied Machine Learning in Investments
Authors: Christian Masek, Cedric McKeever, Pratardan Agarwal, Parinistha Narula
Institution: National University of Singapore

Note to prof

The key notebooks are located at:

General data cleaning: data/notebooks/
Ridge Regression Volatility component: src/vol/dev_vol_predictor.ipynb
LSTM Notebook: src/lstm_model.ipynb
HMM Notebook: src/HMM/notebook.ipynb

📁 Repository Structure

ML_project/
├── src/                          # Source code
│   ├── HMM/                      # Hidden Markov Model implementation
│   │   ├── model.py              # HMM regime detection model
│   │   ├── features.py           # Feature engineering for HMM
│   │   └── README.md             # Detailed HMM documentation
│   ├── LSTM/                     # LSTM volatility prediction
│   │   ├── main.py               # LSTM model implementation
│   │   └── README.md             # LSTM feature documentation
│   ├── vol/                      # Volatility targeting strategies
│   │   └── targetting.py         # Leverage and volatility targeting
│   ├── utils/                    # Utility functions
│   │   ├── analytics.py          # Performance analytics
│   │   ├── logger.py             # Logging configuration
│   │   └── utils.py              # Common utilities
│   └── generate.py               # Main script to generate results
├── data/                         # Data directory
│   ├── raw/                      # Raw data files
│   ├── cleaned/                  # Processed data
│   └── notebooks/                # Exploratory data analysis notebooks
├── materials/                    # Project deliverables
│   ├── final/                    # Final report (LaTeX)
│   └── midterm_update/           # Midterm presentation materials
├── pyproject.toml                # Poetry dependency management
└── TASK.md                       # Project guidelines and requirements

🚀 Getting Started

Prerequisites

Python 3.12 or higher
Poetry for dependency management

Installation

Clone the repository:

git clone https://github.com/McKeev/ML_project_submission.git
cd ML_project_submission

Install dependencies using Poetry:
```
poetry install
```
Activate the virtual environment (with Poetry 2.x, this prints the activation command, which you then run in your shell):
```
poetry env activate
```

Quick Start

Generate all tables and figures for the final report:

poetry run generate

This command will:

Run the HMM regime detection model
Execute LSTM volatility predictions
Perform backtests on volatility-targeted strategies
Generate LaTeX tables and figures in materials/final/

📊 Data Sources

The dataset spans 2003-2025 and is drawn from the following sources:

Bloomberg Terminal: VIX, Implied Correlations (3m)
Refinitiv: SPY historical prices and returns
Federal Reserve Economic Data (FRED):
- Treasury yields (2Y, 5Y, 10Y)
- Overnight rates (SOFR)
- ICE BofA High Yield Spreads
- Economic Market Volatility Index (EMVMACROBUS)
Yahoo Finance: Additional market data
Proprietary calculations: GARCH volatility, Parkinson's volatility, technical indicators
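As an illustration of one of the in-house volatility measures listed above, the following is a minimal sketch of the Parkinson range-based estimator, which uses each period's high and low rather than closing prices. The function name, the sample prices, and the 252-day annualization factor are assumptions for this example and are not taken from the project's code.

```python
import math


def parkinson_volatility(highs, lows, periods_per_year=252):
    """Annualized Parkinson volatility from per-period high and low prices."""
    if len(highs) != len(lows) or not highs:
        raise ValueError("highs and lows must be non-empty and equally long")
    # Squared log high/low range for each period
    sq_ranges = [math.log(h / l) ** 2 for h, l in zip(highs, lows)]
    # Parkinson estimator: variance = mean(ln(H/L)^2) / (4 ln 2)
    per_period_var = sum(sq_ranges) / (4.0 * math.log(2.0) * len(sq_ranges))
    return math.sqrt(per_period_var * periods_per_year)


# Made-up daily highs and lows, for illustration only
highs = [101.0, 102.5, 101.8, 103.2]
lows = [99.5, 100.9, 100.2, 101.0]
vol = parkinson_volatility(highs, lows)
```

The estimator assumes no drift and continuous trading, so it tends to understate volatility when prices gap between sessions.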

All data is resampled to weekly frequency (Friday close) and stored in data/cleaned/.
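The weekly alignment can be sketched as follows. This is a standard-library illustration under the assumption that each week runs Saturday through Friday and that the last available observation stands in when Friday itself is missing; the project's own preprocessing may differ, and the prices are made up. With pandas, `resample("W-FRI").last()` does a comparable job.

```python
from datetime import date, timedelta


def to_weekly_friday(daily):
    """Collapse {date: value} daily observations to one value per week.

    Each observation is assigned to the Friday that ends its week
    (Saturday through Friday). The last observation in the week wins,
    so a Thursday value is used when Friday has no data.
    """
    weekly = {}
    for day in sorted(daily):
        # weekday(): Monday=0 ... Friday=4 ... Sunday=6
        days_to_friday = (4 - day.weekday()) % 7
        friday = day + timedelta(days=days_to_friday)
        weekly[friday] = daily[day]  # later days overwrite earlier ones
    return weekly


# Made-up closing prices; 2024-01-05 is a Friday, 2024-01-12 has no data
daily_close = {
    date(2024, 1, 2): 100.0,
    date(2024, 1, 3): 101.0,
    date(2024, 1, 4): 102.0,
    date(2024, 1, 5): 103.0,
    date(2024, 1, 8): 104.0,
    date(2024, 1, 11): 105.0,
}
weekly_close = to_weekly_friday(daily_close)
```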

🧠 Methodology

1. Hidden Markov Model (HMM) Regime Detection

The HMM identifies latent market regimes based on:

Yield curve slope (2Y-10Y Treasury spread)
Lagged returns
Realized volatility
Distance from moving averages

Key Implementation:

from src.HMM.model import HMMRegimePredictor, rolling_window_predict
from src.HMM.features import features_df

# Load features
features = features_df(['SLOPE_2Y_10Y', 'LRET', 'RealVol', 'DIST_MA3m'])

# Initialize and fit model
model = HMMRegimePredictor(n_regimes=3)
model.prepare_data(features).fit()

# Generate volatility targets
vol_results = model.get_vol_targets(base_vol_target=10)
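To make the regime-detection idea concrete, here is a minimal sketch of forward filtering in a two-state Gaussian HMM on a single return series. It is not the project's HMMRegimePredictor: the function names are hypothetical, and the transition matrix, means, and standard deviations are fixed by hand here, whereas a fitted model would estimate them from data.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def filter_regimes(observations, start_prob, trans, means, stds):
    """Return P(regime at t | observations up to t) for every t."""
    n = len(start_prob)
    prior = list(start_prob)
    filtered = []
    for x in observations:
        # Weight the predicted regime probabilities by the emission likelihood
        joint = [prior[k] * gaussian_pdf(x, means[k], stds[k]) for k in range(n)]
        total = sum(joint)
        posterior = [p / total for p in joint]
        filtered.append(posterior)
        # One-step-ahead prediction is the prior for the next observation
        prior = [sum(posterior[j] * trans[j][k] for j in range(n)) for k in range(n)]
    return filtered


# Hand-picked illustrative parameters: regime 0 = calm, regime 1 = turbulent
start_prob = [0.5, 0.5]
trans = [[0.95, 0.05], [0.10, 0.90]]
means = [0.002, -0.004]
stds = [0.01, 0.04]

weekly_returns = [0.004, 0.001, -0.003, -0.09, 0.07, -0.06]  # made up
probs = filter_regimes(weekly_returns, start_prob, trans, means, stds)
```

Because each filtered probability uses only observations up to that date, this form of inference is compatible with the walk-forward setup described later, unlike smoothing or Viterbi decoding over the full sample.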

2. Volatility Targeting

Leverage is adjusted dynamically so that the portfolio's volatility tracks the target level:

from src.vol import backtest_target

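# Backtest with adaptive volatility targets.
# vol_target_series is assumed to hold the per-period volatility targets
# (for example, derived from vol_results in the HMM step above).
results = backtest_target(vol_target_series)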
print(results.performance_table())
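The leverage rule behind volatility targeting can be sketched in a few lines. This is an illustrative version that assumes leverage equals the target volatility divided by the forecast volatility, capped at a maximum; the function names and the 2x cap are hypothetical and do not describe the project's backtest_target, and financing and transaction costs are ignored.

```python
def target_leverage(target_vol, forecast_vol, max_leverage=2.0):
    """Exposure that scales forecast volatility to the target, capped."""
    if forecast_vol <= 0:
        raise ValueError("forecast_vol must be positive")
    return min(target_vol / forecast_vol, max_leverage)


def apply_vol_target(returns, forecast_vols, target_vols, max_leverage=2.0):
    """Scale each period's return by the leverage chosen for that period.

    forecast_vols[t] and target_vols[t] must be known before returns[t]
    is realized, otherwise the backtest looks ahead.
    """
    return [
        target_leverage(tv, fv, max_leverage) * r
        for r, fv, tv in zip(returns, forecast_vols, target_vols)
    ]


# Volatilities in annualized percent, returns as fractions (made-up values)
half = target_leverage(10, 20)   # forecast is twice the target: halve exposure
capped = target_leverage(10, 4)  # uncapped ratio would be 2.5
scaled = apply_vol_target([0.02, -0.01], [20, 4], [10, 10])
```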

3. Walk-Forward Validation

All models use strict walk-forward analysis:

Training Window: Rolling historical data
Test Window: 2020-01-01 to 2025-08-31
No Look-Ahead Bias: Predictions use only past information
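A rolling walk-forward split can be sketched as below. The generator name and window sizes are assumptions for illustration; the property it demonstrates is that every training index precedes every test index in the same split.

```python
def walk_forward_splits(n_obs, train_size, test_size=1):
    """Yield (train_indices, test_indices) for a rolling window."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide the window forward by one test block


# Six observations, three-period training window, one-period test window
splits = list(walk_forward_splits(n_obs=6, train_size=3))
for train, test in splits:
    # Fit on `train`, predict on `test`; the model never sees later data
    assert max(train) < min(test)
```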

📈 Performance Metrics

The project evaluates strategies using:

Sharpe Ratio: Risk-adjusted returns
Maximum Drawdown: Peak-to-trough decline
Volatility: Realized annualized volatility
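For reference, the three metrics can be computed from a series of periodic returns as in the sketch below. It assumes weekly returns (52 periods per year), simple compounding, and a per-period risk-free rate; the function names are illustrative and are not those of src/utils/analytics.py.

```python
import math
import statistics

PERIODS_PER_YEAR = 52  # weekly data


def annualized_volatility(returns, periods_per_year=PERIODS_PER_YEAR):
    """Sample standard deviation of returns, scaled to a yearly horizon."""
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


def sharpe_ratio(returns, risk_free_per_period=0.0,
                 periods_per_year=PERIODS_PER_YEAR):
    """Annualized mean excess return per unit of annualized volatility."""
    excess = [r - risk_free_per_period for r in returns]
    per_period = statistics.mean(excess) / statistics.stdev(excess)
    return per_period * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


sample = [0.01, -0.02, 0.015, 0.005]  # made-up weekly returns
vol = annualized_volatility(sample)
sharpe = sharpe_ratio(sample)
drawdown = max_drawdown([0.10, -0.50, 0.20])  # falls by half from the peak
```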

Results are automatically generated in LaTeX format for academic reporting.
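One way such a table could be rendered is sketched below using only string formatting. The function name and the numbers are placeholders, not project results, and the project's own performance_table() may work differently; pandas users could reach for `DataFrame.to_latex` instead.

```python
def performance_table_latex(rows):
    """Render {strategy: {metric: value}} as a LaTeX tabular string."""
    metrics = list(next(iter(rows.values())))
    lines = [
        r"\begin{tabular}{l" + "r" * len(metrics) + "}",
        r"\hline",
        " & ".join(["Strategy"] + metrics) + r" \\",
        r"\hline",
    ]
    for name, values in rows.items():
        # Two decimals for every metric keeps the columns aligned
        cells = [name] + [f"{values[m]:.2f}" for m in metrics]
        lines.append(" & ".join(cells) + r" \\")
    lines += [r"\hline", r"\end{tabular}"]
    return "\n".join(lines)


# Placeholder values purely to show the output format
table = performance_table_latex({
    "Strategy A": {"Sharpe": 1.0, "Max DD": -0.1},
    "Strategy B": {"Sharpe": 0.5, "Max DD": -0.2},
})
```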

🔧 Key Dependencies

Core libraries (managed via Poetry):

pandas = ">=2.3.3"           # Data manipulation
numpy = ">=2.3.3"            # Numerical computing
scikit-learn = ">=1.7.2"     # Machine learning utilities
hmmlearn = ">=0.3.3"         # Hidden Markov Models
matplotlib = ">=3.10.7"      # Visualization
yfinance = ">=0.2.66"        # Market data
arch = ">=8.0.0"             # GARCH models
polars = ">=1.34.0"          # High-performance data frames

📝 LaTeX Report Generation

The project includes automated LaTeX report generation:

Navigate to the materials directory:
```
cd materials/final
```
Build the PDF report:
```
make
```

The report includes:

Comprehensive methodology description
Auto-generated performance tables
Publication-quality figures
Complete bibliography

Format Requirements:

8-12 pages (excluding cover, TOC, references)
Times New Roman, 11pt, 1.5 line spacing
1-inch margins

📚 Documentation

Detailed documentation is available for each component:

HMM Module: src/HMM/README.md - Complete HMM API and examples
LSTM Module: src/LSTM/README.md - Feature descriptions and model architecture
Data Notes: data/data_notes.md - Data sources and preprocessing
Final Report: materials/final/README.md - LaTeX compilation instructions

Note: This README provides a high-level overview. For detailed technical documentation, please refer to the individual module README files and code comments.
