This repository contains the code for the paper Split Conformal Prediction and Non-Exchangeable Data.
- All scripts should be run from the root directory (
split-conformal-nonexchangeable/). - Packages listed in
requirements.txtshould be installed via your python package manager of choice. - Python 3.11 is required.
- src: project source code (pushed to the repository).
- data: raw and processed datasets.
- eval: plots, tables and other output used to evaluate conformal methods.
We give below a simple example of how the code in this repository can be used for data generation, conformal prediction and coverage evaluation.
from src.models import ConformalQR, RandomForestQR
from src.utils.data import get_synthetic
from src.utils.eval import empirical_coverage
# Generate data from a nonexchangeable autoregressive process
data = get_synthetic("ar1", N=1500, lags=10, seed=0, phi=0.8)
# Split the data into features and target
X, y = data.drop("target", axis=1), data["target"]
# Partition into training, calibration and test sets
X_train, y_train = X[:500], y[:500]
X_cal, y_cal = X[500:1000], y[500:1000]
X_test, y_test = X[1000:], y[1000:]
# Fit, calibrate and generate prediction intervals for a prescribed coverage level of 90%
cqr = ConformalQR(Model=RandomForestQR, alpha=0.1, seed=0)
cqr.fit(X_train, y_train)
cqr.calibrate(X_cal, y_cal)
y_pred_lower, y_pred_upper = cqr.predict(X_test)
# Evaluate coverage
empirical_coverage(y_test, y_pred_lower, y_pred_upper)
- Run
sh src/paper/synthetic-marginal_coverage.shto generate the results. - Run
python src/eval/synthetic/plot-marginal_coverage.py --stochastic_process ar1to plot. - Run
python src/eval/synthetic/plot-marginal_coverage.py --stochastic_process two_state_markov_chainto plot.
- Run
python src/data/download_temp.py -d 19790101-20221231to download the data. - Run
python src/data/process_temp.pyto generate the climatology datasets.
- Run
python src/eval/climatology/split_cp.pyto generate the results. - Run
python src/eval/climatology/plot-split_cp-maps.pyto plot.
- Run
python src/eval/synthetic/plot-rolling_coverage-split_cp-enbpi.pyto plot.
- Run
sh src/paper/climatology_benchmarks.shto generate the results. - Run
python src/eval/climatology/table-benchmarks.pyto create the table.
- Run
sh src/paper/theoretical_guarantee.shto generate the results. - Run
python src/eval/synthetic/plot-theoretical_guarantee.pyto plot.
- Run
sh src/paper/synthetic-empirical_coverage.shto generate the results. - Run
python src/eval/synthetic/plot-empirical_coverage.pyto plot.
- Run
python src/data/download_forex.pyto download the data. - Run
sh src/paper/forex_datasets-process.shto generate the datasets.
- Run
python src/eval/real/marginal_coverage.py --dataset eurusdto generate the results. - Run
python src/eval/real/marginal_coverage.py --dataset bcousdto generate the results. - Run
python src/eval/real/marginal_coverage.py --dataset spxusdto generate the results. - Run
python src/eval/real/plot-marginal_coverage.pyto plot.
- Run
sh src/paper/real-conditional_coverage.shto generate the results. - Run
python src/eval/real/table-conditional_coverage.pyto create the table.