26 changes: 26 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,26 @@
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the version of Python and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.11"

# Build documentation in the docs/ directory with Sphinx
sphinx:
  configuration: docs/conf.py
  fail_on_warning: false

# Optionally build your docs in additional formats such as PDF and ePub
formats:
  - pdf
  - epub

# Install documentation dependencies
python:
  install:
    - requirements: docs/requirements.txt
46 changes: 46 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,46 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.3.0] - 2025-01-07

### Added
- `CDFSplineCalibrator` for multi-class probability calibration (based on Gupta et al. 2021 ICLR paper)
- Comprehensive Sphinx documentation
- Read the Docs configuration
- Examples for calibration usage
- Type hints throughout the codebase
- Project management files (CONTRIBUTING.md, CHANGELOG.md, MANIFEST.in, setup.cfg)
- Makefile for common development tasks
- Improved README with badges and better structure

### Changed
- Restructured package: moved estimators to `splinator.estimators` submodule
- Updated examples to use new API
- Enhanced project metadata in pyproject.toml

### Fixed
- Fixed typo in project description ("spine" → "spline")

## [0.2.0] - 2024-01-XX

### Changed
- Migrated from PDM to Hatchling as build backend
- Updated dependency version constraints

### Added
- Additional example notebooks
- Metrics module with calibration metrics

## [0.1.0] - Initial Release

### Added
- `LinearSplineLogisticRegression` estimator
- Basic scikit-learn compatibility
- Initial test suite
- Example notebooks
80 changes: 80 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,80 @@
# Contributing to Splinator

Thank you for your interest in contributing to Splinator! This document provides guidelines and instructions for contributing.

## Development Setup

1. **Fork and clone the repository**
```bash
git clone https://github.com/yourusername/splinator.git
cd splinator
```

2. **Create a virtual environment**
```bash
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
```

3. **Install the package in development mode**
```bash
pip install -e ".[dev]"
```

## Code Style

- Follow PEP 8 style guidelines
- Use type hints where possible
- Add docstrings to all public functions and classes (NumPy style)
- Maximum line length: 120 characters
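
As an illustration of the style points above, here is a minimal sketch of a type-hinted function with a NumPy-style docstring. The helper `clip_probabilities` is hypothetical and is not part of Splinator; it only shows the expected layout of the `Parameters`, `Returns`, and `Raises` sections.

```python
from typing import List, Sequence


def clip_probabilities(probs: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Clip probabilities away from 0 and 1.

    Hypothetical helper used only to illustrate the docstring style.

    Parameters
    ----------
    probs : Sequence[float]
        Predicted probabilities, each expected to lie in [0, 1].
    eps : float, default=1e-6
        Distance from 0 and 1 to enforce; must satisfy 0 < eps < 0.5.

    Returns
    -------
    List[float]
        Probabilities clipped to the interval [eps, 1 - eps].

    Raises
    ------
    ValueError
        If ``eps`` is not in the open interval (0, 0.5).
    """
    if not 0.0 < eps < 0.5:
        raise ValueError(f"eps must be in (0, 0.5), got {eps}")
    # Clamp each value from below by eps and from above by 1 - eps.
    return [min(max(p, eps), 1.0 - eps) for p in probs]
```

A one-line summary, a blank line, and then underlined sections is the layout that Sphinx's `numpydoc` and `napoleon` extensions parse.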

## Testing

Run the tests before submitting a PR:
```bash
pytest tests/
```

Run with coverage:
```bash
pytest --cov=splinator tests/
```

## Type Checking

Run mypy for type checking:
```bash
mypy src/splinator
```

## Submitting Changes

1. Create a feature branch
```bash
git checkout -b feature/your-feature-name
```

2. Make your changes and commit with clear messages
```bash
git commit -m "Add feature: brief description"
```

3. Push to your fork and create a Pull Request

## Pull Request Guidelines

- Include tests for new functionality
- Update documentation if needed
- Ensure all tests pass
- Add a clear description of changes
- Reference any related issues

## Reporting Issues

- Use GitHub Issues to report bugs
- Include Python version and minimal reproducible example
- Describe expected vs actual behavior

## Questions?

Feel free to open an issue for any questions about contributing!
9 changes: 9 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,9 @@
include LICENSE
include README.md
include pyproject.toml
recursive-include docs *.rst *.txt *.py
recursive-include tests *.py
recursive-include examples *.py *.ipynb
global-exclude __pycache__
global-exclude *.py[co]
global-exclude .DS_Store
53 changes: 53 additions & 0 deletions Makefile
@@ -0,0 +1,53 @@
.PHONY: help install install-dev test test-cov lint format type-check docs clean build

help:
	@echo "Available commands:"
	@echo "  install      Install the package"
	@echo "  install-dev  Install the package with development dependencies"
	@echo "  test         Run tests"
	@echo "  test-cov     Run tests with coverage report"
	@echo "  lint         Run linting (flake8)"
	@echo "  format       Format code with black and isort"
	@echo "  type-check   Run type checking with mypy"
	@echo "  docs         Build documentation"
	@echo "  clean        Clean build artifacts"
	@echo "  build        Build distribution packages"

install:
	pip install -e .

install-dev:
	pip install -e ".[dev,docs]"

test:
	pytest tests/

test-cov:
	pytest --cov=splinator --cov-report=html --cov-report=term tests/

lint:
	flake8 src/ tests/

format:
	black src/ tests/
	isort src/ tests/

type-check:
	mypy src/splinator

docs:
	cd docs && make clean && make html

clean:
	rm -rf build/
	rm -rf dist/
	rm -rf *.egg-info
	rm -rf .coverage
	rm -rf htmlcov/
	rm -rf .pytest_cache/
	rm -rf .mypy_cache/
	find . -type d -name __pycache__ -exec rm -rf {} +
	find . -type f -name "*.pyc" -delete

build: clean
	python -m build
125 changes: 87 additions & 38 deletions README.md
@@ -1,66 +1,115 @@
# Splinator 📈

**Probablistic Calibration with Regression Splines**
**Probabilistic Calibration with Regression Splines**

[scikit-learn](https://scikit-learn.org) compatible
A scikit-learn compatible Python library for probability calibration of machine learning models using spline-based methods.

[![pdm-managed](https://img.shields.io/badge/pdm-managed-blueviolet)](https://pdm.fming.dev)
[![PyPI version](https://badge.fury.io/py/splinator.svg)](https://badge.fury.io/py/splinator)
[![Documentation Status](https://readthedocs.org/projects/splinator/badge/?version=latest)](https://splinator.readthedocs.io/en/latest/)
[![Build](https://img.shields.io/github/actions/workflow/status/affirm/splinator/.github/workflows/python-package.yml)](https://github.com/affirm/splinator/actions)
[![Build Status](https://img.shields.io/github/actions/workflow/status/affirm/splinator/.github/workflows/python-package.yml)](https://github.com/affirm/splinator/actions)
[![License](https://img.shields.io/badge/License-BSD_3--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![Python Version](https://img.shields.io/pypi/pyversions/splinator)](https://pypi.org/project/splinator/)

## Features

- **Linear Spline Logistic Regression**: Flexible non-linear classification with automatic knot placement
- **CDF Spline Calibration**: State-of-the-art probability calibration for multi-class classifiers
- **scikit-learn Compatible**: Seamless integration with existing ML pipelines
- **Comprehensive Metrics**: Calibration evaluation tools including ECE and Spiegelhalter's z-statistic
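
For readers unfamiliar with the two metrics named above, the following is a minimal pure-Python sketch of their standard textbook definitions for binary labels. It is not Splinator's metrics module; the function names `expected_calibration_error` and `spiegelhalter_z`, the equal-width binning, and the default of 10 bins are assumptions made for illustration.

```python
import math


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Equal-width binned ECE for binary labels and positive-class probabilities."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # Clamp the index so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))
    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(y for y, _ in members) / len(members)   # empirical positive rate
        predicted = sum(p for _, p in members) / len(members)  # mean predicted probability
        ece += len(members) / n * abs(observed - predicted)
    return ece


def spiegelhalter_z(y_true, y_prob):
    """Spiegelhalter's z; approximately standard normal under perfect calibration."""
    num = sum((y - p) * (1 - 2 * p) for y, p in zip(y_true, y_prob))
    var = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in y_prob)
    return num / math.sqrt(var)


# Toy data: the model is slightly under-confident in both directions.
y_true = [0, 0, 1, 1]
y_prob = [0.2, 0.2, 0.8, 0.8]
ece = expected_calibration_error(y_true, y_prob)  # close to 0.2
z = spiegelhalter_z(y_true, y_prob)               # close to -1.0
```

ECE summarizes the average gap between predicted and observed frequencies, while the z-statistic supports a hypothesis test of calibration; a large absolute value of `z` suggests miscalibration.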

## Installation

`pip install splinator`
```bash
pip install splinator
```

For development installation with extra dependencies:
```bash
pip install "splinator[dev]"
```

## Quick Start

## Algorithm
### Linear Spline Logistic Regression
```python
from splinator.estimators import LinearSplineLogisticRegression
import numpy as np

Supported models:
# Generate sample data
n_samples = 1000
X = np.random.randn(n_samples, 2)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

- Linear Spline Logistic Regression
# Fit model with automatic knot selection
model = LinearSplineLogisticRegression(n_knots=10)
model.fit(X, y)

# Make predictions
predictions = model.predict(X)
probabilities = model.predict_proba(X)
```
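
To make the term "linear spline" concrete, here is a small sketch of the generic hinge-function construction of a piecewise-linear function of one feature, which a logistic model can then use as its log-odds. This is a textbook illustration under assumed names (`hinge_basis`, `piecewise_linear`, `sigmoid`); it does not claim to reproduce how `LinearSplineLogisticRegression` places knots or fits coefficients.

```python
import math


def hinge_basis(x, knots):
    """Return [x, max(0, x - k1), max(0, x - k2), ...] for one scalar input."""
    return [x] + [max(0.0, x - k) for k in knots]


def piecewise_linear(x, intercept, coefs, knots):
    """Evaluate a continuous piecewise-linear function whose slope changes at each knot."""
    return intercept + sum(c * b for c, b in zip(coefs, hinge_basis(x, knots)))


def sigmoid(t):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


# Illustrative values only: slope 1 below 0, slope 0 between 0 and 1, slope 2 above 1.
knots = [0.0, 1.0]
coefs = [1.0, -1.0, 2.0]
log_odds = [piecewise_linear(x, 0.0, coefs, knots) for x in (-1.0, 0.5, 2.0)]
probs = [sigmoid(t) for t in log_odds]
```

Each hinge coefficient is the change in slope at its knot, so more knots allow a more flexible, but still continuous, response curve.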

Supported metrics:
### Probability Calibration
```python
from splinator.estimators import CDFSplineCalibrator
from sklearn.model_selection import train_test_split

- Spiegelhalter’s z statistic
- Expected Calibration Error (ECE)
# Split data for calibration
X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.2)

\[1\] You can find more information in the [Linear Spline Logistic
Regression](https://github.com/Affirm/splinator/wiki/Linear-Spline-Logistic-Regression).
# Train base model and calibrator
base_model = LinearSplineLogisticRegression().fit(X_train, y_train)
calibrator = CDFSplineCalibrator()
calibrator.fit(base_model.predict_proba(X_cal), y_cal)

\[2\] Additional readings
# Apply calibration (X_cal is reused here for brevity; use a separate held-out set in practice)
calibrated_probs = calibrator.transform(base_model.predict_proba(X_cal))
```

- Zhang, Jian, and Yiming Yang. [Probabilistic score estimation with
piecewise logistic
regression](https://pal.sri.com/wp-content/uploads/publications/radar/2004/icml04zhang.pdf).
Proceedings of the twenty-first international conference on Machine
learning. 2004.
- Guo, Chuan, et al. "On calibration of modern neural networks." International conference on machine learning. PMLR, 2017.
## Documentation

Full documentation is available at [splinator.readthedocs.io](https://splinator.readthedocs.io/).

## Examples

| comparison | notebook |
|------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| scikit-learn's sigmoid and isotonic regression | [![colab1](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/Affirm/splinator/blob/main/examples/calibrator_model_comparison.ipynb) |
| pyGAM’s spline model | [![colab2](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/Affirm/splinator/blob/main/examples/spline_model_comparison.ipynb) |
Interactive notebooks demonstrating various features:

## Development
| Topic | Notebook |
|-------|----------|
| Calibrator Comparison | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/Affirm/splinator/blob/main/examples/calibrator_model_comparison.ipynb) |
| Spline Model Comparison | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/Affirm/splinator/blob/main/examples/spline_model_comparison.ipynb) |

The dependencies are managed by [pdm](https://pdm.fming.dev/latest/)
## Contributing

To run tests, run `pdm run -v pytest tests`
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

## Example Usage
## Development

``` python
from splinator.estimators import LinearSplineLogisticRegression
import numpy as np
1. Clone the repository
2. Install in development mode: `pip install -e ".[dev]"`
3. Run tests: `pytest tests/`
4. Check types: `mypy src/splinator`
5. Format code: `black src/ tests/`

# random synthetic dataset
n_samples = 100
rng = np.random.RandomState(0)
X = rng.normal(loc=100, size=(n_samples, 2))
y = np.random.randint(2, size=n_samples)
## Citation

lslr = LinearSplineLogisticRegression(n_knots=10)
lslr.fit(X, y)
If you use splinator in your research, please cite:

```bibtex
@software{splinator,
title = {Splinator: Probabilistic Calibration with Regression Splines},
author = {Xu, Jiarui},
year = {2024},
url = {https://github.com/affirm/splinator}
}
```

## References

- Zhang, J., & Yang, Y. (2004). [Probabilistic score estimation with piecewise logistic regression](https://pal.sri.com/wp-content/uploads/publications/radar/2004/icml04zhang.pdf). In Proceedings of the twenty-first international conference on Machine learning.
- Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). [On calibration of modern neural networks](https://arxiv.org/abs/1706.04599). In International conference on machine learning (pp. 1321-1330). PMLR.
- Gupta, C., Koren, A., & Mishra, K. (2021). [Calibration of Neural Networks using Splines](https://arxiv.org/abs/2006.12800). In International Conference on Learning Representations (ICLR). Official implementation: [kartikgupta-at-anu/spline-calibration](https://github.com/kartikgupta-at-anu/spline-calibration).

## License

This project is licensed under the BSD 3-Clause License - see the [LICENSE](LICENSE) file for details.
4 changes: 2 additions & 2 deletions docs/Makefile
@@ -5,8 +5,8 @@
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help: