jupall/swfilter

swfilter

A Python implementation of the Sliced-Wasserstein Filter developed by Julien Pallage and Antoine Lesage-Landry.



Introduction:

In this work, we present a new unsupervised anomaly (outlier) detection (AD) method based on the sliced-Wasserstein metric. This filtering technique is conceptually interesting for integration into MLOps pipelines that deploy trustworthy machine learning models in critical sectors such as energy. We also propose an approximation of our method using a fast Euclidean variant. The code follows scikit-learn's API and is called like other scikit-learn AD methods, e.g., Isolation Forest and Local Outlier Factor.
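
For reference, the sliced-Wasserstein distance that the method builds on is commonly defined as below, together with its usual Monte Carlo approximation. The notation is ours and is not taken from the package: $\mu$ and $\nu$ are two probability distributions on $\mathbb{R}^d$, $W_p$ is the one-dimensional $p$-Wasserstein distance, $\sigma$ is the uniform measure on the unit sphere $\mathbb{S}^{d-1}$, $\theta_{\#}\mu$ is the distribution of $\langle \theta, x \rangle$ for $x \sim \mu$, and $L$ is the number of random directions $\theta_1, \dots, \theta_L$ drawn from $\sigma$.

```latex
\[
SW_p(\mu, \nu)
= \left( \int_{\mathbb{S}^{d-1}} W_p^p\!\left(\theta_{\#}\mu,\ \theta_{\#}\nu\right) \mathrm{d}\sigma(\theta) \right)^{1/p}
\approx \left( \frac{1}{L} \sum_{\ell=1}^{L} W_p^p\!\left(\theta_{\ell\,\#}\mu,\ \theta_{\ell\,\#}\nu\right) \right)^{1/p}
\]
```

Each term in the sum is a one-dimensional transport problem, which is what makes the sliced distance cheap to evaluate compared with the full Wasserstein distance.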

How it is made:

We use the Python implementation of the sliced-Wasserstein distance from the POT library. A voting system labels each candidate sample as an outlier or an inlier, and joblib parallelizes the procedure.
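
To make the underlying distance concrete, here is a minimal, dependency-free sketch of a Monte Carlo sliced 2-Wasserstein estimate. It is an illustration only and not the package's code, which relies on POT. The function names `random_direction` and `sliced_wasserstein` are hypothetical, and the sketch assumes two samples of equal size with uniform weights.

```python
import math
import random


def random_direction(dim, rng):
    """Draw a direction uniformly on the unit sphere in `dim` dimensions."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def sliced_wasserstein(xs, ys, n_projections=50, seed=0):
    """Monte Carlo estimate of the sliced 2-Wasserstein distance.

    Assumption of this sketch: `xs` and `ys` are equal-length lists of
    points (lists of floats) with uniform weights.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles samples of equal size")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_direction(dim, rng)
        # Project both samples onto the direction and sort them: in 1D the
        # optimal plan pairs the i-th smallest point of each sample.
        px = sorted(sum(a * t for a, t in zip(x, theta)) for x in xs)
        py = sorted(sum(b * t for b, t in zip(y, theta)) for y in ys)
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    # Average the squared 1D distances over projections, then take the root.
    return math.sqrt(total / n_projections)
```

Calling `sliced_wasserstein(sample_a, sample_b, n_projections=50)` on two lists of points returns a non-negative float that is zero when the two samples are identical. More projections reduce the variance of the estimate at a proportional cost.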

How to use it:

For large datasets, we recommend using SmartSplitSlicedWassersteinFilter or FastEuclidianFilter to speed up computations.

from swfilter import SlicedWassersteinFilter

# `dataset` is assumed to be a NumPy array of shape (n_samples, n_features)
eps = 0.01 # the threshold on the SW distance
n = 30 # the number of voters
n_projections = 50 # the number of projections used in the SW computations
p = 0.6 # the fraction of voters required to label a sample as an outlier
n_jobs = -1 # the number of workers used for parallelization (-1 = all available)

model = SlicedWassersteinFilter(eps=eps, n=n, n_projections=n_projections, p=p, n_jobs=n_jobs, swtype='original')
preds, vote = model.fit_predict(dataset)

# keep only the samples predicted as inliers (label 1)
mask = preds == 1
filtered_dataset = dataset[mask]
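
The roles of `eps` and `p` in the example above can be illustrated with a small sketch of the voting rule. This is our reading of the parameter descriptions, not the package's implementation: the helper `vote_label` is hypothetical, the strict comparison against `eps` and the `>=` comparison against `p` are assumptions, and the `-1` outlier label is assumed from scikit-learn's convention (the usage above only shows that samples with `preds == 1` are kept).

```python
def vote_label(distances, eps, p):
    """Turn per-voter SW distances for one sample into a label.

    Hypothetical helper: each voter votes "outlier" when its distance
    exceeds `eps`; the sample is labeled an outlier (-1) when the share
    of outlier votes reaches `p`, and an inlier (1) otherwise.
    """
    if not distances:
        raise ValueError("at least one voter is required")
    outlier_votes = sum(1 for d in distances if d > eps)
    share = outlier_votes / len(distances)
    label = -1 if share >= p else 1
    return label, share


# Five voters, four of which exceed eps = 0.01, so the share is 0.8.
label, share = vote_label([0.02, 0.05, 0.03, 0.015, 0.001], eps=0.01, p=0.6)
```

Under these assumptions, lowering `eps` or `p` makes the filter more aggressive, while raising them keeps more samples.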

Install:

pip install swfilter

Tutorial:

See our tutorial page!

link

Cite our work and read our paper:

@article{pallage2024sliced,
  title={Sliced-Wasserstein-based Anomaly Detection and Open Dataset for Localized Critical Peak Rebates},
  author={Pallage, Julien and Scherrer, Bertrand and Naccache, Salma and B{\'e}langer, Christophe and Lesage-Landry, Antoine},
  journal={arXiv preprint arXiv:2410.21712},
  year={2024}
}
