Robust principal component analysis via Principal Component Pursuit (PCP) with scikit-learn transformer interface.
pip install skpcp

Principal Component Pursuit (PCP) is a method for decomposing a data matrix X into a low-rank component L and a sparse component S, i.e., X = L + S. The skpcp package provides an implementation of PCP with a scikit-learn compatible transformer interface.
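At its core, the algorithm solves the following optimization problem:

$$ \min_{L,S} \|L\|_* + \lambda \|S\|_1 \quad \text{s.t.} \quad X = L + S $$

where $\|L\|_*$ is the nuclear norm of L (the sum of its singular values), $\|S\|_1$ is the entrywise $\ell_1$ norm of S (the sum of the absolute values of its entries), and $\lambda > 0$ is a parameter that balances the two terms.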
For more details, see the original paper "Robust Principal Component Analysis?" by Candès et al. (2011).
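To make the optimization problem concrete, here is a minimal NumPy sketch of one common way to solve it: an augmented Lagrangian (ADMM) iteration that alternates singular value thresholding for L and entrywise soft thresholding for S. This is an illustration only and is not taken from the skpcp source. The function names (`soft_threshold`, `singular_value_threshold`, `pcp_admm`) are hypothetical, and the defaults `lam = 1 / sqrt(max(n1, n2))` and `mu = n1 * n2 / (4 * ||X||_1)` follow the suggestions in Candès et al. (2011); the defaults used by skpcp may differ.

```python
import numpy as np


def soft_threshold(A, tau):
    """Entrywise shrinkage: proximal operator of tau times the l1 norm."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)


def singular_value_threshold(A, tau):
    """Shrink the singular values of A by tau: proximal operator of tau times the nuclear norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt


def pcp_admm(X, lam=None, mu=None, tol=1e-7, max_iter=2000):
    """Illustrative PCP solver (hypothetical helper, not the skpcp implementation).

    Assumes X is a nonzero 2-D array. Returns the pair (L, S).
    """
    X = np.asarray(X, dtype=float)
    n1, n2 = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))          # weight suggested in the paper
    if mu is None:
        mu = n1 * n2 / (4.0 * np.abs(X).sum())    # penalty suggested in the paper

    S = np.zeros_like(X)
    Y = np.zeros_like(X)                          # Lagrange multiplier for X = L + S
    x_norm = np.linalg.norm(X)

    for _ in range(max_iter):
        L = singular_value_threshold(X - S + Y / mu, 1.0 / mu)
        S = soft_threshold(X - L + Y / mu, lam / mu)
        residual = X - L - S
        Y = Y + mu * residual
        # Stop once the constraint X = L + S holds to relative tolerance tol
        if np.linalg.norm(residual) <= tol * x_norm:
            break
    return L, S


# Small synthetic check: low-rank matrix plus sparse, large-magnitude noise
rng = np.random.default_rng(42)
L_true = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 50))
S_true = rng.binomial(1, 0.1, size=(100, 50)) * rng.normal(scale=10, size=(100, 50))
X = L_true + S_true

L_hat, S_hat = pcp_admm(X)
print("relative constraint residual:", np.linalg.norm(X - L_hat - S_hat) / np.linalg.norm(X))
```

Each iteration costs one SVD of a matrix the size of X, which is why the low-rank update dominates the running time in this kind of scheme.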
import numpy as np
from skpcp import PCP
# Generate synthetic data with low-rank and sparse components
RNG = np.random.default_rng(42)
n_samples, n_features, rank = 100, 50, 5
L = np.dot(RNG.normal(size=(n_samples, rank)), RNG.normal(size=(rank, n_features))) # Low rank component
S = RNG.binomial(1, 0.1, size=(n_samples, n_features)) * RNG.normal(loc=0, scale=10, size=(n_samples, n_features)) # Sparse component
X = L + S
# Fit PCP model
pcp = PCP()
pcp.fit(X)
L_est = pcp.low_rank_ # Estimated low-rank component
S_est = pcp.sparse_ # Estimated sparse component

Alternatively, you can use the fit_transform method to fit the model and obtain the low-rank component in one step:

L_est = pcp.fit_transform(X)

Note that the fit method decomposes the input data matrix X into its low-rank component L_est and sparse component S_est.
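When the ground truth is known, as with the synthetic data above, it can help to check a decomposition numerically. The sketch below uses plain NumPy and does not depend on skpcp. The helper names (`relative_error`, `numerical_rank`, `nonzero_fraction`) and their tolerances are assumptions made for illustration. To keep the block runnable on its own, it is demonstrated on the true components rather than on fitted estimates.

```python
import numpy as np


def relative_error(estimate, reference):
    """Frobenius-norm error of estimate relative to reference."""
    return np.linalg.norm(estimate - reference) / np.linalg.norm(reference)


def numerical_rank(A, rel_tol=1e-6):
    """Number of singular values larger than rel_tol times the largest one."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))


def nonzero_fraction(A, abs_tol=1e-8):
    """Fraction of entries whose magnitude exceeds abs_tol."""
    return float(np.mean(np.abs(A) > abs_tol))


# Same synthetic setup as in the usage example
rng = np.random.default_rng(42)
n_samples, n_features, rank = 100, 50, 5
L = rng.normal(size=(n_samples, rank)) @ rng.normal(size=(rank, n_features))
S = rng.binomial(1, 0.1, size=(n_samples, n_features)) * rng.normal(
    loc=0, scale=10, size=(n_samples, n_features)
)
X = L + S

# Diagnostics on the true components; substitute estimates to evaluate a fit
print("rank of L:", numerical_rank(L))
print("nonzero fraction of S:", nonzero_fraction(S))
print("reconstruction error:", relative_error(L + S, X))
```

With a fitted model, passing `pcp.low_rank_` and `pcp.sparse_` in place of `L` and `S` gives the corresponding diagnostics for the estimates, and `relative_error(pcp.low_rank_, L)` measures how well the low-rank component was recovered.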
The behavior of the transform method of PCP differs from that of a typical scikit-learn transformer: it only accepts the same data matrix X that was used in fit. You cannot pass a new data matrix to transform, because the decomposition is specific to the input data used in fit.
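The following toy class sketches what a transform tied to the fit data can look like. It is hypothetical and not part of skpcp: the class name `FitDataOnlyDecomposer`, the use of a truncated SVD as a stand-in decomposition, and the choice to raise a `ValueError` for a different matrix are all assumptions made for illustration. How skpcp itself reacts to a different matrix is not covered here.

```python
import numpy as np


class FitDataOnlyDecomposer:
    """Toy estimator (hypothetical, not skpcp) whose transform only accepts the fit data."""

    def __init__(self, rank=2):
        self.rank = rank

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        # Stand-in decomposition: truncated SVD, used here instead of PCP
        self.low_rank_ = (U[:, : self.rank] * s[: self.rank]) @ Vt[: self.rank]
        self._X_fit = X.copy()  # remembered so transform can verify its input
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        if not np.array_equal(X, self._X_fit):
            raise ValueError(
                "transform only accepts the matrix passed to fit; refit for new data"
            )
        return self.low_rank_

    def fit_transform(self, X):
        return self.fit(X).transform(X)


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
X_new = rng.normal(size=(6, 4))

model = FitDataOnlyDecomposer(rank=2)
L_est = model.fit_transform(X)  # works: same matrix as in fit

try:
    model.transform(X_new)  # a different matrix is rejected by this toy class
    rejected = False
except ValueError:
    rejected = True

# New data gets its own decomposition from a fresh fit
L_new = FitDataOnlyDecomposer(rank=2).fit_transform(X_new)
```

The practical consequence is the same as described above: a new matrix needs a new fit, since there is no learned mapping to reuse.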
Please see the examples and the API reference for more details.
The documentation is built with Sphinx and hosted on GitHub Pages.
To build the HTML pages locally, first make sure you have installed the package with its documentation dependencies:
uv pip install -e .[docs]

Then run the following:
sphinx-build docs docs/_build