
is-it-slop

Fast, accurate AI text detection using TF-IDF and ensemble classifiers. A CLI tool and library, built with Rust and Python, that classifies whether a given text was written by AI or a human.

Features

  • Fast: Rust-based preprocessing and ONNX inference
  • Minimal: Just a ~13 MB ML model + 3 MB vectorizer for preprocessing — no heavy transformer models or GPU required
  • Self-Contained: Single ~35 MB binary with ONNX runtime bundled. No Python, external dependencies, or network access needed at runtime
  • Robust: Trained on 15+ curated datasets
  • Accurate: 96%+ accuracy (F1 0.96, MCC 0.93)
  • Portable: ONNX model embedded in CLI binary
  • Dual APIs: Rust library + Python bindings

Installation

CLI (Rust)

cargo install is-it-slop --features cli

Model artifacts (a 14.1 MB zip archive) are downloaded automatically from GitHub releases at build time.
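
The "embedded in CLI binary" part presumably relies on Rust's compile-time include_bytes! over files the build script unpacks; a minimal sketch of that common pattern, where the artifact file names are assumptions rather than the crate's actual layout:

// Hypothetical sketch: embed build-time artifacts so the binary is self-contained.
// "model.onnx" and "vectorizer.json" are assumed names, not the crate's real layout.
static MODEL_BYTES: &[u8] = include_bytes!(concat!(env!("OUT_DIR"), "/model.onnx"));
static VECTORIZER_BYTES: &[u8] = include_bytes!(concat!(env!("OUT_DIR"), "/vectorizer.json"));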

Python Package

uv add is-it-slop
# or
pip install is-it-slop

Rust Library

cargo add is-it-slop

Quick Start

CLI

is-it-slop "Your text here"
# Output: 0.234 (AI probability)

is-it-slop "Text" --format class
# Output: 0 (Human) or 1 (AI)

Python

from is_it_slop import is_this_slop

result = is_this_slop("Your text here")
print(result.classification)
# Output: Human
print(f"AI probability: {result.ai_probability:.2%}")
# Output: AI probability: 15.23%

Rust

use is_it_slop::Predictor;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let predictor = Predictor::new();
    let prediction = predictor.predict("Your text here")?;
    println!("AI probability: {}", prediction.ai_probability());
    Ok(())
}
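
Building on that, a sketch of batch scoring that uses only the API surface shown above; it assumes Predictor::new() is infallible (as in the snippet) and that predict returns a Result (implied by the ?):

use std::io::{self, BufRead};

use is_it_slop::Predictor;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build the predictor once and reuse it for every input line.
    let predictor = Predictor::new();
    for line in io::stdin().lock().lines() {
        let text = line?;
        let prediction = predictor.predict(&text)?;
        // One tab-separated "score<TAB>text" record per line of stdin.
        println!("{:.3}\t{}", prediction.ai_probability(), text);
    }
    Ok(())
}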

Architecture

Training (Python):
  Texts -> RustTfidfVectorizer -> TF-IDF -> sklearn models -> ONNX

Inference (Rust CLI):
  Texts -> TfidfVectorizer (Rust) -> TF-IDF -> ONNX Runtime -> Prediction
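
For intuition about the TF-IDF step shared by both pipelines, here is the smoothed, L2-normalized weighting that scikit-learn's TfidfVectorizer applies by default; whether RustTfidfVectorizer reproduces it exactly is an assumption:

fn tfidf_row(counts: &[f64], doc_freqs: &[f64], n_docs: f64) -> Vec<f64> {
    // Smoothed idf, scikit-learn's default: idf = ln((1 + n) / (1 + df)) + 1
    let mut row: Vec<f64> = counts
        .iter()
        .zip(doc_freqs)
        .map(|(&tf, &df)| tf * (((1.0 + n_docs) / (1.0 + df)).ln() + 1.0))
        .collect();
    // L2-normalize so document length does not dominate the weights.
    let norm = row.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm > 0.0 {
        row.iter_mut().for_each(|x| *x /= norm);
    }
    row
}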

Why separate artifacts?

  • Vectorizer: Fast Rust preprocessing. Python bindings make it easy to train with the same vectorizer in Python and reuse it from Rust.
  • Model: Portable ONNX format, so no Python runtime is needed at inference time (see the sketch below).
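
A hypothetical sketch of how the two artifacts divide the work at inference time. Every type and method below (Vectorizer, OnnxModel, transform, run) is an illustrative stand-in, not the crate's real API:

// All names here are hypothetical stand-ins for the real internals.
struct Vectorizer; // loaded from the ~3 MB vectorizer artifact
struct OnnxModel;  // loaded from the ~13 MB ONNX artifact

impl Vectorizer {
    fn transform(&self, _text: &str) -> Vec<f32> {
        // Tokenize and build the TF-IDF feature vector in pure Rust.
        unimplemented!("illustrative only")
    }
}

impl OnnxModel {
    fn run(&self, _features: &[f32]) -> f32 {
        // Hand the features to ONNX Runtime; return the AI probability.
        unimplemented!("illustrative only")
    }
}

fn predict(v: &Vectorizer, m: &OnnxModel, text: &str) -> f32 {
    m.run(&v.transform(text))
}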

Training

See notebooks/dataset_curation.ipynb for the datasets used and notebooks/train.ipynb for the training pipeline.

Training draws on multiple diverse datasets, both to avoid overfitting to any single source of human or AI-generated text and to keep the model from simply learning artifacts of specific datasets.

For more details, see the notebooks/ directory.

License

MIT
