Prophet

Prophet is a transformer-based regression model that predicts cellular responses by decomposing each experiment into three parts: cell state, treatment, and functional readout. Trained on extensive screening datasets and designed to scale, it helps identify effective treatments while significantly reducing the number of experiments that must be run.

Model Overview

Prophet decomposes biological experiments into three key components:

Cell state - represented by cell line embeddings derived from gene expression profiles
Treatment - represented by intervention embeddings (e.g., small molecules, genetic perturbations)
Functional readout - the phenotypic measurement being predicted (e.g., viability, IC50)

The model uses a transformer architecture to learn complex interactions between these components and predict experimental outcomes without requiring the experiments to be performed.

Embeddings

Prophet uses three types of embeddings:

Cell line embeddings: 300-dimensional vectors derived from CCLE gene expression data
Intervention embeddings: 500-dimensional vectors representing small molecules or genetic perturbations
Phenotype embeddings: Representations of different readout types (optional)

These embeddings capture the biological properties of each component and allow the model to generalize across different experimental conditions.
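As a rough illustration of how one experiment could be represented with these embeddings, the sketch below assembles a cell line vector and a padded list of intervention vectors, plus a mask marking which slots are real. Only the dimensions (300 and 500) come from the text above. The function `build_experiment`, the lookup tables, the random values, the names `A549`, `drug_a`, and `drug_b`, and the zero-padding scheme are all hypothetical and are not taken from the Prophet code base.

```python
import random

CELL_DIM = 300          # cell line embedding width stated above
INTERVENTION_DIM = 500  # intervention embedding width stated above


def random_vector(dim, rng):
    """Stand-in for a real embedding lookup (values are random placeholders)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def build_experiment(cell_line, interventions, cell_table, intervention_table,
                     max_interventions=2):
    """Assemble one experiment as (cell vector, padded intervention vectors, mask).

    The mask is True for real interventions and False for padding slots.
    """
    if len(interventions) > max_interventions:
        raise ValueError("too many interventions for this sketch")
    cell_vec = cell_table[cell_line]
    vectors = [intervention_table[name] for name in interventions]
    mask = [True] * len(vectors)
    # Pad with zero vectors so every experiment has the same number of slots.
    while len(vectors) < max_interventions:
        vectors.append([0.0] * INTERVENTION_DIM)
        mask.append(False)
    return cell_vec, vectors, mask


rng = random.Random(0)
# Hypothetical lookup tables; real embeddings would be loaded from files.
cell_table = {"A549": random_vector(CELL_DIM, rng)}
intervention_table = {
    "drug_a": random_vector(INTERVENTION_DIM, rng),
    "drug_b": random_vector(INTERVENTION_DIM, rng),
}

single = build_experiment("A549", ["drug_a"], cell_table, intervention_table)
combo = build_experiment("A549", ["drug_a", "drug_b"], cell_table, intervention_table)
print(single[2], combo[2])
```

Keeping a fixed number of slots plus a mask is one common way to feed single and combinatorial treatments through the same model; the actual input format Prophet expects may differ.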

Training

Prophet was trained on a large dataset of cellular response measurements, including:

Drug sensitivity screens (GDSC, PRISM, CTRP)
Genetic perturbation screens (DepMap, Achilles)
Combinatorial perturbation experiments

The model uses a masked attention mechanism to handle variable numbers of perturbations, and it was trained with a cosine learning rate schedule with warmup. Training was performed on NVIDIA A100 GPUs with early stopping based on validation loss.
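The two training ingredients named above can be sketched in plain Python. This is a generic illustration and not Prophet's implementation: `masked_softmax` shows how padded perturbation slots can receive zero attention weight, and `cosine_with_warmup` shows a linear warmup followed by cosine decay. The function names, the step counts, and the base learning rate of 1e-3 are assumptions chosen for the example.

```python
import math


def masked_softmax(scores, mask):
    """Softmax over scores, giving exactly zero weight to masked-out slots."""
    kept = [s for s, keep in zip(scores, mask) if keep]
    if not kept:
        raise ValueError("at least one slot must be unmasked")
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if keep else 0.0 for s, keep in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_with_warmup(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    """Linear warmup to base_lr, then cosine decay to min_lr."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# One real perturbation and one padding slot: the padding gets no attention.
weights = masked_softmax([0.3, 2.0], [True, False])
print(weights)

# Learning rate at a few points of a 100-step run with 10 warmup steps.
for step in (0, 5, 10, 55, 100):
    print(step, cosine_with_warmup(step, 100, 10, base_lr=1e-3))
```

In this sketch the rate starts at zero, reaches the base rate at the end of warmup, and decays to the minimum at the final step. The schedule and masking actually used for training are defined by the repository's training configuration.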

Installation

mamba create -n prophet_env python=3.10
mamba activate prophet_env

git clone https://github.com/theislab/prophet.git
cd prophet
pip install -e .

Quick Start

from prophet import Prophet

# Load a pretrained model (automatically downloads everything)
model = Prophet.from_pretrained("base")

# Ready to predict!
predictions = model.predict(your_data)

Available Models

See all available models and configurations:

Prophet.list_models()

Prophet provides pretrained models for various datasets including:

base: General purpose pretrained model (recommended for most users)
GDSC, CTRP, PRISM: Drug sensitivity datasets
LINCS, JUMP: Gene expression perturbation datasets
Horlbeck: CRISPR screening data
And more...

Each model can be loaded with different configurations (split type, fold, seed):

# Load with specific configuration
model = Prophet.from_pretrained(
    model_name="GDSC",
    split="perturbations",  # or "cell_lines"
    fold=0,  # 0-4
    seed=110  # 110, 1995, or 2024
)
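To evaluate across every configuration, it can help to enumerate the split, fold, and seed values listed in the comments above. The sketch below builds that grid without importing Prophet. The helper `configuration_grid` is hypothetical, and it is an assumption that every combination is published for every model, so check `Prophet.list_models()` before relying on it.

```python
from itertools import product

# Values taken from the comments in the snippet above.
SPLITS = ("perturbations", "cell_lines")
FOLDS = range(5)            # 0-4
SEEDS = (110, 1995, 2024)


def configuration_grid(model_name):
    """Return keyword arguments for every split/fold/seed combination."""
    return [
        {"model_name": model_name, "split": split, "fold": fold, "seed": seed}
        for split, fold, seed in product(SPLITS, FOLDS, SEEDS)
    ]


configs = configuration_grid("GDSC")
print(len(configs))
print(configs[0])
```

Each dictionary uses the same keyword names as the call above, so it could be passed along as `Prophet.from_pretrained(**config)`.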

Tutorials and Examples

For detailed examples and workflows, check out:

Getting Started Tutorial - Complete walkthrough
Fine-tuning Guide - Adapt models to your data

Advanced: Manual Download

For advanced users who need direct file access, model checkpoints and embeddings are available at:

Citation

If you use Prophet in your research, please cite our preprint.
