Code to reproduce the analyses from the Coralysis manuscript:
António GG Sousa, Johannes Smolander, Sini Junttila, Laura L Elo (2025).
Coralysis enables sensitive identification of imbalanced cell types and states in single-cell data via multi-level integration. bioRxiv. https://doi.org/10.1101/2025.02.07.637023
All the scripts required to reproduce the Coralysis manuscript
figures described in the table below are available in this repository
under the folder scripts, with the exception of the code
from the last row, which has its own repository and respective
documentation at
elolab/scib-pipeline.
All the datasets used in the manuscript are publicly available online,
and the links are provided in the table below. A more detailed
description of the datasets and their respective references can be found
at data/code_datasets_description.tsv.
| Scripts | Description | Data |
|---|---|---|
| 01_figure_1.R | R script to make Figure 1 | Assay V1; Assay V2 |
| 02_extdata_figure_2.R | R script to make Supplementary Figure S1 | Assay V1; Assay V2 |
| 03_figure_2.R | R script to make Figure 3B & Supplementary Figure S12 | ifnb from SeuratData (parsed) |
| 04_extdata_figure_3_4_5.R | R script to make Figure 2 & Supplementary Figure S8-S11 | Figshare |
| 05_figure_3.R | R script to make Figure 5A-G,I | GitHub |
| 06_figure_3.py | Python script to make Figure 5H,M & Supplementary Figure S15 | GitHub |
| 07_figure_4.R | R script to make Figure 6 | panc8 from SeuratData |
| 08_extdata_figure_6.R | R script to make Supplementary Figure S16,S17 | Figshare |
| 09_extdata_figure_7.R | R script to make Supplementary Figure S18,S19 | Figshare |
| 10_extdata_figure_8.R | R script to make Supplementary Figure S20 | ifnb from SeuratData |
| 11_figure_5.R | R script to make Figure 7 | pbmcsca from SeuratData |
| 12_figure_6.R | R script to make Figure 8 & Supplementary Figure S23 | Zenodo; Figshare |
| 13_mapref_switching_refquery.R | R script to make Supplementary Figure S21,S22 | panc8 from SeuratData; ifnb from SeuratData; Figshare |
| 14_benchmark_imbalance.R | R script to make Figure 4B & Supplementary Figure S14 | Figshare |
| helper_functions.R | R script with custom R functions | |
| elolab/scib-pipeline | Snakemake workflow to benchmark Coralysis - Figure 3C,4A & Supplementary Figure S2-S7,S13 | Figshare; ifnb from SeuratData (parsed); Figshare (parsed) |
All R scripts were run with R version 4.2.1 under the RStudio Server
environment (v.2022.07.2 Build 576), deployed through the Docker image
elolab/sctoolkit.
The analyses can be reproduced using the
Coralysis version corresponding
to the commit 47f1b3415663ee895df188f264ac4d8ad8d24c11, which can be
installed as follows:
devtools::install_github("elolab/Coralysis", ref = "47f1b3415663ee895df188f264ac4d8ad8d24c11")

The remaining R package dependencies and their respective versions can
be found at the end of every R script (obtained with the R command
sessionInfo()). The R scripts were run in the order given by the numeric
prefix of each file name: 01_figure_1.R first, then
02_extdata_figure_2.R, and so on.
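The run order implied by the file name prefixes can be derived programmatically. The following is a minimal sketch rather than part of the repository: the function name `ordered_r_scripts` is hypothetical, and printing `Rscript scripts/<name>` assumes the scripts can be run non-interactively, whereas the original analyses were run in RStudio Server.

```python
import re

# Script names as listed in the table of this README.
SCRIPTS = [
    "01_figure_1.R", "02_extdata_figure_2.R", "03_figure_2.R",
    "04_extdata_figure_3_4_5.R", "05_figure_3.R", "06_figure_3.py",
    "07_figure_4.R", "08_extdata_figure_6.R", "09_extdata_figure_7.R",
    "10_extdata_figure_8.R", "11_figure_5.R", "12_figure_6.R",
    "13_mapref_switching_refquery.R", "14_benchmark_imbalance.R",
    "helper_functions.R",
]

# Matches R scripts that start with a numeric prefix, e.g. "01_figure_1.R".
PREFIXED_R_SCRIPT = re.compile(r"^(\d+)_.+\.R$")


def ordered_r_scripts(names):
    """Return the prefixed R scripts sorted by their numeric prefix.

    Files without a numeric prefix (helper_functions.R) and non-R files
    (06_figure_3.py) are skipped.
    """
    numbered = []
    for name in names:
        match = PREFIXED_R_SCRIPT.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


# Print the commands instead of executing them (assumed invocation).
for name in ordered_r_scripts(SCRIPTS):
    print(f"Rscript scripts/{name}")
```

Sorting on the integer value of the prefix, rather than on the raw string, keeps the order correct even if a prefix is not zero-padded.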
The Python script (06_figure_3.py) was run with Python version
3.9.16, along with the packages numpy (v.1.26.4), scanpy (v.1.10.3),
and scib_metrics (v.0.5.1).
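Before running the Python script, it may be worth checking that the installed packages match the versions stated above. The sketch below is illustrative only: `find_mismatches` is a hypothetical helper, and the distribution name `scib-metrics` is an assumption about how the package is registered in the environment.

```python
from importlib import metadata

# Versions stated in this README. The distribution name "scib-metrics"
# is an assumption; adjust it if the package is registered differently.
PINNED = {"numpy": "1.26.4", "scanpy": "1.10.3", "scib-metrics": "0.5.1"}


def find_mismatches(pinned, get_version=metadata.version):
    """Map each package whose installed version differs from the pinned one
    to a tuple (expected, installed); installed is None if it is missing."""
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


# Report any deviation from the pinned versions in the current environment.
for package, (expected, installed) in find_mismatches(PINNED).items():
    print(f"{package}: expected {expected}, found {installed or 'not installed'}")
```

The `get_version` parameter exists only so the check can be exercised without the packages being installed; in normal use the default lookup is sufficient.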
The elolab/scib-pipeline
benchmark was run with Snakemake (v.7.25.2) in a cluster environment
using Slurm (v.23.02.6) with 8 or 100 threads and 354 or 384 GB of
RAM. The respective conda environments can be found under envs
(scib-R4.1.yml
and
scib-pipeline-R4.1.yml).
The configuration files used for the benchmark are available at:
After activating the Snakemake conda environment, the benchmark was
initiated with the following commands:
# Main benchmark across 6 datasets plus the ifnb dataset comprising unshared similar cell type pairs
snakemake --configfile configs/benchmark_coralysis.yaml --cores 8

# Benchmark imbalanced cell type integration
snakemake --configfile configs/benchmark_imbalance.yaml --cores 30

The integration performance was summarised and visualised using custom R functions from Luecken et al. (2022), available at: https://github.com/theislab/scib-reproducibility/tree/main/visualization.
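Because the benchmark runs are long, it can be useful to preview what Snakemake would execute before launching them. The sketch below only builds and prints the two benchmark commands given above with Snakemake's dry-run flag (`-n`) appended; the helper `snakemake_command` is hypothetical and is not part of elolab/scib-pipeline.

```python
import shlex

# Config files and core counts taken from the benchmark commands in this README.
BENCHMARKS = [
    ("configs/benchmark_coralysis.yaml", 8),
    ("configs/benchmark_imbalance.yaml", 30),
]


def snakemake_command(configfile, cores, dry_run=True):
    """Build the snakemake argument list; -n requests a dry run."""
    command = ["snakemake", "--configfile", configfile, "--cores", str(cores)]
    if dry_run:
        command.append("-n")
    return command


# Print shell-ready commands; nothing is executed here.
for configfile, cores in BENCHMARKS:
    print(shlex.join(snakemake_command(configfile, cores)))
```

Calling the helper with `dry_run=False` yields the same commands as listed above.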
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no.: 955321.