This repository implements an RNA-seq pipeline that can:
- map reads in `.fastq.gz` format with HISAT2
- quantify gene expression
- produce quality control plots with R
This pipeline should be run in our ngs container:
- Make sure [docker](https://www.docker.com) is installed on your system and is running.
- Download the source code of this repository containing the rnaseq pipeline. Alternatively, if git is installed on your system, clone it with the command:
  `git clone 'https://github.com/BioinfoSupport/rnaseq.git' my_new_project`
- Run the container:
  `docker compose up -d`
- Connect to the RStudio GUI running within the container at URL http://localhost:8787
- Run app.R to download FASTQ files from the iGE3 genomic platform and run quantification. Alternatively, copy your `.fastq.gz` files into the subfolder `data/fastq/`.
- Run the notebooks in `src/` to generate QC reports.
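For the manual route (copying your own reads instead of using app.R), a minimal shell sketch could look like the following. The function name `stage_fastq`, the run name and the source directory are hypothetical and chosen only for illustration. The one thing taken from this README is that reads live in a subfolder of `data/fastq/`.

```shell
# Copy every .fastq.gz file from a source directory into data/fastq/<run>/.
# Both arguments are placeholders chosen for this example.
stage_fastq() {
  src_dir="$1"
  run_name="$2"
  dest_dir="data/fastq/${run_name}"
  mkdir -p "$dest_dir" || return 1
  # Only gzip-compressed FASTQ files are copied; other files are ignored.
  find "$src_dir" -maxdepth 1 -type f -name '*.fastq.gz' -exec cp {} "$dest_dir"/ \;
  # Print the number of staged files so an empty copy is easy to spot.
  find "$dest_dir" -maxdepth 1 -type f -name '*.fastq.gz' | wc -l
}
```

Run from the project root, `stage_fastq /path/to/reads my_run` would place the files in `data/fastq/my_run/`. Assuming the target pattern used elsewhere in this README also applies to that folder, the reads could then be processed with `docker compose exec rnaseq make data/fastq/my_run/RNASEQ.ALL`.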
data/
| ref/ folder containing reference genome subdirectories.
| | Dd+Mm/ a reference genome folder for _Dictyostelium discoideum_ and _Mycobacterium
| | marinum_. Example genome folders can be found in our [`genomes` repository]
| | (https://github.com/BioinfoSupport/genomes/releases). If you are using a
| | known reference genome, it will be downloaded automatically from this repository.
| fastq/ folder containing sequenced reads (.fastq.gz)
| | test/ example FASTQ reads
src/
| 00_qc_genome.Rmd Notebook to compute statistics on a reference genome
| 01_qc_mapping.Rmd Notebook to extract mapping statistics
| 02_DESeq.Rmd Notebook with an example usage of the pipeline with DESeq2
.local/ hidden folder with pipeline-specific scripts
# Run the container
docker compose up -d
# Run tests
docker compose exec rnaseq make TESTS
# Map and quantify all .fastq.gz files located in data/fastq/pilot
docker compose exec rnaseq make data/fastq/pilot/RNASEQ.ALL
# Run alignment and quantification on the FASTQ files in folder data/fastq/test against genome Dd+Mm (the genome is downloaded automatically)
docker compose exec rnaseq make GENOME=Dd+Mm data/fastq/test/RNASEQ.ALL
# Get a Bash in the container
docker compose exec rnaseq bash
# Stop the container
docker compose down
# Run the container from the command line without compose (make -n only prints the commands it would run)
docker run --rm -v ./:/cwd --workdir /cwd unigebsp/ngs make -n
# Download FASTQ files from the iGE3 platform to data/fastq
wget -P ./data/fastq --content-disposition --trust-server-names -i 'https://data.ige3.genomics.unige.ch/dataset/download/xxxxxxx.txt'
# Run the pipeline with Singularity instead of Docker
singularity exec 'docker://unigebsp/ngs' make GENOME=Dd+Mm data/fastq/test/all
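When several runs sit side by side under `data/fastq/`, it can help to list the corresponding make targets before launching anything. The sketch below is only an illustration: the function name `list_rnaseq_targets` is hypothetical, and it assumes that each subfolder containing `.fastq.gz` files maps to a `<folder>/RNASEQ.ALL` target, as in the examples above.

```shell
# Print one make target per subfolder of data/fastq/ that contains .fastq.gz files.
list_rnaseq_targets() {
  fastq_root="${1:-data/fastq}"
  for dir in "$fastq_root"/*/; do
    # If the glob matched nothing it stays literal and is skipped here.
    [ -d "$dir" ] || continue
    # Skip folders without any gzip-compressed FASTQ file.
    if [ -n "$(find "$dir" -maxdepth 1 -type f -name '*.fastq.gz' | head -n 1)" ]; then
      echo "${dir%/}/RNASEQ.ALL"
    fi
  done
}

# Possible usage (folder names must not contain spaces):
#   for target in $(list_rnaseq_targets); do
#     docker compose exec rnaseq make "$target"
#   done
```

Printing the targets first keeps the step that decides what to run separate from the step that runs it, so the list can be reviewed or filtered before any alignment starts.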
- Use uppercase for .PHONY rules (e.g. %.fastq.gz.ALL, %.bam.ALL, %.BWAMEM, %.HT2)
- Try to add the .ALL suffix to .PHONY rules that are fast to compute