About

This repository implements an RNA-seq pipeline that can:

map reads in .fastq.gz format with HISAT2
quantify gene expression
produce quality control plots with R

The pipeline is intended to run inside our ngs container (Docker image unigebsp/ngs).

Quick start

Make sure Docker (https://www.docker.com) is installed and running on your system.
Download the source code of this repository, which contains the RNA-seq pipeline. Alternatively, if git is installed on your system, clone the repository with the command:

git clone 'https://github.com/BioinfoSupport/rnaseq.git' my_new_project

Start the container: docker compose up -d
Connect to the RStudio GUI running within the container at http://localhost:8787
Run app.R to download FASTQ files from the iGE3 genomics platform and run quantification. Alternatively, copy your .fastq.gz files into the subfolder data/fastq/.
Run the notebooks in src/ to generate QC reports.
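
As a hedged illustration of the manual alternative in the step on FASTQ files, the sketch below copies .fastq.gz files into a subfolder of data/fastq/ and prints the make target for that subfolder. The function name `stage_fastq` and the batch name are hypothetical and not part of the pipeline; the target pattern data/fastq/<folder>/RNASEQ.ALL is taken from the commands listed under Useful commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): copy .fastq.gz files from a
# source folder into data/fastq/<batch>/ and print the matching make target.
stage_fastq() {
  local src_dir="$1" batch="$2"
  local dest="data/fastq/${batch}"
  local found=0 f

  # Refuse to continue if the source folder has no gzip-compressed FASTQ files.
  for f in "${src_dir}"/*.fastq.gz; do
    [ -e "${f}" ] && found=1 && break
  done
  if [ "${found}" -eq 0 ]; then
    echo "no .fastq.gz files in ${src_dir}" >&2
    return 1
  fi

  # Copy only the .fastq.gz files; other files in the source are ignored.
  mkdir -p "${dest}" || return 1
  for f in "${src_dir}"/*.fastq.gz; do
    cp "${f}" "${dest}/" || return 1
  done

  # Target name follows the pattern used in "Useful commands".
  echo "${dest}/RNASEQ.ALL"
}
```

From the project root, the printed target could then be passed to the container, for example: target=$(stage_fastq /path/to/reads pilot) && docker compose exec rnaseq make "$target"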

Directory structure

data/
  | ref/      folder containing reference genome subdirectories. 
  | | Dd+Mm/  a reference genome folder for _Dictyostelium discoideum_ and _Mycobacterium
  | |         marinum_. Example genome folders can be found in our [`genomes` repository]
  | |         (https://github.com/BioinfoSupport/genomes/releases). If you are using a
  | |         known reference genome, it will be downloaded automatically from this repository.
  | fastq/    folder containing sequenced reads (.fastq.gz)
  | | test/   example FASTQ reads
src/
  | 00_qc_genome.Rmd   Notebook to compute statistics on a reference genome
  | 01_qc_mapping.Rmd  Notebook to extract mapping statistics
  | 02_DESeq.Rmd       Notebook with an example usage of the pipeline with DESeq2  
.local/       hidden folder with pipeline-specific scripts
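
Since each reference genome lives in its own subdirectory of data/ref/, the genomes already present locally can be listed with a few lines of shell. This is only a sketch: the function name `list_local_genomes` is hypothetical, and it assumes that every subdirectory of data/ref/ is a genome folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every genome folder found under
# data/ref/ (or under the directory given as first argument).
list_local_genomes() {
  local ref_dir="${1:-data/ref}" dir
  for dir in "${ref_dir}"/*/; do
    # Skip the unexpanded glob when the folder is empty or missing.
    [ -d "${dir}" ] || continue
    basename "${dir}"
  done
}
```

A name printed by this function (for example Dd+Mm) is the kind of value passed as GENOME= in the make commands under Useful commands.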

Useful commands

# Run the container 
docker compose up -d

# Run tests
docker compose exec rnaseq make TESTS

# Map and quantify all .fastq.gz files located in data/fastq/pilot
docker compose exec rnaseq make data/fastq/pilot/RNASEQ.ALL

# Run alignment and quantification on the FASTQ files in data/fastq/test against genome Dd+Mm (the genome is downloaded automatically)
docker compose exec rnaseq make GENOME=Dd+Mm data/fastq/test/RNASEQ.ALL

# Open a Bash shell in the container
docker compose exec rnaseq bash

# Stop the container
docker compose down

# Run the container from the command line without compose (make -n only prints the commands, without executing them)
docker run --rm -v ./:/cwd --workdir /cwd unigebsp/ngs make -n

# Download FASTQ files from the iGE3 platform to data/fastq
wget -P ./data/fastq --content-disposition --trust-server-names -i 'https://data.ige3.genomics.unige.ch/dataset/download/xxxxxxx.txt'
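
When several subfolders of data/fastq/ hold reads, the per-folder make targets shown above can be derived rather than typed by hand. The sketch below is an assumption-laden illustration: the function name `list_rnaseq_targets` is hypothetical, and it only prints target names following the data/fastq/<folder>/RNASEQ.ALL pattern used above; it does not run anything itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one RNASEQ.ALL target per subfolder of the FASTQ
# root that contains at least one .fastq.gz file.
list_rnaseq_targets() {
  local root="${1:-data/fastq}" dir f
  for dir in "${root}"/*/; do
    # Skip the unexpanded glob when there are no subfolders.
    [ -d "${dir}" ] || continue
    for f in "${dir}"*.fastq.gz; do
      if [ -e "${f}" ]; then
        # Drop the trailing slash before appending the target name.
        echo "${dir%/}/RNASEQ.ALL"
        break
      fi
    done
  done
}

# Example (assumes folder names without spaces):
#   for t in $(list_rnaseq_targets); do
#     docker compose exec rnaseq make "$t"
#   done
```

Printing the targets first keeps the step easy to inspect before any mapping job is started.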

Running the pipeline on an HPC cluster

singularity exec 'docker://unigebsp/ngs' make GENOME=Dd+Mm data/fastq/test/all

Conventions

Use uppercase for .PHONY rules (e.g. %.fastq.gz.ALL, %.bam.ALL, %.BWAMEM, %.HT2)
Try to add the .ALL suffix to .PHONY rules that are fast to compute
