This repository contains the scripts created during my Master's thesis, including the script for the SCPP. SCPP was tested only in a Linux environment and is therefore not supported on macOS or Windows. Only the SCPP script ("SCPP.sh") and the quality control and filtering script ("sc_analysis_qc.R") are needed to start the pipeline. The other scripts can be disregarded. Beginners may find the "start_SCPP.sh" script useful, because it documents the required and optional parameters.
SCPP can also be run through the "start_SCPP.sh" script. To do so, make the script executable with `$ chmod +x start_SCPP.sh` and run it with `$ ./start_SCPP.sh`.
- Clone this repository into a directory on your computer:
  `$ git clone https://github.com/DSchreyer/Master-Thesis.git`
- Make SCPP.sh executable:
  `$ chmod +x SCPP.sh`
- Start SCPP from the command line:
  `$ ./SCPP.sh --option1 " " --option2 ...`

| Option | Default | Function | Branch |
|---|---|---|---|
| --data | - | Path to directory with scRNA-seq FASTQ files | |
| --output | "./output" | Path to directory to store output files in | |
| --threads | 1 | Number of threads to use | |
| --qualityControl | "no" | Perform quality control with FastQC ["no"/"yes"] | |
| --fastqc | - | Path to executable FastQC | |
| --trimming | "no" | Perform trimming with Trimmomatic ["no"/"yes"] | |
| --trimmomatic | - | Path to executable Trimmomatic | |
| --trimOptions | - | Trimming options passed to Trimmomatic, e.g. "TRAILING:20 MINLEN:75" | |
| --useCellranger | "yes" | Use the CellRanger branch ["no"/"yes"] | CellRanger |
| --cellranger | - | Path to executable CellRanger | |
| --cellrangerTranscriptome | - | Path to reference data set required for CellRanger; available on the 10x Genomics download page | |
| --CRoptions | - | Alternative parameters for CellRanger | |
| --useSTARsolo | "no" | Use STARsolo branch ["no"/"yes"] | STARsolo |
| --STARwhitelist | - | Path to barcode whitelist for STARsolo | |
| --genome | - | Path to reference genome FASTA file | STARsolo, UMI-tools |
| --annotation | - | Path to reference genome annotation file | |
| --star | - | Path to executable STAR | |
| --STARoptions | - | Alternative parameters for STAR. See STAR manual | |
| --index | "no" | Genome index is already generated ["yes"/"no"] | |
| --indicesDir | "./indices" | Directory path to store genome indices. If --index "yes", specify the path with file prefix | |
| --read | "R2" | Read containing cDNA sequence ["R1"/"R2"] | |
| --barcode | "R1" | Read containing barcode and UMI ["R1"/"R2"] | |
| --useLanes | "all" | Sequencing lanes to use. For example, to use only lanes 1, 2 and 3, enter "1,2,3" | |
| --genWhitelist | "yes" | Generate barcode whitelist ["no"/"yes"] | |
| --umi-tools | - | Path to executable UMI-tools | |
| --useUMItools | "no" | Use the UMI-tools branch | |
| --samtools | - | Path to executable SAMtools | |
| --featureCounts | - | Path to executable featureCounts | |
| --UMITOOLSwhitelist | - | Path to barcode whitelist for UMI-tools branch | |
| --nGenes | 100 | Minimum number of expressed genes of a cell | |
| --nUMIs | 125 | Minimum number of UMI counts of a cell | |
| --MAD | 5 | nmads value used by scater's isOutlier function | |
| --thresholdMT | 1 | Maximum fraction of total UMI counts coming from MT genes | |
| --filterGenes | 0.001 | Remove sparsely expressed genes: minimum fraction of cells in which a gene has to be expressed | |
| --normalize | "yes" | Log normalize gene-barcode matrix ["yes"/"no"] | |
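
As a rough illustration of how the options above combine, the following sketch assembles a call for the CellRanger branch and prints it for review instead of running it. All paths are hypothetical placeholders, and the thread count and filter thresholds are only example values; which options are actually mandatory is documented in "start_SCPP.sh".

```shell
#!/bin/sh
# Sketch only: every path below is a hypothetical placeholder, not a real location.
DATA_DIR="/path/to/fastq"            # directory with scRNA-seq FASTQ files
OUT_DIR="./output"                   # same as the documented default
CELLRANGER_BIN="/path/to/cellranger" # CellRanger executable
TRANSCRIPTOME="/path/to/reference"   # reference data set from 10x Genomics

# Store the call in the positional parameters so that every
# argument stays intact, even if a path contains spaces.
set -- ./SCPP.sh \
  --data "$DATA_DIR" \
  --output "$OUT_DIR" \
  --threads 4 \
  --useCellranger "yes" \
  --cellranger "$CELLRANGER_BIN" \
  --cellrangerTranscriptome "$TRANSCRIPTOME" \
  --nGenes 100 \
  --nUMIs 125 \
  --normalize "yes"

# Dry run: print the assembled command for review.
printf '%s ' "$@"
printf '\n'

# To start the pipeline, uncomment the next line:
# "$@"
```

Printing the command first makes it easy to check paths and option spelling before starting a long pipeline run.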