- If you are on a server, ask the system administrator to install it. Installing it yourself sometimes causes permission issues.
+
- Not recommended: [install via conda](https://anaconda.org/conda-forge/singularity)
- Download this repository by `git clone https://github.com/YeoLab/Mudskipper.git`.
-
- Download the dependent repository and modify config variables as follows: # TODO: containerize or make to snakemake hub
-
- Yeolab internal users don't need to.
-
- Install skipper dependencies and modify the following config variables: `JAVA_PATH`, `UMICOLLAPSE_PATH`, `R_EXE`. # TODO: containerize
-
- Follow the [skipper instructions](https://github.com/YeoLab/skipper#prerequisites) to set them up.
-
- Most dependencies are already specified in `rules/envs`. When running snakemake, using `--use-conda` should automatically install everything for you.
-
-
-
# How to run.
-
1. Prepare `PATH_TO_YOUR_CONFIG`. See below and `config/preprocess_config/oligope_iter5.yaml` for an example.
- `fastq1` and `fastq2`: `*.fastq.gz` files for read 1 and read 2.
- `libname`: a unique name for each library. Must not contain spaces or special characters such as `#`, `%`, or `*`.
- `experiment`: a unique name for each experiment. **Rows with the same `experiment` will be treated as replicates.** Must not contain spaces or special characters such as `#`, `%`, or `*`.
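
The naming rules above can be checked before launching the pipeline. The following is a minimal sketch, not part of this repository: it assumes the manifest is a CSV with `fastq1`, `fastq2`, `libname`, and `experiment` columns, and it assumes that only letters, digits, underscores, hyphens, and dots count as safe characters (the actual set tolerated by the pipeline may be wider). The function name `check_manifest` and the example file names are hypothetical.

```python
import csv
import io
import re
from collections import defaultdict

# Assumption: only these characters are treated as safe in names.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_.-]+$")


def check_manifest(rows):
    """Return (problems, replicates) for an iterable of manifest rows (dicts)."""
    problems = []
    seen_libnames = set()
    replicates = defaultdict(list)
    for i, row in enumerate(rows, start=1):
        # Both name columns must be non-empty and free of unsafe characters.
        for column in ("libname", "experiment"):
            value = row.get(column) or ""
            if not SAFE_NAME.match(value):
                problems.append(f"row {i}: {column}={value!r} has unsafe characters")
        # libname must be unique across the manifest.
        libname = row.get("libname") or ""
        if libname in seen_libnames:
            problems.append(f"row {i}: duplicate libname {libname!r}")
        seen_libnames.add(libname)
        # Rows sharing an experiment are grouped as replicates.
        replicates[row.get("experiment") or ""].append(libname)
    return problems, dict(replicates)


# Hypothetical manifest content, for illustration only.
example = io.StringIO(
    "fastq1,fastq2,libname,experiment\n"
    "a_R1.fastq.gz,a_R2.fastq.gz,GN_1019_rep1,GN_1019\n"
    "b_R1.fastq.gz,b_R2.fastq.gz,GN_1019_rep2,GN_1019\n"
    "c_R1.fastq.gz,c_R2.fastq.gz,bad name#1,GN_1020\n"
)
problems, replicates = check_manifest(csv.DictReader(example))
print(problems)
print(replicates)
```

Here the first two rows share `experiment` and are therefore grouped as replicates, while the third row is flagged because its `libname` contains a space and `#`.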
-
### `barcode_csv`: specifying barcode sequencing per Antibody/RBP
+
+
## `barcode_csv`: specifying barcode sequencing per Antibody/RBP
- Example: `config/barcode_csv/iter5.csv`
- Notebook to generate this file (Yeolab internal user): `utils/generate barcode-iter5.ipynb`
- 2nd column: Antibody/RBP name. Must not contain spaces or special characters such as `#`, `%`, or `*`.
-
### Outputs
+
# Options to Control Output
- `WORKDIR`: output directory
- `RBP_TO_RUN_MOTIF`: list of RBP names to run motif analysis on. Each must be one of the rows in `barcode_csv`.
- `run_clipper`: True if you want CLIPper outputs (works, but slow)
- `run_skipper`: True if you want to run Skipper. (usually doesn't work in ABC)
- `run_comparison`: True if you want to run Piranha
- `debug`: True if you want to debug. This runs BLAST on the unmapped reads.
-
### Choosing backgrounds
+
# Options to Choose Backgrounds
By default, if the options below are left blank, we run the Dirichlet Multinomial Mixture (DMM) model for multiplex datasets, where RBPs are explicitly compared with each other. DMM is the best model for multiplex datasets.

Unfortunately, DMM doesn't work for singleplex datasets. Calling singleplex binding sites requires an "external control" (see below). Otherwise the pipeline stops at the read counting stage.

If you want to add a background library, here is how:
-
#### "Internal control": a barcode that measures the background. They are in the same `fastq.gz`
+
+
## "Internal control": a barcode that measures the background. They are in the same `fastq.gz`
- `AS_INPUT`: if you have an IgG antibody that everything will be normalized against, enter its name here. Must be one of the rows in `barcode_csv`. This can be the background for skipper, CLIPper, and the beta-binomial mixture model.
-
#### "External control": a library that is NOT in the same fastq as your oligoCLIP/ABC
+
+
## "External control": a library that is NOT in the same fastq as your oligoCLIP/ABC
- Specify them in `external_bam` with the name of the library (first line, e.g. `oligoCLIP_ctrlBead_rep2`), followed by `file:` and `INFORMATIVE_READ`.
- This can be an eCLIP SMInput, total RNA-seq, an IgG pull-down from another experiment, a bead control, or spike-ins.
- These will also be used as a background in skipper, CLIPper, and the beta-binomial mixture model.
- The BAMs must be processed with the exact same STAR index as `STAR_DIR`, and it is recommended to process them with the same or similar mapping parameters as this repo or skipper.
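
As an illustration of the `external_bam` layout described above, here is a minimal sketch that checks a loaded config for the two keys each entry needs. It is not part of this repository. The config is written as a plain Python dict standing in for the parsed YAML; the file paths, the `incomplete_library` entry, the function name `check_external_bam`, and the value `1` for `INFORMATIVE_READ` are all assumptions made for the example.

```python
def check_external_bam(config):
    """Return a list of problems found in the `external_bam` section."""
    problems = []
    # A missing or empty section means no external controls were requested.
    external = config.get("external_bam") or {}
    for libname, entry in external.items():
        if not isinstance(entry, dict):
            problems.append(
                f"{libname}: expected a mapping with `file` and `INFORMATIVE_READ`"
            )
            continue
        # Each library needs both keys named in the instructions above.
        for key in ("file", "INFORMATIVE_READ"):
            if key not in entry:
                problems.append(f"{libname}: missing `{key}`")
    return problems


# Hypothetical config, standing in for the parsed YAML file.
config = {
    "external_bam": {
        "oligoCLIP_ctrlBead_rep2": {
            "file": "/path/to/oligoCLIP_ctrlBead_rep2.bam",  # hypothetical path
            "INFORMATIVE_READ": 1,  # assumed value, for illustration only
        },
        # Hypothetical entry that forgot INFORMATIVE_READ.
        "incomplete_library": {"file": "/path/to/other.bam"},
    },
}
problems = check_external_bam(config)
print(problems)
```

This check only covers the presence of the keys. It does not verify that the BAMs were aligned with the same STAR index, which still has to be confirmed by hand.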
-
-
## Dependencies:
-
- `SCRIPT_PATH`: Absolute path to `scripts` folder.
-
- `JAVA_PATH`, `UMICOLLAPSE_PATH`, `R_EXE`: skipper dependencies. See `Installation`.
-
-
## Preprocessing options:
+
# Preprocessing Options:
- `adaptor_fwd`, `adaptor_rev`: adapter sequences to trim. Do not include the barcode.
- `tile_length`: adapter sequences are tiled into pieces of this length so that indels don't interfere with trimming.
- `QUALITY_CUTOFF`: quality cutoff passed to cutadapt. Default 15.
- `umi_length`: Length of unique molecular identifier (UMI).
- `STAR_DIR`: directory of the STAR index.
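
To illustrate what `tile_length` means, here is a minimal sketch of tiling an adapter into short windows. The idea is that an indel in the read can break a match against the full-length adapter, while a shorter tile lying entirely on one side of the indel can still match. This is not the repository's implementation: the function name `tile_adapter`, the sliding step of one base, and the example sequence are assumptions.

```python
def tile_adapter(adapter, tile_length):
    """Return every contiguous window of `tile_length` bases from `adapter`.

    Assumption: windows slide by one base; the pipeline may use another step.
    """
    if tile_length <= 0:
        raise ValueError("tile_length must be positive")
    # An adapter no longer than one tile is returned unchanged.
    if len(adapter) <= tile_length:
        return [adapter]
    return [
        adapter[start:start + tile_length]
        for start in range(len(adapter) - tile_length + 1)
    ]


# Hypothetical adapter sequence, for illustration only.
tiles = tile_adapter("ACGTACGG", 5)
print(tiles)
```

A shorter `tile_length` tolerates more indels but also raises the chance of matching non-adapter sequence, so the value is a trade-off.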
-
## Annotations:
+
# Annotation Options:
- skipper annotations: [follow skipper instructions](https://github.com/YeoLab/skipper#prerequisites) or generate with [skipper_utils](https://github.com/algaebrown/skipper_utils)
- Yeolab internal users: Brian has all sorts of annotations here: `/projects/ps-yeolab4/software/skipper/1.0.0/bin/skipper/annotations/`.
- `CHROM_SIZES`
- `GENOMEFA`
-
# Output Files
+
# Output files
## Trimmed fastqs, bams, bigwigs:
These are in the `EXPERIMENT_NAME` folders. For example, if your manifest.csv contains two experiments, "GN_1019" and "GN_1020", then under the `GN_1019/` folder you would see the following:
1. `fastqs`: The trimmed and demultiplexed fastqs.