Does the BAM file need to be deduplicated? #3

@enluo211


First, fastp/0.23.0 was used for quality control of the raw sequencing data. Next, the reads were aligned with BWA/0.7.17 against the B73 v4 reference genome. Finally, Picard/2.1.1 was used to add read-group information to the header of the BAM file.
The script looks like this:
#!/bin/sh
# $1 is the sample prefix (input files are $prx.R1.fq.gz and $prx.R2.fq.gz)
prx=$1
module load fastp/0.23.0
module load BWA/0.7.17
module load picard/2.1.1-Java-1.8.0_92
module load Java/1.8.0_92
# Note: no samtools module is loaded here, so samtools must already be on PATH
dir1=/public/home/eluo/sswu
dir2=/public/home/eluo/sswu/TEO
genome=/public/home/eluo/luoen/Zea_mays.AGPv4.dna.toplevel.fa
cd $dir2
## Filter: trim polyG tails (-g) and drop reads shorter than 150 bp (-l 150)
fastp -g -w 20 -l 150 -i ${dir1}/$prx.R1.fq.gz -I ${dir1}/$prx.R2.fq.gz -o ${dir2}/$prx.1.trimed.fq.gz -O ${dir2}/$prx.2.trimed.fq.gz -h ${dir2}/${prx}.html
## Align: map paired-end reads and coordinate-sort the output
bwa mem -t 20 $genome $dir2/$prx.1.trimed.fq.gz $dir2/$prx.2.trimed.fq.gz | samtools sort -@20 -o $dir2/$prx.sorted.bam

## Add read-group information to the BAM header, then index
java -Xmx30g -jar ${EBROOTPICARD}/picard.jar AddOrReplaceReadGroups I=$dir2/${prx}.sorted.bam O=$dir2/$prx.addrg.sort.bam RGID=$prx RGPU=unkn-0.0 RGLB=lib$prx RGSM=$prx RGPL=ILLUMINA
samtools index $dir2/$prx.addrg.sort.bam
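
If deduplication turns out to be needed, one possible extra step is sketched below. It is not part of the pipeline above and is only an illustration. It assumes Picard's MarkDuplicates is run on the read-group-tagged BAM. The output names ($prx.markdup.bam, $prx.markdup.metrics.txt), the run helper, and the DRY_RUN switch are hypothetical. By default MarkDuplicates only flags duplicate reads and does not remove them.

```shell
#!/bin/sh
# Sketch only: mark duplicates in the BAM written by AddOrReplaceReadGroups.
# Assumptions (not in the original script): the output file names below,
# the run() helper, and the DRY_RUN switch.
prx=${1:-sample}
dir2=${dir2:-/public/home/eluo/sswu/TEO}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = also execute them

run() {
    # Print the command line; execute it only when DRY_RUN=0.
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

markdup() {
    # I = input BAM, O = output BAM with duplicates flagged, M = metrics file
    run java -Xmx30g -jar "${EBROOTPICARD}/picard.jar" MarkDuplicates \
        I="$dir2/$prx.addrg.sort.bam" \
        O="$dir2/$prx.markdup.bam" \
        M="$dir2/$prx.markdup.metrics.txt"
    run samtools index "$dir2/$prx.markdup.bam"
}

markdup
```

With the default DRY_RUN=1 the script only prints the two commands, so the paths can be checked first; setting DRY_RUN=0 would actually run them.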
