Mark duplicated reads on sorted bam

Hi,
   I am trying to use samblaster to mark duplicate reads in my BAM files, which were aligned and sorted with the following command:
   bwa mem -t 48 Canu_chroms.fasta ../1_cleandata/${i}_clean_1.fq.gz ../1_cleandata/${i}_clean_2.fq.gz | samtools view -@ 48 -bS | samtools sort -@ 48 -o ${i}.sorted.bam
   However, many blogs, as well as your tutorial, suggest running samblaster during the alignment step, like this:
   (1) /home/line/local/app/bwa/bwa mem -t 40 -R "@RG\tID:${individual_id}\tSM:${individual_id}\tLB:lib" "$reference" "$input_1" "$input_2" | /home/line/local/app/samblaster-v.0.1.26/samblaster --excludeDups --addMateTags --maxSplitCount 2 --minNonOverlap 20 | samtools view -S | samtools sort  -@ 40 > ${id}.sorted.bam
   Or, (2) bwa mem <idxbase> samp.r1.fq samp.r2.fq | samblaster | samtools view -Sb - > samp.out.bam
   I wonder whether there is any difference between these two approaches: the first marks duplicate reads in an existing, sorted BAM file, while the second marks them before the sorting step.
   Thanks~

Mark duplicated reads on sorted bam #60
