Description
Hi,
I am trying to use samblaster to mark duplicate reads in my BAM files, which were aligned and sorted with the following command:
bwa mem -t 48 Canu_chroms.fasta ../1_cleandata/${i}_clean_1.fq.gz ../1_cleandata/${i}_clean_2.fq.gz | samtools view -@ 48 -bS | samtools sort -@ 48 -o ${i}.sorted.bam
However, many blogs, as well as your tutorial, suggest running samblaster during the alignment step, like this:
(1) /home/line/local/app/bwa/bwa mem -t 40 -R "@RG\tID:${individual_id}\tSM:${individual_id}\tLB:lib" "$reference" "$input_1" "$input_2" | /home/line/local/app/samblaster-v.0.1.26/samblaster --excludeDups --addMateTags --maxSplitCount 2 --minNonOverlap 20 | samtools view -S | samtools sort -@ 40 > ${id}.sorted.bam
Or, (2) bwa mem samp.r1.fq samp.r2.fq | samblaster | samtools view -Sb - > samp.out.bam
I wonder whether there is any difference between these two approaches: one marks duplicate reads in an existing, sorted BAM file, and the other marks duplicate reads before the sorting step.
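To make the comparison concrete, here is a rough sketch of the two workflows as I understand them. As far as I can tell from the samblaster README, samblaster expects SAM input with a header and with reads grouped by read ID, so I assume my coordinate-sorted BAM would need to be name-sorted again before it can be piped in. The function names, argument order, and default thread count are just placeholders for illustration, not anything from the samblaster documentation.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: mark duplicates in an existing coordinate-sorted BAM.
# Assumption: samblaster needs reads grouped by read ID, so the BAM is
# name-sorted first, converted to SAM with its header, then re-sorted
# by coordinate after marking.
mark_dups_sorted_bam() {
    local in_bam="$1" out_bam="$2" threads="${3:-4}"
    samtools sort -n -@ "$threads" "$in_bam" \
        | samtools view -h - \
        | samblaster \
        | samtools sort -@ "$threads" -o "$out_bam" -
}

# Hypothetical helper: mark duplicates while aligning. bwa mem already
# emits reads grouped by read ID, so samblaster can sit directly in the pipe.
mark_dups_while_aligning() {
    local ref="$1" r1="$2" r2="$3" out_bam="$4" threads="${5:-4}"
    bwa mem -t "$threads" "$ref" "$r1" "$r2" \
        | samblaster \
        | samtools sort -@ "$threads" -o "$out_bam" -
}
```

For example, `mark_dups_sorted_bam sample.sorted.bam sample.markdup.bam 48` would cover my current situation, while `mark_dups_while_aligning` corresponds to the pipelines in (1) and (2). If both are valid, the second one avoids the extra name-sort.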
Thanks~