Mark duplicated reads on sorted bam #60

@Neo-xbx-00

Hi,
I am trying to use samblaster to mark duplicate reads in my BAM files, which were sorted with the following command:
bwa mem -t 48 Canu_chroms.fasta ../1_cleandata/${i}_clean_1.fq.gz ../1_cleandata/${i}_clean_2.fq.gz | samtools view -@ 48 -bS | samtools sort -@ 48 -o ${i}.sorted.bam
However, many blogs, as well as your tutorial, suggest running samblaster during alignment, like this:
(1) /home/line/local/app/bwa/bwa mem -t 40 -R "@RG\tID:${individual_id}\tSM:${individual_id}\tLB:lib" "$reference" "$input_1" "$input_2" | /home/line/local/app/samblaster-v.0.1.26/samblaster --excludeDups --addMateTags --maxSplitCount 2 --minNonOverlap 20 | samtools view -S | samtools sort -@ 40 > ${id}.sorted.bam
Or, (2) bwa mem <idxbase> samp.r1.fq samp.r2.fq | samblaster | samtools view -Sb - > samp.out.bam
I wonder whether there is any difference between the two approaches: one marks duplicate reads in existing, sorted BAM files, and the other marks them before the sorting step.
Thanks~
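
For reference, here is a minimal sketch of the first approach (starting from an existing coordinate-sorted BAM). As far as I understand, samblaster expects SAM input with a header and with reads grouped by read ID, so a coordinate-sorted file would first need to be re-sorted by name. The function name `mark_dups_sorted_bam`, the default thread count, and the `DRY_RUN` switch are hypothetical and only for illustration. It assumes `samtools` and `samblaster` are on `PATH`.

```shell
# Hypothetical helper: mark duplicates in an already coordinate-sorted BAM.
# Usage: mark_dups_sorted_bam <in.sorted.bam> <out.bam> [threads]
mark_dups_sorted_bam() {
    local in_bam="$1" out_bam="$2" threads="${3:-4}"

    # DRY_RUN=1 only prints the pipeline (illustrative switch, not a samblaster feature).
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "samtools sort -n -@ $threads -O sam $in_bam | samblaster | samtools sort -@ $threads -o $out_bam -"
        return 0
    fi

    # 1. Re-sort by read name and emit SAM (with header) so mates are adjacent.
    # 2. Let samblaster mark duplicates on the name-grouped stream.
    # 3. Sort back to coordinate order and write BAM.
    samtools sort -n -@ "$threads" -O sam "$in_bam" \
        | samblaster \
        | samtools sort -@ "$threads" -o "$out_bam" -
}
```

For example, `DRY_RUN=1 mark_dups_sorted_bam ${i}.sorted.bam ${i}.markdup.bam 48` would print the pipeline without running it. Compared with piping `bwa mem` straight into samblaster, this route costs an extra name sort, which is presumably why the tutorials mark duplicates before sorting.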
