This repository contains the workflow and analysis scripts for a simulation study that assesses the performance of several popular somatic variant calling methods.
The study was carried out using two different simulation approaches:
- Reads were simulated from an artificial reference genome.
  - Germline and somatic variants were simulated and spiked into the reads.
- An empirical sequencing data set was split into multiple samples.
  - Reads were sorted by germline allele (maternal/paternal allele) using germline variants.
  - Somatic variants were simulated and spiked into the reads.

Repository contents:
- R code to generate the plots for the publication is in the scripts de-novo.R and spike-in.R.
- Scripts that show how the variant callers were run are in the respective directories under workflow.
- Technical details about the simulations can be gleaned from the content of the sim directories.
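
The second approach (sorting reads by germline allele, then spiking in somatic variants) can be illustrated with a toy sketch. This is not the code used in the study: the read representation (a dict with `start` and `seq`), the function names `assign_haplotype` and `spike_in`, and all coordinates are hypothetical and chosen only to show the idea on plain strings. The actual procedure is in the sim directories.

```python
import random

def assign_haplotype(read, het_site):
    """Assign a read to 'maternal' or 'paternal' from the base it shows
    at a heterozygous germline site; return None if it cannot be assigned.
    (Hypothetical helper, for illustration only.)"""
    pos, maternal_base, paternal_base = het_site
    offset = pos - read["start"]
    if not 0 <= offset < len(read["seq"]):
        return None  # read does not cover the germline site
    base = read["seq"][offset]
    if base == maternal_base:
        return "maternal"
    if base == paternal_base:
        return "paternal"
    return None  # base matches neither allele (e.g. sequencing error)

def spike_in(reads, pos, alt, fraction, rng):
    """Return copies of the reads in which the base at reference position
    `pos` is replaced by `alt` in roughly `fraction` of covering reads.
    (Hypothetical helper, for illustration only.)"""
    out = []
    for read in reads:
        offset = pos - read["start"]
        seq = read["seq"]
        if 0 <= offset < len(seq) and rng.random() < fraction:
            seq = seq[:offset] + alt + seq[offset + 1:]
        out.append({**read, "seq": seq})
    return out

# Toy reads; positions are 0-based reference coordinates (assumed).
reads = [
    {"start": 0, "seq": "ACGTACGT"},
    {"start": 2, "seq": "GTTCGTAC"},
    {"start": 10, "seq": "TTTT"},
]
het_site = (4, "A", "T")  # (position, maternal base, paternal base)

groups = {"maternal": [], "paternal": []}
for read in reads:
    hap = assign_haplotype(read, het_site)
    if hap is not None:
        groups[hap].append(read)

# Spike a somatic G>T change at position 6 into all maternal reads.
spiked = spike_in(groups["maternal"], pos=6, alt="T",
                  fraction=1.0, rng=random.Random(0))
```

In this sketch, reads that do not cover the germline site are left unassigned, and `fraction` stands in for the variant allele frequency within the chosen haplotype.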