An analysis pipeline for long-reads from both PacBio and Oxford Nanopore Technologies (ONT), written in Nextflow.

eliottBo/nallo

 
 



Introduction

genomic-medicine-sweden/nallo is a bioinformatics analysis pipeline for long reads from both PacBio and (targeted) ONT data, focused on rare disease. It is heavily influenced by best-practice pipelines such as nf-core/sarek, nf-core/raredisease, nf-core/nanoseq, PacBio Human WGS Workflow, epi2me-labs/wf-human-variation and brentp/rare-disease-wf.

genomic-medicine-sweden/nallo workflow
QC
Alignment & assembly
  • Assemble genomes with hifiasm
  • Align reads and assemblies to reference with minimap2
Variant calling
Phasing and methylation
Annotation
Ranking
  • Rank SNVs, INDELs, SVs and CNVs with GENMOD
Filtering

Usage

Note

If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

Prepare a samplesheet with input data:

samplesheet.csv

project,sample,file,family_id,paternal_id,maternal_id,sex,phenotype
my_project,HG002,/path/to/HG002.fastq.gz,NIST,HG003,HG004,1,2
my_project,HG003,/path/to/HG003.bam,NIST,0,0,1,1
my_project,HG004,/path/to/HG004.bam,NIST,0,0,2,1
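
Before launching a run, it can help to sanity-check the samplesheet. The following is a minimal sketch, not part of the pipeline: the function name `check_samplesheet` is hypothetical, and the rules it applies (the header matches the example above, and a non-zero parent ID must name another sample in the same family) are assumptions based on the example rather than the pipeline's own validation.

```python
import csv
import io

# Column order taken from the example samplesheet above.
EXPECTED_COLUMNS = [
    "project", "sample", "file", "family_id",
    "paternal_id", "maternal_id", "sex", "phenotype",
]

# Example rows copied from this README.
SHEET = """project,sample,file,family_id,paternal_id,maternal_id,sex,phenotype
my_project,HG002,/path/to/HG002.fastq.gz,NIST,HG003,HG004,1,2
my_project,HG003,/path/to/HG003.bam,NIST,0,0,1,1
my_project,HG004,/path/to/HG004.bam,NIST,0,0,2,1
"""


def check_samplesheet(text):
    """Return a list of problems found in samplesheet CSV text (empty if none)."""
    # skipinitialspace tolerates stray spaces after a comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]
    rows = list(reader)

    # Group sample names by family so parents can be looked up per family.
    families = {}
    for row in rows:
        families.setdefault(row["family_id"], set()).add(row["sample"])

    problems = []
    for row in rows:
        for parent in ("paternal_id", "maternal_id"):
            parent_id = row[parent]
            # Assumption: "0" marks a parent that is not in the samplesheet.
            if parent_id != "0" and parent_id not in families[row["family_id"]]:
                problems.append(
                    f"{row['sample']}: {parent} '{parent_id}' "
                    f"is not a sample in family {row['family_id']}"
                )
    return problems


if __name__ == "__main__":
    for problem in check_samplesheet(SHEET):
        print(problem)
```

Run on the example rows, the check reports no problems, because HG003 and HG004 are both listed in the NIST family alongside HG002.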

Supply a reference genome with --fasta and choose the --preset that matches your data (revio, pacbio or ONT_R10). You can then run the pipeline using:

nextflow run genomic-medicine-sweden/nallo \
    -profile <docker/singularity/.../institute> \
    --input samplesheet.csv \
    --preset <revio/pacbio/ONT_R10> \
    --fasta <reference.fasta> \
    --outdir <OUTDIR>

However, to run most parts of the pipeline you will need to supply additional reference files. For more details and further functionality, please refer to the documentation.
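
If runs are launched from a script, the command above can be assembled programmatically. This is a minimal sketch under stated assumptions: `build_command` is a hypothetical helper, the preset names are the three listed in this README, and the profile, file names and output directory in the usage lines are placeholders.

```python
import shlex

# Preset values as listed in this README.
PRESETS = ("revio", "pacbio", "ONT_R10")


def build_command(profile, samplesheet, preset, fasta, outdir):
    """Build the argument list for the nextflow run command shown above."""
    if preset not in PRESETS:
        raise ValueError(f"preset must be one of {PRESETS}, got {preset!r}")
    return [
        "nextflow", "run", "genomic-medicine-sweden/nallo",
        "-profile", profile,
        "--input", samplesheet,
        "--preset", preset,
        "--fasta", fasta,
        "--outdir", outdir,
    ]


# Placeholder values for illustration only.
cmd = build_command("docker", "samplesheet.csv", "revio", "reference.fasta", "results")
print(shlex.join(cmd))
```

Returning a list rather than a single string lets the result be passed directly to `subprocess.run` without shell quoting concerns.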

Credits

genomic-medicine-sweden/nallo was originally written by Felix Lenner.

We thank the following people for their extensive assistance in the development of this pipeline: Anders Jemt, Annick Renevey, Daniel Schmitz, Lucía Peña-Pérez, Peter Pruisscher & Ramprasad Neethiraj.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

Citations

If you use genomic-medicine-sweden/nallo for your analysis, please cite it using the following DOI: 10.5281/zenodo.13748210.

This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

An extensive list of references for the tools used by the pipeline can be found in the docs/CITATIONS.md file.
