Bessyyi/Poreformer

Poreformer

Identification of methylation for Nanopore DNA sequencing.

Poreformer is a computational tool for detecting 5mC, 4mC, and 6mA DNA methylation from Oxford Nanopore reads. It uses a Transformer model to predict methylation at both the per-read and per-site level and writes the results to a methylation file. Poreformer calls methylation from FAST5 files basecalled with Guppy and provides models for R9.4.1 flow cells.

Installation

Please refer to Installation for how to install Poreformer.

Inference

Quick usage guide for model inference:

  1. Basecall the raw FAST5 files with Guppy. Modification calling requires Guppy-basecalled reads.
ont-guppy/bin/guppy_basecaller -i ${INPUT_DIR}/BA_NAT -s ${INPUT_DIR}/BA_NAT_guppy -c ont-guppy/data/dna_r9.4.1_450bps_hac.cfg -x cuda:all:100% -r --fast5_out
  2. Extract FASTQ and signal information from the FAST5 files, align the reads with minimap2, then sort and index the BAM file.
./align_index.sh -ref Bacillus_amyloliquefaciens.fa -fast5 ${INPUT_DIR}/BA_NAT_guppy/.fast5 -ref_rev Bacillus_amyloliquefaciens_rev.fa
  3. Extract features.
sh extra_feature.sh -ref Bacillus_amyloliquefaciens.fa -forward_current all_zheng_fin_sort_fin.txt -ref_rev Bacillus_amyloliquefaciens_rev.fa -reversed_current all_fan_fin_sort_fin.txt
  4. Call methylation with Poreformer.
sh Poreformer.sh -meth_type mC -feature all_mC_reversed_current_mean_kmeans_6_5.txt
sh Poreformer.sh -meth_type 6mA -feature all_6mA_reversed_current_mean_kmeans_7_6.txt

Please refer to Usage.md for details on how to use Poreformer.
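
The four inference steps can be chained in a small wrapper. The sketch below is not part of Poreformer: the function names `run_step` and `poreformer_pipeline` and the variables `INPUT_DIR`, `SAMPLE`, `REF`, `REF_REV`, and `DRY_RUN` are illustrative assumptions. The commands, flags, and file names are copied from the steps above and may need adjusting for your data. By default it only prints the commands, so you can review them before running anything.

```shell
#!/bin/sh
# Sketch only: wraps the README commands in a dry-run helper.
# All variable and function names here are hypothetical, not Poreformer options.
INPUT_DIR="${INPUT_DIR:-.}"
SAMPLE="${SAMPLE:-BA_NAT}"
REF="${REF:-Bacillus_amyloliquefaciens.fa}"
REF_REV="${REF_REV:-Bacillus_amyloliquefaciens_rev.fa}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = also execute them

run_step() {
    # Print the command; execute it only when DRY_RUN=0.
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@" || { echo "step failed: $1" >&2; return 1; }
    fi
}

poreformer_pipeline() {
    # Step 1: basecall with Guppy.
    run_step ont-guppy/bin/guppy_basecaller -i "$INPUT_DIR/$SAMPLE" \
        -s "$INPUT_DIR/${SAMPLE}_guppy" \
        -c ont-guppy/data/dna_r9.4.1_450bps_hac.cfg \
        -x 'cuda:all:100%' -r --fast5_out &&
    # Step 2: extract, align, sort, index (-fast5 path kept as written in the README).
    run_step ./align_index.sh -ref "$REF" \
        -fast5 "$INPUT_DIR/${SAMPLE}_guppy/.fast5" -ref_rev "$REF_REV" &&
    # Step 3: extract features.
    run_step sh extra_feature.sh -ref "$REF" \
        -forward_current all_zheng_fin_sort_fin.txt \
        -ref_rev "$REF_REV" -reversed_current all_fan_fin_sort_fin.txt &&
    # Step 4: call methylation, once per methylation type shown in the README.
    run_step sh Poreformer.sh -meth_type mC \
        -feature all_mC_reversed_current_mean_kmeans_6_5.txt &&
    run_step sh Poreformer.sh -meth_type 6mA \
        -feature all_6mA_reversed_current_mean_kmeans_7_6.txt
}
```

After sourcing the file, calling `poreformer_pipeline` prints the five commands in order. Setting `DRY_RUN=0` first executes them and stops at the first failing step, which avoids running feature extraction on an incomplete alignment.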
