DeepASMM: Decoding Cis-Regulatory Mechanisms via Quantification of Motif Autonomy and Contextual Synergy

DeepASMM

DeepASMM (Deep Learning Driven Autonomous and Synergic Motif Mining Framework) is a neural network–based approach that quantifies both autonomous functionality and sequence context synergy of motifs through forward-propagation perturbation analysis.

For details, see the DeepASMM publication: DeepASMM: Decoding Cis-Regulatory Mechanisms via Quantification of Motif Autonomy and Contextual Synergy.

Workflow of DeepASMM

We developed a deep learning–based framework called DeepASMM to identify motifs with autonomous effects and sequence context synergy in genomic sequences.

DeepASMM consists of four main steps. First, deep learning–based genomic sequence prediction models are constructed to capture regulatory information embedded in sequences. One-hot encoding is used to preprocess sequences, enabling the model to effectively learn predictive rules. Second, background sequences are selected from true positive samples, ensuring that the sequences used for motif discovery contain real regulatory signals. Third, candidate motif localization is performed by scanning background sequences to identify all occurrences of each motif. Fourth, motif functionality assessment is conducted: the motif autonomous functionality score quantifies the intrinsic regulatory effect of a motif by embedding it into empty sequences, while the sequence context synergy score measures how the surrounding sequence context influences the motif’s effect.
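The steps above can be sketched with a toy example. This is a minimal illustration, not the DeepASMM implementation: `toy_model` is a hypothetical stand-in for a trained predictor, an "empty" sequence is assumed to be an all-zero one-hot matrix, and the definition of synergy as "effect in context minus effect in isolation" is an assumption made for illustration only. All function names are hypothetical; the exact formulas are in the publication.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (len, 4) matrix; unknown bases become all-zero rows."""
    out = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = ALPHABET.find(base)
        if j >= 0:
            out[i, j] = 1.0
    return out

def find_occurrences(seq, motif):
    """Return the start position of every exact match, including overlapping ones."""
    hits, start = [], seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits

TATA = one_hot("TATA")

def toy_model(x):
    """Hypothetical predictor: a TATA match scores 1, boosted by the GC fraction."""
    hit = any((x[i:i + 4] * TATA).sum() == 4 for i in range(len(x) - 3))
    gc = x[:, 1:3].sum() / len(x)  # columns 1 and 2 are C and G
    return float(hit) * (1.0 + float(gc))

def embed_motif(x, motif, start):
    """Return a copy of x with the motif written in at position start."""
    out = x.copy()
    out[start:start + len(motif)] = one_hot(motif)
    return out

def autonomy_score(model, motif, length, start):
    """Assumed definition: prediction change when the motif is placed in an empty sequence."""
    empty = np.zeros((length, 4), dtype=np.float32)
    return model(embed_motif(empty, motif, start)) - model(empty)

def context_effect(model, seq, motif, start):
    """Assumed definition: prediction change when the motif is masked in its real context."""
    x = one_hot(seq)
    masked = x.copy()
    masked[start:start + len(motif)] = 0.0
    return model(x) - model(masked)

background = "GCGCTATAGCGC"
motif = "TATA"
start = find_occurrences(background, motif)[0]
autonomy = autonomy_score(toy_model, motif, len(background), start)
context = context_effect(toy_model, background, motif, start)
synergy = context - autonomy  # assumed: the part of the effect attributable to context
```

In this toy setup the motif alone contributes 1.0, and the GC-rich flanks add roughly 0.67 on top. Only forward passes through the model are required, with no gradients, which matches the forward-propagation perturbation idea described above.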

Experimental Data Introduction

In this study, we used the dataset of our previously developed maize gene expression prediction model DeepCBA for the experiments (Wang et al., 2024). This dataset includes chromatin interaction and gene expression data of three tissues (shoot, ear, and tassel) of maize (B73).

The maize chromatin accessibility prediction task also uses data from three tissues: shoot, ear, and tassel (Peng et al., 2019; Li et al., 2019; Sun et al., 2020). For each set of chromatin accessibility peaks, we took the 300 bp region centered on each peak's central locus as a positive sample. Negative samples were randomly selected from the maize B73 reference genome, equal in number to the positive samples and with no overlap with positive regions. All samples were randomly split into training, validation, and test sets at a ratio of 6:2:2.
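The sample construction described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's preprocessing code: it works on a single chromosome with half-open `(start, end)` intervals, and the helper names (`center_window`, `sample_negatives`, `split_622`) are hypothetical.

```python
import random

def center_window(start, end, width=300):
    """Return a fixed-width window centered on the midpoint of a peak."""
    mid = (start + end) // 2
    left = max(0, mid - width // 2)  # clamp at the chromosome start
    return left, left + width

def overlaps(a, b):
    """True if half-open intervals a and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]

def sample_negatives(positives, chrom_len, n, width=300, seed=0):
    """Draw n random windows that overlap none of the positive windows."""
    rng = random.Random(seed)
    negatives = []
    while len(negatives) < n:
        s = rng.randrange(0, chrom_len - width + 1)
        candidate = (s, s + width)
        if not any(overlaps(candidate, p) for p in positives):
            negatives.append(candidate)
    return negatives

def split_622(items, seed=0):
    """Shuffle and split into train/validation/test at a 6:2:2 ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 6 // 10
    n_val = len(items) * 2 // 10
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]

peaks = [(1000, 1100), (5000, 5400), (9000, 9050)]
positives = [center_window(s, e) for s, e in peaks]
negatives = sample_negatives(positives, chrom_len=20000, n=len(positives))
train, val, test = split_622(positives + negatives)
```

The rejection loop is adequate when positive regions cover a small fraction of the genome. A real pipeline would also need to handle multiple chromosomes and windows that run past the chromosome end.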

For the human chromatin accessibility prediction task, we used the dataset reported with the Basset model (Kelley et al., 2016). This dataset contains 2,071,886 sequences of 600 bp covering 164 human cell types. Of these, 1,930,000 sequences were randomly selected as the training set, 70,000 as the validation set, and 71,886 as the test set.

Environment

CUDA Environment

If you run this project on a GPU, please configure CUDA and cuDNN to match the versions listed below.

Component	Version
CUDA	11.8
cuDNN	8.6

Package Environment

This project is based on Python 3.8.13. The required environment is as follows:

Packages	Version
numpy	1.19.5
pandas	1.2.4
tensorflow	2.4.0
logomaker	0.8
matplotlib	3.4.3
tqdm	4.62.3

Some test cases have also been verified to run on tensorflow 1.15.

For more required packages, please refer to the requirements.txt file in this project.
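To confirm that an environment matches the versions listed above, a small check along these lines may help. It is a sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the `PINNED` dictionary simply copies the table above, and `check_versions` is a hypothetical helper, not part of DeepASMM.

```python
from importlib import metadata

# Versions copied from the package table above.
PINNED = {
    "numpy": "1.19.5",
    "pandas": "1.2.4",
    "tensorflow": "2.4.0",
    "logomaker": "0.8",
    "matplotlib": "3.4.3",
    "tqdm": "4.62.3",
}

def check_versions(pinned):
    """Return {package: (expected, installed)} for each mismatch.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches

if __name__ == "__main__":
    for name, (expected, installed) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every listed package is installed at the pinned version. The comparison is an exact string match, so a compatible but different release is still reported.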

How to Run

For detailed instructions, please refer to the DeepASMM Manual in this repository.
Parallel execution: DeepASMM supports Python-based multi-processing acceleration. Depending on your hardware configuration, up to 10× speedup can be achieved.
Demo examples are available at DeepASMM Demos.
If multiple motif sequence alignments are required, we recommend using our extended tool, TOMTOM Parallelization Tool.
This tool is built on Python’s multi-processing framework and wraps the MEME Suite TOMTOM module, achieving up to 100× faster alignment performance.
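The general multi-processing pattern mentioned above looks roughly like the sketch below. It does not show DeepASMM's actual interface (see the manual for that): `count_hits` and `scan_parallel` are hypothetical names, and the per-motif work is reduced to a simple string count so the example stays self-contained.

```python
from multiprocessing import Pool

def count_hits(task):
    """Worker: count non-overlapping exact matches of one motif across all sequences."""
    motif, sequences = task
    return motif, sum(seq.count(motif) for seq in sequences)

def scan_parallel(motifs, sequences, processes=4):
    """Distribute one task per motif over a pool of worker processes."""
    tasks = [(motif, sequences) for motif in motifs]
    with Pool(processes=processes) as pool:
        return dict(pool.map(count_hits, tasks))
```

Call `scan_parallel` from inside an `if __name__ == "__main__":` guard, which platforms that start workers with the spawn method require. Because each motif is an independent task, throughput scales with the number of available cores until data transfer between processes becomes the bottleneck.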

Contact

For questions or suggestions, please reach out: Li_jie@webmail.hzau.edu.cn, liujianxiao@mail.hzau.edu.cn.
