DeepEpi: Deep Learning Model for Predicting Gene Expression Regulation based on Epigenetic Histone Modifications
Rania Hamdy, Yasser M.K. Omar, Fahima A. Maghraby
- Computational Mathematics
- Genetics
- Molecular Biology
- Biochemistry
Background:
Histone modifications play a vital role in regulating gene expression. The way histone proteins bind to DNA affects whether a gene can be expressed. Although these modifications do not alter the DNA sequence itself, they influence how it is transcribed.
Objective:
Each spatial location in DNA has its own function, so the spatial arrangement of chromatin modifications affects how a gene is expressed. Gene regulation also depends on which combinations of histone modifications are present on the gene, on the spatial distribution of these modifications, and on how far their reads extend along the gene region. This study therefore aims to model long-range spatial genome data and the complex dependencies among histone modification reads.
Methods:
A Convolutional Neural Network (CNN) is used to model all data features in this paper. It can detect patterns in histone signals and preserve the spatial information of these patterns. The study also uses the concept of memory in long short-term memory (LSTM) networks, in the form of a vanilla LSTM, a bidirectional LSTM, or a stacked LSTM, to preserve long-range histone signals. Additionally, it combines these methods either through a ConvLSTM or by using them together with a self-attention mechanism.
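To make the idea of position-preserving convolution followed by self-attention concrete, the following is a minimal NumPy sketch, not the DeepEpi implementation. The input shape (100 bins by 5 histone marks), the filter sizes, the random weights, and the function names `conv1d_same` and `self_attention` are all assumptions chosen for illustration; the LSTM stage is omitted for brevity.

```python
import numpy as np

def conv1d_same(x, kernels):
    """1D convolution with ReLU that keeps one output row per input bin.

    x:       (bins, marks) histone signal (assumed layout)
    kernels: (width, marks, filters)
    returns: (bins, filters)
    """
    width = kernels.shape[0]
    pad_left = (width - 1) // 2
    pad_right = width - 1 - pad_left
    padded = np.pad(x, ((pad_left, pad_right), (0, 0)))
    # One window of `width` consecutive bins per output position.
    windows = np.stack([padded[i:i + width] for i in range(x.shape[0])])
    out = np.einsum("bwm,wmf->bf", windows, kernels)
    return np.maximum(out, 0.0)

def self_attention(h, w_q, w_k, w_v):
    """Scaled dot-product self-attention over bin positions.

    h: (bins, features). Returns the attended features and the
    (bins, bins) attention weights, whose rows sum to 1.
    """
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Hypothetical sizes and random data, for illustration only.
rng = np.random.default_rng(0)
bins, marks, filters, dim = 100, 5, 8, 4
signal = rng.poisson(2.0, size=(bins, marks)).astype(float)
kernels = rng.normal(scale=0.1, size=(3, marks, filters))

features = conv1d_same(signal, kernels)  # local patterns, order preserved
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(filters, dim)) for _ in range(3))
context, weights = self_attention(features, w_q, w_k, w_v)
```

Because the convolution keeps one row per bin, the attention weights can relate any bin to any other bin, which is one way distant histone signals along a gene region can interact in such a model.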
Results:
Based on the results, the combination of CNN and LSTM with the self-attention mechanism obtained an area under the curve (AUC) score of 88.87% over 56 cell types.
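As an illustration of how an AUC figure over several cell types might be aggregated, the sketch below computes a rank-based AUC per cell type and then averages the values. It assumes binary expression labels and that the reported score is a mean of per-cell-type AUCs, neither of which the abstract states explicitly. The cell-type names, labels, and scores are made-up toy data, and `roc_auc` and `mean_auc` are hypothetical helper names.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive example outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def mean_auc(per_cell_type):
    """per_cell_type maps a cell-type name to a (labels, scores) pair."""
    aucs = {name: roc_auc(y, s) for name, (y, s) in per_cell_type.items()}
    return sum(aucs.values()) / len(aucs), aucs

# Toy data for two hypothetical cell types (not results from the paper).
toy = {
    "cell_A": ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.7]),
    "cell_B": ([1, 1, 0, 0], [0.8, 0.7, 0.3, 0.1]),
}
average, per_type = mean_auc(toy)
print(per_type)  # {'cell_A': 0.75, 'cell_B': 1.0}
print(average)   # 0.875
```

The pairwise formulation is quadratic in the number of genes, so it is suitable for small checks like this one rather than for full evaluation runs.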
Conclusion:
This result outperforms the current state-of-the-art model and provides insight into how combinatorial interactions between histone modification marks can control gene expression. The source code is available at https://github.com/RaniaHamdy/DeepEpi.