Inferring Dynamic Information from Protein Structures by Gaussian Integrals and Deep Learning
Felipe Vilicich, Nicolás Bottino, Zhaoqian Su, Shanye Yin, Yinghao Wu
Abstract
Motivation
Protein dynamics are central to function, but experiments and molecular dynamics (MD) simulations remain costly, low-throughput, and difficult to compare across protocols. Scalable structure-based methods are needed to infer dynamics from static protein structures.
Results
We present a deep learning framework that predicts protein dynamics from 30-dimensional Gaussian integral (GI) descriptors of Cα backbone topology. Using 1,374 ATLAS protein chains with MD-derived root-mean-square fluctuation (RMSF), we show that GI descriptors stratify proteins into fold-relevant clusters enriched for secondary structure, sequence homology, and ECOD families. An attention-based 1D-CNN classified flexible versus non-flexible proteins with test AUC = 0.772 and separated slow-mode-dominated from fast-mode-dominated dynamics with AUC = 0.91. Regression models recovered mean RMSF (Pearson r = 0.72; R² = 0.46) and, more accurately, slow-mode RMSF (Pearson r = 0.83; R² = 0.62), supporting rapid inference of flexibility and collective-motion bias.
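The two regression metrics reported above measure different things, which is why R² need not equal the square of Pearson r. The sketch below illustrates this on toy values (not ATLAS data). It assumes R² denotes the coefficient of determination, which is not stated here; the function names `pearson_r` and `r_squared` are hypothetical and are not taken from the accompanying notebook.

```python
import math

def pearson_r(y_true, y_pred):
    """Linear correlation; insensitive to a constant offset or rescaling."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (sd_t * sd_p)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy example: predictions follow the trend exactly but carry a constant bias.
rmsf_true = [1.0, 2.0, 3.0, 4.0]   # illustrative values only
rmsf_pred = [1.5, 2.5, 3.5, 4.5]   # same trend, shifted by +0.5

r = pearson_r(rmsf_true, rmsf_pred)
r2 = r_squared(rmsf_true, rmsf_pred)
print(f"Pearson r = {r:.3f}, R^2 = {r2:.3f}")
```

In this toy case the correlation is perfect while R² is penalized by the systematic offset. Under the same assumed definition, a reported R² below the squared correlation (for example 0.46 versus 0.72² ≈ 0.52) would indicate some bias or scale mismatch in the predictions on top of the scatter.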
Availability and implementation
Code and data are available on GitHub at: https://github.com/fvilicich/gaussian_integral/blob/main/gaussian_integral_classification.ipynb.
Supplementary information
Supplementary data are available at Bioinformatics online.