DOI: 10.1093/bioinformatics/btag446 ISSN: 1367-4811

Inferring Dynamic Information from Protein Structures by Gaussian Integrals and Deep Learning

Felipe Vilicich, Nicolás Bottino, Zhaoqian Su, Shanye Yin, Yinghao Wu

Abstract

Motivation

Protein dynamics are central to function, but experiments and molecular dynamics (MD) simulations remain costly, low-throughput, and difficult to compare across protocols. Scalable structure-based methods are needed to infer dynamics from static protein structures.

Results

We present a deep learning framework that predicts protein dynamics from 30-dimensional Gaussian integral (GI) descriptors of Cα backbone topology. Using 1,374 ATLAS protein chains with MD-derived root-mean-square fluctuations (RMSF), we found that the GI descriptors stratified proteins into fold-relevant clusters enriched for secondary structure, sequence homology, and ECOD families. An attention-based 1D-CNN classified flexible versus non-flexible proteins with test AUC = 0.772 and separated slow-mode– from fast-mode–dominated dynamics with AUC = 0.91. Regression models recovered mean RMSF (Pearson r = 0.72; R² = 0.46) and, more accurately, slow-mode RMSF (Pearson r = 0.83; R² = 0.62), supporting rapid inference of flexibility and collective-motion bias.
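
As an illustration of the quantities reported above, the following is a minimal sketch of how a per-residue RMSF and the two regression metrics could be computed with NumPy. It is not the authors' pipeline: the function names (`rmsf`, `pearson_r`, `r_squared`) and the input layout (a trajectory array of shape frames × atoms × 3, already superposed on a common reference) are assumptions made for this example.

```python
import numpy as np


def rmsf(traj):
    """Per-atom RMSF for a trajectory of shape (n_frames, n_atoms, 3).

    Assumption: frames are already superposed on a common reference,
    so no alignment is performed here.
    """
    traj = np.asarray(traj, dtype=float)
    mean_pos = traj.mean(axis=0)                    # (n_atoms, 3)
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)   # (n_frames, n_atoms)
    return np.sqrt(sq_dev.mean(axis=0))             # (n_atoms,)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two 1D arrays."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = ((y_true - y_pred) ** 2).sum()
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()
    return float(1.0 - ss_res / ss_tot)


if __name__ == "__main__":
    # Toy trajectory: 2 frames, 2 atoms; the second atom never moves.
    toy = np.array([[[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
                    [[2.0, 0.0, 0.0], [1.0, 1.0, 1.0]]])
    print(rmsf(toy))
```

The coefficient of determination computed this way need not equal the square of Pearson r, since it also penalizes offset and scale errors in the predictions, whereas the correlation does not.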

Availability and implementation

Code and data are available on GitHub: https://github.com/fvilicich/gaussian_integral/blob/main/gaussian_integral_classification.ipynb

Supplementary information

Supplementary data are available at Bioinformatics online.
