DOI: 10.1093/bioinformatics/btaf415 ISSN: 1367-4811

SOLeNNoID: A Deep Learning Pipeline For Solenoid Residue Detection in Protein Structures

Georgi I Nikov, Daniella Pretorius, James W Murray

Abstract

Motivation

Solenoid proteins, a subset of tandem repeat proteins, have structurally distinct, modular, and elongated architectures that differentiate them from globular proteins. These proteins play essential roles in diverse biological processes, including protein binding, enzymatic catalysis, ice binding, and nucleic acid interactions. Despite their biological significance and increasing commercial applications (for example, therapeutic engineered variants such as DARPins and designed PPR proteins), accurate identification and annotation of solenoid structures remain challenging. Because solenoid structures are more conserved than their sequences, and given recent advances in protein structure prediction, structure-based solenoid detection methods are preferable to sequence-based ones.

Results

We introduce SOLeNNoID, a deep-learning-based pipeline for predicting solenoid residues in protein structures. Our method employs a convolutional neural network (CNN) architecture to analyze protein distance matrices, enabling accurate identification of solenoid-containing regions. SOLeNNoID covers all three solenoid subclasses: α-, α/β-, and β-solenoids. Comparative evaluation against existing structure-based methods demonstrates the superior performance of our approach. Applying SOLeNNoID to the entire Protein Data Bank (PDB) led to a 71% increase in detected solenoid-containing entries compared to the gold-standard RepeatsDB database, significantly expanding the known solenoid protein repertoire.
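As a rough illustration of the kind of input described above, the following minimal sketch builds a pairwise distance matrix from residue coordinates. It is not the SOLeNNoID code: the choice of C-alpha atoms, the function name distance_matrix, and the toy coordinates are all assumptions made for illustration, and the abstract does not specify how the pipeline constructs or preprocesses its matrices.

```python
import math

def distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    coords: list of (x, y, z) tuples, one per residue (assumed here to be
    C-alpha positions; this choice is an assumption, not taken from the paper).
    """
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            matrix[i][j] = d
            matrix[j][i] = d  # distances are symmetric
    return matrix

# Hypothetical coordinates in angstroms for four residues; not real data.
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
dm = distance_matrix(ca_coords)
```

A matrix like dm is a square, symmetric, image-like array, which is presumably why a convolutional architecture is a natural fit: repeated structural units would be expected to appear as recurring patterns parallel to the diagonal.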

Availability and Implementation

SOLeNNoID is implemented in Python and available on GitHub at https://github.com/gnik2018/SOLeNNoID. The source code and pre-trained models are accessible under a free-software license. Training data are available on Zenodo at https://zenodo.org/records/14927497. Contact: James W Murray, j.w.murray@imperial.ac.uk

Supplementary information

Available online
