Protein-Nucleic Acid Binding Site Prediction Using Interpretable Kolmogorov–Arnold Networks with Hypergraph Representation Learning

doi:10.1093/bioinformatics/btag433

ISSN: 1367-4811

Yangfeng Zhu, Guicong Sun, Weimin Zhu, Yongxian Fan, Zeheng Wu, Xianchen Zheng, Xiaoyong Pan

Abstract

Motivation

In recent years, protein language models (pLMs) and graph neural networks (GNNs) have demonstrated powerful expressive and reasoning capabilities in modeling protein-RNA/DNA interactions. However, existing methods describe the relationships between residues with simple graphs, whose edges connect only pairs of residues, and therefore struggle to capture the high-order, multi-body residue interactions present in protein-nucleic acid complex structures. In fact, residues that are spatially adjacent but far apart in sequence often cooperatively determine nucleic acid binding capacity.

Results

In this study, we present IKANbind, a computational approach that combines hypergraph representation learning with interpretable Kolmogorov–Arnold Networks (KANs) to identify nucleic acid binding residues (NBRs) in proteins. By combining the advantages of pLMs, hypergraph neural networks and symbolic KANs, IKANbind outperforms existing methods on multiple NBR benchmark datasets. We also demonstrate that the pLM used in IKANbind can implicitly learn the physicochemical properties of binding residues, such as charge and hydrophobicity. In addition, the symbolic KAN, which uses a unique weighting mechanism over decomposable basis functions, can accurately identify the features that contribute most to NBR recognition. We find that polarity and charge contribute more to NBR prediction than other physicochemical properties or evolutionary information. Finally, IKANbind achieves promising performance when extended to other ligand-binding residue prediction tasks.
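
The idea of grouping spatially close residues into hyperedges can be illustrated with a minimal sketch. This is not the IKANbind implementation: the function names, the 8 Å cutoff, the rule of one hyperedge per residue, the toy coordinates, and the mean aggregation are all assumptions chosen for illustration.

```python
import numpy as np


def build_incidence(coords, radius=8.0):
    """Return an (n_residues, n_hyperedges) 0/1 incidence matrix.

    Assumption: one hyperedge per residue, containing every residue
    whose coordinate lies within `radius` of that centre residue.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all residue coordinates.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Column j is the hyperedge centred on residue j.
    return (dist <= radius).astype(int)


def hypergraph_mean_pool(features, incidence):
    """One round of node -> hyperedge -> node mean aggregation."""
    H = incidence.astype(float)
    edge_size = H.sum(axis=0)  # >= 1: each hyperedge contains its centre
    node_deg = H.sum(axis=1)   # >= 1: each residue is in its own hyperedge
    edge_feat = (H.T @ features) / edge_size[:, None]
    return (H @ edge_feat) / node_deg[:, None]


# Hypothetical toy coordinates (in Å) for four residues on a line.
toy_coords = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (30, 0, 0)]
H = build_incidence(toy_coords)

# Hypothetical two-dimensional per-residue features.
toy_features = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]])
pooled = hypergraph_mean_pool(toy_features, H)
```

In this toy case the hyperedge centred on the second residue contains three residues at once, including the first and third, which are 10 Å apart and would not share an edge in a pairwise graph built with the same cutoff. A single hyperedge can therefore express a multi-body relation that a simple graph would have to split into separate pairs.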
