DOI: 10.1093/bioadv/vbag181 ISSN: 2635-0041

BioGraphX: Bridging the Sequence-Structure Gap via Physicochemical Graph Encoding for Interpretable Subcellular Localization Prediction

Abubakar Saeed, Waseem Abbas

Abstract

Motivation

Computational prediction of protein subcellular localization is vital for understanding cellular mechanisms and developing disease treatments. However, current methods lack interpretability: they predict where a protein localizes but fail to explain why. Moreover, traditional approaches depend on three-dimensional structures, which are costly and time-consuming to obtain.

Results

Here, we propose BioGraphX, a novel encoding framework that constructs protein interaction graphs directly from sequences using biochemical rules, providing a constraint-based structural proxy. Building on this encoding, BioGraphX-Net integrates ESM-2 (Evolutionary Scale Modeling) embeddings with the proposed features via a gating mechanism and demonstrates superior performance on the DeepLoc 2.0 benchmark. Gating analysis shows that while ESM-2 embeddings contribute strongly, BioGraphX features function as high-precision filters. SHAP (SHapley Additive exPlanations) analysis reveals feature importance patterns consistent with a sophisticated biophysical logic: sequence signals act as universal exclusion filters, while organelle-specific biophysical combinations enable precise compartment discrimination. Notably, frustration features resolve targeting ambiguities in complex compartments, reflecting evolutionary constraints while preventing mislocalization caused by sequence mimicry. Cross-dataset validation on a protein solubility prediction task confirms that the structural proxy captures genuine biophysical signal. Additionally, BioGraphX promotes Green AI in bioinformatics, matching state-of-the-art performance with a minimal parameter count of 13.46 million. In summary, BioGraphX provides accurate predictions and new insights into the language of life.
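
The gating idea mentioned above can be illustrated with a minimal sketch. It is not the BioGraphX-Net implementation: the function name gated_fusion, the per-dimension sigmoid gate, and the assumption that both feature vectors have already been projected to a common dimension are all hypothetical choices made for illustration. The gate value for each dimension indicates how much of the fused output comes from the language-model embedding versus the graph-derived features, which is the kind of quantity a gating analysis would inspect.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(plm_vec, graph_vec, gate_weights, gate_bias):
    """Blend two equal-length feature vectors with a per-dimension gate.

    Hypothetical illustration only (not the authors' code):
      plm_vec      -- stand-in for a projected ESM-2 embedding
      graph_vec    -- stand-in for projected BioGraphX features
      gate_weights -- one weight row per output dimension, each row
                      spanning the concatenation [plm_vec; graph_vec]
      gate_bias    -- one bias per output dimension
    Returns (fused, gates); gates[i] near 1 favours plm_vec[i],
    gates[i] near 0 favours graph_vec[i].
    """
    if len(plm_vec) != len(graph_vec):
        raise ValueError("inputs must share one dimension; project them first")
    concat = list(plm_vec) + list(graph_vec)
    gates = [
        sigmoid(sum(w * x for w, x in zip(row, concat)) + bias)
        for row, bias in zip(gate_weights, gate_bias)
    ]
    fused = [g * p + (1.0 - g) * q for g, p, q in zip(gates, plm_vec, graph_vec)]
    return fused, gates


# Toy usage: zero weights and biases give a gate of 0.5 in every dimension,
# so the fused vector is the element-wise mean of the two inputs.
plm = [1.0, 2.0]
graph = [3.0, 0.0]
zero_weights = [[0.0] * 4 for _ in range(2)]
fused, gates = gated_fusion(plm, graph, zero_weights, [0.0, 0.0])
```

In a trained model the gate weights would be learned jointly with the classifier; here they are fixed by hand so the behaviour is easy to follow.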

Availability and implementation

Source code is available at https://github.com/Abubakar-Saeed/BioGraphX.
