DOI: 10.3390/ijms27135848 ISSN: 1422-0067

MD-Transformer: Multimodal Integration of ProtBERT Embeddings and Physicochemical Descriptors for Protein–Protein Interface Residue Prediction

Jiahui Yang, Jihua Feng, Yuting Zhang, Zhongxing Chen

Accurate prediction of protein–protein interaction (PPI) interface residues is essential for understanding molecular recognition and supporting structure-guided design. To integrate contextual sequence representations with structure-related physicochemical information, we propose a multimodal framework termed MD-Transformer. The model combines residue-level ProtBERT embeddings with physicochemical descriptors, including B-factor, solvent-accessible surface area (SASA), and hydrophobicity. A hybrid fusion module first aligns the heterogeneous features, followed by Transformer encoding and cross-modal attention for multimodal integration. Using the DB5.5 benchmark, physicochemical descriptors were Z-score normalized exclusively with training-set statistics. Under the complex-level split protocol (Official A), MD-Transformer achieved an AUPRC of 0.564, outperforming the ablation model without physicochemical descriptors by 0.159 and reducing false-positive predictions on exposed non-interface residues. Under the homology-aware split protocol (Official B v1), the model maintained an AUPRC of 0.480 and an MCC of 0.242, indicating that predictive capability is retained when sequence similarity between training and test data is reduced. Under the same aligned evaluation workflow, PeSTo achieved an AUPRC of 0.264. Further SASA-stratified analyses identified SASA as a major contributor to suppressing false-positive predictions across residue exposure environments, while also revealing a precision–recall trade-off in highly exposed residues. These results suggest that contextual sequence representations and residue-level physicochemical descriptors provide complementary predictive signals.
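
The normalization step mentioned in the abstract (Z-scoring the physicochemical descriptors with training-set statistics only) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fit_zscore` and `apply_zscore`, the column order (B-factor, SASA, hydrophobicity), and the toy values are all assumptions made for illustration. The point it shows is that the mean and standard deviation are estimated on training residues and then reused unchanged on held-out residues, so no test-set information leaks into the features.

```python
import numpy as np

def fit_zscore(train_features):
    """Estimate per-descriptor mean and std from training residues only."""
    mean = train_features.mean(axis=0)
    std = train_features.std(axis=0)
    # Guard against constant columns to avoid division by zero.
    std = np.where(std == 0, 1.0, std)
    return mean, std

def apply_zscore(features, mean, std):
    """Normalize any split using the statistics fitted on the training set."""
    return (features - mean) / std

# Toy data (hypothetical values); assumed columns: B-factor, SASA, hydrophobicity.
train = np.array([[20.0, 50.0, 1.8],
                  [30.0, 10.0, -3.5],
                  [40.0, 90.0, 4.5]])
test = np.array([[25.0, 70.0, -0.4]])

mean, std = fit_zscore(train)           # statistics come from the training split
train_z = apply_zscore(train, mean, std)
test_z = apply_zscore(test, mean, std)  # test split reuses the training statistics
```

After this step the training descriptors have zero mean and unit variance per column, while the test descriptors generally do not, which is expected when statistics are not refitted on the evaluation data.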
