Enhancing Proinflammatory Peptide Prediction by Integrating a Transformer Encoder and a Protein Language Model Based on a Residual Module
Shengli Zhang, Bin Sheng
Introduction:
Proinflammatory peptides (PIPs) play important roles in immune regulation, but existing prediction methods, such as MultiFeatVotPIP, rely mainly on handcrafted features. Their limited capability to capture complex sequence semantics and long-range dependencies restricts prediction accuracy. To address this issue, we propose Trans-Prot, a dual-branch model that integrates a Transformer encoder and the ProtT5 protein language model for PIP prediction.
Methods:
A dual-branch deep learning framework is employed for PIP prediction. Amino acid sequences are encoded by a Transformer encoder combined with positional embeddings and convolutional layers to extract global and local sequence features. Meanwhile, the ProtT5 protein language model, combined with Bi-GRU and convolutional layers, is used to capture semantic information and sequence dependencies. Subsequently, a residual fusion module is applied to integrate features from the two branches while enhancing feature representation capability. Finally, a multilayer perceptron is utilized for PIP classification and prediction.
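The abstract does not specify how the residual fusion module is built, so the following is only a minimal sketch of one common pattern: concatenate the two branch feature vectors and pass them through a single residual block, y = x + W2·ReLU(W1·x). The function names, feature values, dimensions, and random weights are all hypothetical and are not taken from Trans-Prot.

```python
import random


def relu(vec):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in vec]


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def residual_fuse(feat_a, feat_b, w1, w2):
    """Concatenate two branch features and apply one residual block.

    Assumed form (not confirmed by the source): y = x + W2 * relu(W1 * x),
    where x is the concatenation of the two branch feature vectors.
    """
    x = list(feat_a) + list(feat_b)   # fuse by concatenation
    hidden = relu(matvec(w1, x))      # non-linear transform
    fx = matvec(w2, hidden)           # project back to len(x)
    return [xi + fi for xi, fi in zip(x, fx)]  # skip connection


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights, standing in for trained parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical pooled features from the two branches (toy sizes).
transformer_feat = [0.2, -0.1, 0.5]
prott5_feat = [0.3, 0.0, -0.4]

rng = random.Random(0)
dim, hidden_dim = 6, 4
w1 = random_matrix(hidden_dim, dim, rng)
w2 = random_matrix(dim, hidden_dim, rng)

fused = residual_fuse(transformer_feat, prott5_feat, w1, w2)
print(len(fused))
```

Because of the skip connection, the block reduces to plain concatenation when the second weight matrix is zero, so the learned transform only has to model a correction on top of the concatenated features. In this sketch the fused vector would then be passed to the multilayer perceptron classifier.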
Results:
On the independent test set, Trans-Prot achieves an accuracy (ACC) of 69.5%, sensitivity (SN) of 65.5%, Matthews correlation coefficient (MCC) of 0.393, and area under the ROC curve (auROC) of 0.756, with improvements of 4%, 13.4%, 0.071, and 0.070, respectively.
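For reference, ACC, SN, and MCC follow their standard definitions in terms of confusion-matrix counts. The sketch below computes them from true/false positive and negative counts; the counts in the usage example are hypothetical and are not the Trans-Prot test-set results. auROC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total                       # accuracy
    sn = tp / (tp + fn) if (tp + fn) else 0.0     # sensitivity (recall)
    sp = tn / (tn + fp) if (tn + fp) else 0.0     # specificity
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike ACC, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside ACC and SN when the positive and negative classes are imbalanced.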
Discussion:
Existing PIP prediction methods have limited ability to model complex sequence information, while combining protein language models with Transformer architectures enables more effective sequence characterization.
Conclusion:
Trans-Prot improves PIP prediction performance and provides an effective computational framework for peptide-related bioinformatics research.