iMFP-LG: Identification of Novel Multi-Functional Peptides by Using Protein Language Models and Graph-Based Deep Learning
Jiawei Luo, Kejuan Zhao, Junjie Chen, Caihua Yang, Fuchuan Qu, Yumeng Liu, Xiaopeng Jin, Ke Yan, Yang Zhang, Bin Liu
Abstract
Functional peptides are short amino acid fragments that have a wide range of beneficial functions for living organisms. Most previous research has focused on mono-functional peptides, but a growing number of multi-functional peptides have been discovered. Despite enormous experimental efforts to assay multi-functional peptides, only a small fraction of the millions of known peptides has been explored. Effective and precise techniques for identifying multi-functional peptides can facilitate their discovery and mechanistic understanding. In this article, we present iMFP-LG, a method for identifying multi-functional peptides based on protein language models (pLMs) and graph attention networks (GATs). Comparison results showed that iMFP-LG outperforms state-of-the-art methods on both multi-functional bioactive peptide and multi-functional therapeutic peptide datasets. The interpretability of iMFP-LG was also illustrated by visualizing attention patterns in the pLMs and GATs. Given the strong performance of iMFP-LG in identifying multi-functional peptides, we employed it to screen for novel candidate peptides with both anticancer peptide (ACP) and antimicrobial peptide (AMP) functions among millions of known peptides in UniRef90. As a result, eight candidate peptides were identified, and one candidate exhibiting both antibacterial and anticancer effects was confirmed through molecular structure alignment and biological experiments. We anticipate that iMFP-LG can assist in the discovery of multi-functional peptides and contribute to the advancement of peptide drug design.