Chao Cao, Chunyu Wang, Shuhong Yang, Quan Zou

CircSI-SSL: circRNA-binding site identification based on self-supervised learning

  • Computational Mathematics
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Molecular Biology
  • Biochemistry
  • Statistics and Probability

Abstract

Motivation: In recent years, circular RNAs (circRNAs), a form of RNA with a closed-loop structure, have attracted widespread attention because of their physiological significance (they can bind proteins directly), leading to the development of numerous protein-binding site identification algorithms. Unfortunately, these methods are supervised and require a large number of labeled training samples to perform well, and sample labels are difficult to obtain because they require many biological experiments.

Results: To reduce the number of labels needed for training in the circRNA binding site prediction task, this paper proposes CircSI-SSL, a binding site identification algorithm based on self-supervised learning. To our knowledge, this is the first such approach in this research field. Specifically, CircSI-SSL first combines multiple feature coding schemes and employs RNA_Transformer for cross-view sequence prediction (the self-supervised task) to learn mutual information from the multi-view data, and is then fine-tuned with only a few labeled samples. Comprehensive experiments on 6 widely used circRNA datasets indicate that CircSI-SSL achieves excellent performance compared with previous algorithms, even in the extreme case where the ratio of training data to test data is 1:9. In addition, a transfer experiment on 6 linear RNA (linRNA) datasets, run without network modification or hyperparameter adjustment, shows that CircSI-SSL has good scalability. In summary, the self-supervised prediction algorithm proposed in this paper is expected to replace previous supervised algorithms and to have broader application value.

Availability: The source code and data are available at https://github.com/cc646201081/CircSI-SSL.
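As a rough illustration of the multi-view input and the 1:9 split mentioned in the abstract, the sketch below encodes one RNA sequence in two ways and divides a sample list into 10% training and 90% test data. The abstract does not name the feature coding schemes, so the one-hot and k-mer encodings here are assumptions chosen for illustration, and the function names (`one_hot`, `kmer_ids`, `split_one_to_nine`) are hypothetical; none of this is taken from the CircSI-SSL source code.

```python
import random

ALPHABET = "ACGU"  # assumed RNA alphabet; ambiguous bases are not handled


def one_hot(seq):
    """View 1 (assumed): one row per nucleotide, one column per base."""
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq]


def kmer_ids(seq, k=3):
    """View 2 (assumed): base-4 integer id of each overlapping k-mer."""
    index = {letter: i for i, letter in enumerate(ALPHABET)}
    ids = []
    for start in range(len(seq) - k + 1):
        value = 0
        for base in seq[start:start + k]:
            value = value * 4 + index[base]
        ids.append(value)
    return ids


def split_one_to_nine(samples, seed=0):
    """Shuffle, then keep 10% for training and 90% for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = max(1, len(shuffled) // 10)
    return shuffled[:n_train], shuffled[n_train:]


seq = "AUGGCUA"           # toy sequence, not from any dataset
view_a = one_hot(seq)     # 7 rows of 4 values
view_b = kmer_ids(seq)    # 5 overlapping 3-mers
train, test = split_one_to_nine(range(100))
```

In a cross-view setup, a model would be trained to predict one view of a sequence from the other without using labels, and only the small training split would be used for supervised fine-tuning.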
