DOI: 10.1093/bioadv/vbae198 ISSN: 2635-0041

Energy Metric Prediction for Double Insertion Mutants via the RoseNet Deep Learning Framework

Sarah Coffland, Katie Christensen, Brian Hutchinson, Filip Jagodzinski

Abstract

Studying the structural and functional implications of protein mutations is an important task in computational biology and bioinformatics. We leverage our previously proposed RoseNet neural network architecture to predict energy metrics of proteins with double amino acid insertions or deletions (InDels). We train models on previously generated benchmark datasets containing the exhaustive double InDel mutations for three proteins, as well as on three additional proteins for which ∼145k random mutants, each with two InDels, have been generated. We expand on our previous work by evaluating these three additional proteins and by analyzing domain features that affect the prediction capabilities of RoseNet. These features include the placement of InDels in secondary structures and the solvent-accessible surface area (SASA) scores of the residues. We find further evidence that RoseNet generalizes better to unseen residue combinations than to unseen insertion positions. We also observe that RoseNet produces higher-quality predictions for insertions into a β-sheet than for insertions into an α-helix. Additionally, RoseNet often performs better when the insertions fall in an area of high SASA than when they fall in an area of low SASA.
