Variant Effect Prediction in the Age of Machine Learning

Bromberg, Yana; Prabakaran, R; Kabir, Anowarul; Shehu, Amarda

doi:10.1101/cshperspect.a041467

Over the years, many computational methods have been created for the analysis of the impact of single amino acid substitutions resulting from single-nucleotide variants in genome coding regions. Historically, all methods have been supervised and thus limited by the inadequate sizes of experimentally curated data sets and by the lack of a standardized definition of variant effect. The emergence of unsupervised, deep learning (DL)-based methods raised an important question: Can machines learn the language of life from the unannotated protein sequence data well enough to identify significant errors in the protein “sentences”? Our analysis suggests that some unsupervised methods perform as well or better than existing supervised methods. Unsupervised methods are also faster and can, thus, be useful in large-scale variant evaluations. For all other methods, however, their performance varies by both evaluation metrics and by the type of variant effect being predicted. We also note that the evaluation of method performance is still lacking on less-studied, nonhuman proteins where unsupervised methods hold the most promise.

Award ID(s): 2318829; 2310114; 2310113

PAR ID: 10523474

Author(s) / Creator(s): Bromberg, Yana; Prabakaran, R; Kabir, Anowarul; Shehu, Amarda

Publisher / Repository: Cold Spring Harbor: Perspectives in Biology

Date Published: 2024-07-01

Journal Name: Cold Spring Harbor Perspectives in Biology

Volume: 16

Issue: 7

ISSN: 1943-0264

Page Range / eLocation ID: a041467

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.1101/cshperspect.a041467
