RNA language models predict mutations that improve RNA function

Shulgina, Yekaterina (ORCID:0000000176589294); Trinidad, Marena_I (ORCID:0000000178394642); Langeberg, Conner_J (ORCID:0000000256093758); Nisonoff, Hunter (ORCID:0000000313578111); Chithrananda, Seyone (ORCID:0000000236719135); Skopintsev, Petr (ORCID:000000026043157X); Nissley, Amos_J (ORCID:0000000348295373); Patel, Jaymin (ORCID:0000000243804805); Boger, Ron_S (ORCID:000000024467271X); Shi, Honglue (ORCID:0000000338471652); Yoon, Peter_H (ORCID:0000000291561393); Doherty, Erin_E (ORCID:0000000215554124); Pande, Tara (ORCID:0000000194404492); Iyer, Aditya_M; Doudna, Jennifer_A (ORCID:000000019161999X); Cate, Jamie_H_D (ORCID:0000000159657902)

doi:10.1038/s41467-024-54812-y

Abstract: Structured RNA lies at the heart of many central biological processes, from gene expression to catalysis. RNA structure prediction is not yet possible due to a lack of high-quality reference data associated with organismal phenotypes that could inform RNA function. We present GARNET (Gtdb Acquired RNa with Environmental Temperatures), a new database for RNA structural and functional analysis anchored to the Genome Taxonomy Database (GTDB). GARNET links RNA sequences to experimental and predicted optimal growth temperatures of GTDB reference organisms. Using GARNET, we develop sequence- and structure-aware RNA generative models, with overlapping triplet tokenization providing optimal encoding for a GPT-like model. Leveraging hyperthermophilic RNAs in GARNET and these RNA generative models, we identify mutations in ribosomal RNA that confer increased thermostability to the Escherichia coli ribosome. The GTDB-derived data and deep learning models presented here provide a foundation for understanding the connections between RNA sequence, structure, and function.
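To make "overlapping triplet tokenization" concrete, here is a minimal Python sketch. It rests on assumptions, since this record does not describe the authors' tokenizer: a sliding window of three nucleotides with a stride of one, a 64-entry vocabulary over A, C, G, and U, and DNA-style T mapped to U. The function names are hypothetical and are not taken from the paper's code.

```python
from itertools import product

NUCLEOTIDES = "ACGU"


def build_vocab():
    """Map each of the 64 possible RNA triplets to an integer id (assumed vocabulary)."""
    return {"".join(p): i for i, p in enumerate(product(NUCLEOTIDES, repeat=3))}


def tokenize_overlapping_triplets(seq, stride=1):
    """Split a sequence into 3-mers; with stride=1, adjacent tokens share two bases."""
    seq = seq.upper().replace("T", "U")  # assumption: accept DNA-style input
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, stride)]


def encode(seq, vocab):
    """Convert a sequence to token ids; raises KeyError on ambiguous bases such as N."""
    return [vocab[token] for token in tokenize_overlapping_triplets(seq)]


if __name__ == "__main__":
    vocab = build_vocab()
    print(tokenize_overlapping_triplets("GGCUAC"))
    print(encode("GGCUAC", vocab))
```

A sequence of length n yields n - 2 tokens under this scheme, so each nucleotide (apart from those near the ends) appears in three tokens. That redundancy is what distinguishes it from non-overlapping triplets or single-nucleotide tokens.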

Award ID(s): 2002182

PAR ID: 10558790

Author(s) / Creator(s): Shulgina, Yekaterina; Trinidad, Marena_I; Langeberg, Conner_J; Nisonoff, Hunter; Chithrananda, Seyone; Skopintsev, Petr; Nissley, Amos_J; Patel, Jaymin; Boger, Ron_S; Shi, Honglue; Yoon, Peter_H; Doherty, Erin_E; Pande, Tara; Iyer, Aditya_M; Doudna, Jennifer_A; Cate, Jamie_H_D

Publisher / Repository: Nature Publishing Group

Date Published: 2024-12-05

Journal Name: Nature Communications

Volume: 15

Issue: 1

ISSN: 2041-1723

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1038/s41467-024-54812-y
