Title: Distance Profiles of Optimal RNA Foldings
Predicting the secondary structure of RNA is an important problem in molecular biology, providing insights into the function of non-coding RNAs, with broad applications that include understanding disease and developing new drugs. Combinatorial algorithms for predicting RNA foldings can generate an exponentially large number of equally optimal foldings with respect to a given optimization criterion, making it difficult to determine how well any single folding represents the entire space. We provide efficient new algorithms for exploring this large space of optimal RNA foldings, along with a research software tool, toRNAdo, that implements these algorithms.
Award ID(s):
2231150
PAR ID:
10436135
Editor(s):
Bansal, M
Date Published:
Journal Name:
Bioinformatics Research and Applications: 18th International Symposium, ISBRA 2022, Haifa, Israel, November 14–17, 2022, Proceedings
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Discoveries of RNA roles in cellular physiology and pathology are increasing the need for new tools that modulate the structure and function of these biomolecules, and small molecules are proving useful. In 2017, we curated the RNA-targeted BIoactive ligaNd Database (R-BIND) and discovered distinguishing physicochemical properties of RNA-targeting ligands, leading us to propose the existence of an “RNA-privileged” chemical space. Biennial updates of the database and the establishment of a website platform (rbind.chem.duke.edu) have provided new insights and tools to design small molecules based on the analyzed physicochemical and spatial properties. In this report and R-BIND 2.0 update, we refined the curation approach and ligand classification system as well as conducted analyses of RNA structure elements for the first time to identify new targeting strategies. Specifically, we curated and analyzed RNA target structural motifs to determine the properties of small molecules that may confer selectivity for distinct RNA secondary and tertiary structures. Additionally, we collected sequences of target structures and incorporated an RNA structure search algorithm into the website that outputs small molecules targeting similar motifs without a priori secondary structure knowledge. Cheminformatic analyses revealed that, despite the 50% increase in small molecule library size, the distinguishing properties of R-BIND ligands remained significantly different from that of proteins and are therefore still relevant to RNA-targeted probe discovery. Combined, we expect these novel insights and website features to enable the rational design of RNA-targeted ligands and to serve as a resource and inspiration for a variety of scientists interested in RNA targeting. 
  2. Abstract

    Structured RNA lies at the heart of many central biological processes, from gene expression to catalysis. RNA structure prediction is not yet possible due to a lack of high-quality reference data associated with organismal phenotypes that could inform RNA function. We present GARNET (Gtdb Acquired RNa with Environmental Temperatures), a new database for RNA structural and functional analysis anchored to the Genome Taxonomy Database (GTDB). GARNET links RNA sequences to experimental and predicted optimal growth temperatures of GTDB reference organisms. Using GARNET, we develop sequence- and structure-aware RNA generative models, with overlapping triplet tokenization providing optimal encoding for a GPT-like model. Leveraging hyperthermophilic RNAs in GARNET and these RNA generative models, we identify mutations in ribosomal RNA that confer increased thermostability to the Escherichia coli ribosome. The GTDB-derived data and deep learning models presented here provide a foundation for understanding the connections between RNA sequence, structure, and function.

     
  3. RNA is critical to a broad spectrum of biological and viral processes. This functional diversity is a result of its dynamic nature, the variety of three-dimensional structures that it can fold into, and a host of post-transcriptional chemical modifications. While there are many experimental techniques to study the structural dynamics of biomolecules, molecular dynamics simulations (MDS) play a significant role in complementing experimental data and providing mechanistic insights. The accuracy of the results obtained from MDS is determined by the underlying physical models, i.e., the force fields, that steer the simulations. Though RNA force fields have received a lot of attention in the last decade, they still lag behind their protein counterparts. The chemical diversity imparted by the RNA modifications adds another layer of complexity to an already challenging problem. Insight into the effect of RNA modifications upon RNA folding and dynamics is lacking due to the insufficiency or absence of relevant experimental data. This review provides an overview of the state of MDS of modified RNA, focusing on the challenges in parameterization of RNA modifications as well as insights into relevant reference experiments necessary for their calibration.
  4. Abstract

    Structural plasticity is integral to RNA function; however, there are currently few methods to quantitatively resolve RNAs that have multiple structural states. NMR spectroscopy is a powerful approach for resolving conformational ensembles but is size-limited. Chemical probing is well-suited for large RNAs but provides limited structural and kinetics information. Here, we integrate the two approaches to visualize a two-state conformational ensemble for the central stem–loop 3 (SL3) of 7SK RNA, a critical element for 7SK RNA function in transcription regulation. We find that the SL3 distal end exchanges between two equally populated yet structurally distinct states in both isolated SL3 constructs and full-length 7SK RNA. We rationally designed constructs that lock SL3 into a single state and demonstrate that both chemical probing and NMR data fit to a linear combination of the two states. Comparison of vertebrate 7SK RNA sequences shows either or both states are highly conserved. These results provide new insights into 7SK RNA structural dynamics and demonstrate the utility of integrating chemical probing with NMR spectroscopy to gain quantitative insights into RNA conformational ensembles.

     
  5. We develop optimal algorithms for learning undirected Gaussian trees and directed Gaussian polytrees from data. We consider both problems of distribution learning (i.e. in KL distance) and structure learning (i.e. exact recovery). The first approach is based on the Chow-Liu algorithm, and learns an optimal tree-structured distribution efficiently. The second approach is a modification of the PC algorithm for polytrees that uses partial correlation as a conditional independence tester for constraint-based structure learning. We derive explicit finite-sample guarantees for both approaches, and show that both approaches are optimal by deriving matching lower bounds. Additionally, we conduct numerical experiments to compare the performance of various algorithms, providing further insights and empirical evidence. 