Discoveries of RNA roles in cellular physiology and pathology are increasing the need for new tools that modulate the structure and function of these biomolecules, and small molecules are proving useful. In 2017, we curated the RNA-targeted BIoactive ligaNd Database (R-BIND) and discovered distinguishing physicochemical properties of RNA-targeting ligands, leading us to propose the existence of an “RNA-privileged” chemical space. Biennial updates of the database and the establishment of a website platform (rbind.chem.duke.edu) have provided new insights and tools to design small molecules based on the analyzed physicochemical and spatial properties. In this report and R-BIND 2.0 update, we refined the curation approach and ligand classification system as well as conducted analyses of RNA structure elements for the first time to identify new targeting strategies. Specifically, we curated and analyzed RNA target structural motifs to determine the properties of small molecules that may confer selectivity for distinct RNA secondary and tertiary structures. Additionally, we collected sequences of target structures and incorporated an RNA structure search algorithm into the website that outputs small molecules targeting similar motifs without a priori secondary structure knowledge. Cheminformatic analyses revealed that, despite the 50% increase in small molecule library size, the distinguishing properties of R-BIND ligands remained significantly different from that of proteins and are therefore still relevant to RNA-targeted probe discovery. Combined, we expect these novel insights and website features to enable the rational design of RNA-targeted ligands and to serve as a resource and inspiration for a variety of scientists interested in RNA targeting.
Distance Profiles of Optimal RNA Foldings
Predicting the secondary structure of RNA is an important problem in molecular biology: it provides insight into the function of non-coding RNAs and has broad applications, including understanding disease and developing new drugs. Combinatorial algorithms for predicting RNA foldings can generate an exponentially large number of equally optimal foldings with respect to a given optimization criterion, making it difficult to determine how well any single folding represents the entire space. We present efficient new algorithms that give insight into this large space of optimal RNA foldings, together with a research software tool, toRNAdo, that implements these algorithms.
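To make the point about many equally optimal foldings concrete, the following is a minimal sketch that counts co-optimal structures under a simplified base-pair-maximization (Nussinov-style) model. It is not the toRNAdo algorithm and does not compute distance profiles; the function name, the allowed pair set, and the `min_loop` parameter are assumptions chosen for illustration.

```python
from functools import lru_cache

# Assumed pairing rules for illustration: Watson-Crick pairs plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_optimal_foldings(seq, min_loop=3):
    """Return (max_pairs, number_of_structures_achieving_max_pairs).

    Simplified model: a folding is a set of non-crossing base pairs, and the
    optimization criterion is the number of pairs. A pair (k, j) is allowed
    only if at least `min_loop` unpaired bases lie between k and j.
    Recursion depth grows with len(seq), so this is meant for short inputs.
    """

    @lru_cache(maxsize=None)
    def best(i, j):
        # Interval seq[i..j] (inclusive). Empty or single-base intervals
        # have exactly one folding: the one with no pairs.
        if j - i < 1:
            return (0, 1)

        # Case 1: base j is unpaired.
        score, ways = best(i, j - 1)

        # Case 2: base j pairs with some k; the interval splits into the
        # part left of k and the part enclosed by the pair (k, j).
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left_score, left_ways = best(i, k - 1)
                inner_score, inner_ways = best(k + 1, j - 1)
                cand = left_score + inner_score + 1
                cand_ways = left_ways * inner_ways
                if cand > score:
                    score, ways = cand, cand_ways
                elif cand == score:
                    ways += cand_ways
        return (score, ways)

    return best(0, len(seq) - 1)
```

Because each structure is determined by what the last base of an interval does (unpaired, or paired with one specific partner), every optimal folding is counted exactly once. For example, `count_optimal_foldings("AUA", min_loop=0)` returns `(1, 2)`: one pair at most, achieved by two different foldings.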
- Award ID(s):
- 2231150
- PAR ID:
- 10436135
- Editor(s):
- Bansal, M
- Date Published:
- Journal Name:
- Bioinformatics Research and Applications: 18th International Symposium, ISBRA 2022, Haifa, Israel, November 14–17, 2022, Proceedings
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract Motivation: Predicting the secondary structure of a ribonucleic acid (RNA) sequence is useful in many applications. Existing algorithms [based on dynamic programming] suffer from a major limitation: their runtimes scale cubically with the RNA length, and this slowness limits their use in genome-wide applications. Results: We present a novel alternative O(n^3)-time dynamic programming algorithm for RNA folding that is amenable to heuristics that make it run in O(n) time and O(n) space, while producing a high-quality approximation to the optimal solution. Inspired by incremental parsing for context-free grammars in computational linguistics, our alternative dynamic programming algorithm scans the sequence in a left-to-right (5′-to-3′) direction rather than in a bottom-up fashion, which allows us to employ the effective beam pruning heuristic. Our work, though inexact, is the first RNA folding algorithm to achieve linear runtime (and linear space) without imposing constraints on the output structure. Surprisingly, our approximate search results in even higher overall accuracy on a diverse database of sequences with known structures. More interestingly, it leads to significantly more accurate predictions on the longest sequence families in that database (16S and 23S Ribosomal RNAs), as well as improved accuracies for long-range base pairs (500+ nucleotides apart), both of which are well known to be challenging for the current models. Availability and implementation: Our source code is available at https://github.com/LinearFold/LinearFold, and our webserver is at http://linearfold.org (sequence limit: 100 000 nt). Supplementary information: Supplementary data are available at Bioinformatics online.
-
Abstract Structured RNA lies at the heart of many central biological processes, from gene expression to catalysis. RNA structure prediction is not yet possible due to a lack of high-quality reference data associated with organismal phenotypes that could inform RNA function. We present GARNET (Gtdb Acquired RNa with Environmental Temperatures), a new database for RNA structural and functional analysis anchored to the Genome Taxonomy Database (GTDB). GARNET links RNA sequences to experimental and predicted optimal growth temperatures of GTDB reference organisms. Using GARNET, we develop sequence- and structure-aware RNA generative models, with overlapping triplet tokenization providing optimal encoding for a GPT-like model. Leveraging hyperthermophilic RNAs in GARNET and these RNA generative models, we identify mutations in ribosomal RNA that confer increased thermostability to the Escherichia coli ribosome. The GTDB-derived data and deep learning models presented here provide a foundation for understanding the connections between RNA sequence, structure, and function.
-
RNA is critical to a broad spectrum of biological and viral processes. This functional diversity results from the dynamic nature of RNA molecules, the variety of three-dimensional structures they can fold into, and a host of post-transcriptional chemical modifications. While there are many experimental techniques to study the structural dynamics of biomolecules, molecular dynamics simulations (MDS) play a significant role in complementing experimental data and providing mechanistic insights. The accuracy of the results obtained from MDS is determined by the underlying physical models, i.e., the force-fields, that steer the simulations. Though RNA force-fields have received a lot of attention in the last decade, they still lag behind their protein counterparts. The chemical diversity imparted by the RNA modifications adds another layer of complexity to an already challenging problem. Insight into the effect of RNA modifications upon RNA folding and dynamics is lacking due to the insufficiency or absence of relevant experimental data. This review provides an overview of the state of MDS of modified RNA, focusing on the challenges in parameterization of RNA modifications as well as insights into relevant reference experiments necessary for their calibration.
-
Abstract Structural plasticity is integral to RNA function; however, there are currently few methods to quantitatively resolve RNAs that have multiple structural states. NMR spectroscopy is a powerful approach for resolving conformational ensembles but is size-limited. Chemical probing is well-suited for large RNAs but provides limited structural and kinetics information. Here, we integrate the two approaches to visualize a two-state conformational ensemble for the central stem–loop 3 (SL3) of 7SK RNA, a critical element for 7SK RNA function in transcription regulation. We find that the SL3 distal end exchanges between two equally populated yet structurally distinct states in both isolated SL3 constructs and full-length 7SK RNA. We rationally designed constructs that lock SL3 into a single state and demonstrate that both chemical probing and NMR data fit to a linear combination of the two states. Comparison of vertebrate 7SK RNA sequences shows either or both states are highly conserved. These results provide new insights into 7SK RNA structural dynamics and demonstrate the utility of integrating chemical probing with NMR spectroscopy to gain quantitative insights into RNA conformational ensembles.

