Title: Formal Analysis of Rewriting System Representing RNA Folding
Predicting RNA structure is an important problem in understanding biological processes in living organisms. Computational models have been created to study these processes with the aim of unravelling RNA structure. This work describes a novel formalism for the formal analysis of RNA structure prediction. A graph rewriting system is formalized to represent the structural dynamics of RNA under uncertainty. Probabilistic model checking is performed on queries about structural properties of RNA. Experiments were conducted to evaluate the computational feasibility of the model.
Journal Name:
Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies - BIOINFORMATICS (BIOSTEC 2023); ISBN 978-989-758-631-6; ISSN 2184-4305; SciTePress
Page Range / eLocation ID:
235 to 242
Sponsoring Org:
National Science Foundation
More Like this
  1. Riboswitches are conserved structural ribonucleic acid (RNA) sensors that are mainly found to regulate a large number of genes/operons in bacteria. Presently, >50 bacterial riboswitch classes have been discovered, but only the thiamine pyrophosphate riboswitch class has been detected in a few eukaryotes such as fungi, plants and algae. One of the most important challenges in riboswitch research is to discover existing riboswitch classes in eukaryotes and to understand the evolution of bacterial riboswitches. However, traditional search methods for riboswitch detection have failed to detect eukaryotic riboswitches beyond this one class, or any distant structural homologs of riboswitches. We developed a novel approach based on inverse RNA folding that attempts to find sequences that match the shape of the target structure with minimal sequence conservation based on key nucleotides that interact directly with the ligand. Then, to support our matched candidates, we expanded the results into a covariance model representing similar sequences preserving the structure. Our method transforms a structure-based search into a sequence-based search that considers the conservation of secondary structure shape and ligand-binding residues. This method enables us to identify a potential structural candidate in fungi that could be the distant homolog of bacterial purine riboswitches. Further, phylogenomic analysis and evolutionary distribution of this structural candidate indicate that its most likely point of origin in these organisms is associated with the loss of traditional purine riboswitches. The computational approach could be applicable to other domains and problems in RNA research.
  2. A synthetic biology approach toward constructing an RNA-based genome expands our understanding of living things and opens avenues for technological advancement. For the precise design of an artificial RNA replicon either from scratch or based on a natural RNA replicon, understanding structure–function relationships of RNA sequences is critical. However, our knowledge remains limited to a few particular structural elements intensively studied so far. Here, we conducted a series of site-directed mutagenesis studies of yeast narnaviruses ScNV20S and ScNV23S, perhaps the simplest natural autonomous RNA replicons, to identify RNA elements required for maintenance and replication. RNA structure disruption corresponding to various portions of the entire narnavirus genome suggests that pervasive RNA folding, in addition to the precise secondary structure of genome termini, is essential for maintenance of the RNA replicon in vivo. Computational RNA structure analyses suggest that this scenario likely applies to other “narna-like” viruses. This finding implies selective pressure on these simplest autonomous natural RNA replicons to fold into a unique structure that acquires both thermodynamic and biological stability. We propose the importance of pervasive RNA folding for the design of RNA replicons that could serve as a platform for in vivo continuous evolution as well as an interesting model to study the origin of life.
  3. This work seeks to remedy two deficiencies in the current nucleic acid nanotechnology software environment: the lack of both a fast and user-friendly visualization tool and a standard for structural analyses of simulated systems. We introduce here oxView, a web browser-based visualizer that can load structures with over 1 million nucleotides, create videos from simulation trajectories, and allow users to perform basic edits to DNA and RNA designs. We additionally introduce open-source software tools for extracting common structural parameters to characterize large DNA/RNA nanostructures simulated using the coarse-grained modeling tool, oxDNA, which has grown in popularity in recent years and is frequently used to prototype new nucleic acid nanostructural designs, model biophysics of DNA/RNA processes, and rationalize experimental results. The newly introduced software tools facilitate the computational characterization of DNA/RNA designs by providing multiple analysis scripts, including mean structures and structure flexibility characterization, hydrogen bond fraying, and interduplex angles. The output of these tools can be loaded into oxView, allowing users to interact with the simulated structure in a 3D graphical environment and modify the structures to achieve the required properties. We demonstrate these newly developed tools by applying them to design and analysis of a range of DNA/RNA nanostructures.
  4. Nearly two decades after Westhof and Michel first proposed that RNA tetraloops may interact with distal helices, tetraloop–receptor interactions have been recognized as ubiquitous elements of RNA tertiary structure. The unique architecture of GNRA tetraloops (N = any nucleotide, R = purine) enables interaction with a variety of receptors, e.g., helical minor grooves and asymmetric internal loops. The most common example of the latter is the GAAA tetraloop–11 nt tetraloop receptor motif. Biophysical characterization of this motif provided evidence for the modularity of RNA structure, with applications spanning improved crystallization methods to RNA tectonics. In this review, we identify and compare types of GNRA tetraloop–receptor interactions. Then we explore the abundance of structural, kinetic, and thermodynamic information on the frequently occurring and most widely studied GAAA tetraloop–11 nt receptor motif. Studies of this interaction have revealed powerful paradigms for structural assembly of RNA, as well as providing new insights into the roles of cations, transition states and protein chaperones in RNA folding pathways. However, further research will clearly be necessary to characterize other tetraloop–receptor and long-range tertiary binding interactions in detail – an important milestone in the quantitative prediction of free energy landscapes for RNA folding.
  5. Motivation: RNA secondary structure prediction is widely used to understand RNA function. Recently, there has been a shift away from the classical minimum free energy methods to partition function-based methods that account for folding ensembles and can therefore estimate structure and base pair probabilities. However, the classical partition function algorithm scales cubically with sequence length, and is therefore prohibitively slow for long sequences. This slowness is even more severe than cubic-time free energy minimization due to a substantially larger constant factor in runtime. Results: Inspired by the success of our recent LinearFold algorithm that predicts the approximate minimum free energy structure in linear time, we design a similar linear-time heuristic algorithm, LinearPartition, to approximate the partition function and base-pairing probabilities, which is shown to be orders of magnitude faster than Vienna RNAfold and CONTRAfold (e.g. 2.5 days versus 1.3 min on a sequence with length 32 753 nt). More interestingly, the resulting base-pairing probabilities are even better correlated with the ground-truth structures. LinearPartition also leads to a small accuracy improvement when used for downstream structure prediction on families with the longest length sequences (16S and 23S rRNAs), as well as a substantial improvement on long-distance base pairs (500+ nt apart). Availability and implementation: Code:; Server: Supplementary information: Supplementary data are available at Bioinformatics online.
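To illustrate why the classical partition function computation mentioned in item 5 scales cubically with sequence length, here is a minimal Python sketch. It is not LinearPartition and it does not use the energy models of Vienna RNAfold or CONTRAfold. It assumes a toy model in which every allowed base pair contributes the same energy; the function name `partition_function` and the parameters `pair_energy`, `rt` and `min_loop` are hypothetical choices made for this example only.

```python
import math

# Assumption: only canonical and G-U wobble pairs are allowed.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_energy=-1.0, rt=1.0, min_loop=3):
    """Sum of Boltzmann weights over all nested secondary structures of seq.

    Toy energy model (an assumption of this sketch): each base pair adds
    pair_energy, and a pair must enclose at least min_loop unpaired bases.
    """
    n = len(seq)
    if n == 0:
        return 1.0
    weight = math.exp(-pair_energy / rt)  # Boltzmann factor of one pair

    # Z[i][j] holds the partition function of the subsequence seq[i..j].
    Z = [[1.0] * n for _ in range(n)]

    def z(i, j):
        # An empty interval has exactly one structure (no pairs).
        return Z[i][j] if i <= j else 1.0

    # Three nested loops over (span, i, k) give the cubic running time.
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            total = z(i, j - 1)  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    total += z(i, k - 1) * weight * z(k + 1, j - 1)
            Z[i][j] = total
    return Z[0][n - 1]
```

For the sequence `GAAAC` with the default parameters, only two structures exist under this toy model (fully unpaired, or the single G-C pair closing a three-base loop), so the function returns 1 + e. Each base either stays unpaired or pairs with exactly one earlier base, so every structure is counted once; the innermost loop over `k` is what a linear-time heuristic would have to prune.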