Search for: All records

Award ID contains: 2330628


  1. Free, publicly-accessible full text available December 1, 2026
  2. Shao, Mingfu (Ed.)
    Identifying novel and functional RNA structures remains a significant challenge in RNA motif design and is crucial for developing RNA-based therapeutics. Here we introduce a computational topology-based approach with unsupervised machine-learning algorithms to estimate the database size and content of RNA-like graph topologies. Specifically, we apply graph theory enumeration to generate all 110,667 possible 2D dual graphs for vertex numbers ranging from 2 to 9. Among them, only 0.11% (121 dual graphs) correspond to approximately 200,000 known RNA atomic fragments/substructures (collected in 2021) using the RNA-as-Graphs (RAG) framework. The remaining 99.89% of the dual graphs may be RNA-like or non-RNA-like. To determine which dual graphs in the 99.89% hypothetical set are more likely to be associated with RNA structures, we apply computational topology descriptors using the Persistent Spectral Graphs (PSG) method to characterize each graph using 19 PSG-based features and use clustering algorithms that partition all possible dual graphs into two clusters. The cluster with the higher percentage of known dual graphs for RNA is defined as the “RNA-like” cluster, while the other is considered the “non-RNA-like” cluster. The distance between each dual graph and the center of the RNA-like cluster represents the likelihood of it belonging to RNA structures. From validation, our PSG-based RNA-like cluster includes 97.3% of the 121 known RNA dual graphs, suggesting good performance. Furthermore, 46.017% of the hypothetical RNAs are predicted to be RNA-like. Among the top 15 graphs identified as high-likelihood candidates for novel RNA motifs, 4 were confirmed from the RNA dataset collected in 2022. Significantly, we observe that all the top 15 RNA-like dual graphs can be separated into multiple subgraphs, whereas the top 15 non-RNA-like dual graphs tend not to have any subgraphs (subgraphs preserve pseudoknots and junctions). Moreover, a significant topological difference between top RNA-like and non-RNA-like graphs is evident when comparing their topological features (e.g., Betti-0 and Betti-1 numbers). These findings provide valuable insights into the size of the RNA motif universe and RNA design strategies, offering a novel framework for predicting RNA graph topologies and guiding the discovery of novel RNA motifs and perhaps anti-viral therapeutics by subgraph assembly.
    Free, publicly-accessible full text available July 15, 2026
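The cluster-then-rank procedure described in the abstract of item 2 can be illustrated with a minimal sketch. It rests on several assumptions that are not in the source: the abstract says only "clustering algorithms", so plain two-cluster k-means is swapped in here; the toy 2-component feature vectors stand in for the 19 PSG-based features; and the names `two_means` and `rank_rna_like` are hypothetical.

```python
import math
import random


def two_means(points, iters=50, seed=0):
    """Lloyd's k-means with k=2 on a list of equal-length tuples.

    Assumption: k-means is used for illustration only; the source does not
    name the clustering algorithm.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, 2)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min((0, 1), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                new_centers.append(centers[c])  # keep an empty cluster's center
                continue
            new_centers.append(
                tuple(sum(col) / len(members) for col in zip(*members)))
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return centers, labels


def rank_rna_like(features, known):
    """Label the cluster richer in known graphs as RNA-like, then rank the
    hypothetical (not yet known) graphs in it by distance to its center."""
    centers, labels = two_means(features)
    fraction_known = []
    for c in (0, 1):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        fraction_known.append(
            sum(known[i] for i in idx) / len(idx) if idx else 0.0)
    rna = 0 if fraction_known[0] >= fraction_known[1] else 1
    candidates = [i for i in range(len(features))
                  if labels[i] == rna and not known[i]]
    # Smaller distance to the RNA-like center = higher assumed likelihood.
    candidates.sort(key=lambda i: math.dist(features[i], centers[rna]))
    return rna, labels, candidates


# Toy data (hypothetical): 7 graphs, 2 features each, 2 of them "known".
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (0.3, 0.3),
            (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
known = [True, True, False, False, False, False, False]
rna_cluster, labels, ranked = rank_rna_like(features, known)
print(ranked)  # indices of hypothetical graphs, most RNA-like first
```

In this sketch the ranking covers only graphs that fall inside the RNA-like cluster and are not already known, which mirrors the abstract's idea of proposing high-likelihood candidates for novel motifs.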
  3. Human immunodeficiency virus (HIV) continues to be a threat to public health. An emerging technique with promise in the context of fighting HIV type 1 (HIV-1) focuses on targeting ribosomal frameshifting. A crucial –1 programmed ribosomal frameshift (PRF) has been observed in several pathogenic viruses, including HIV-1. Altered folds of the HIV-1 RNA frameshift element (FSE) have been shown to alter frameshifting efficiency. Here, we use RNA-As-Graphs (RAG), a graph-theory-based framework for representing and analyzing RNA secondary structures, to perform conformational analysis in motif space to propose how sequence length may influence folding patterns. This combined analysis, along with all-atom modeling and experimental testing of our designed mutants, has already proven valuable for the SARS-CoV-2 FSE. As a first step to launching the same computational/experimental approach for HIV-1, we compare prior experiments and perform SHAPE-guided 2D-fold predictions for the HIV-1 FSE embedded in increasing sequence contexts and predict structure-altering mutations. We find a highly stable upper stem and highly flexible lower stem for the core FSE, with a three-way junction connecting to other motifs at increasing lengths. In particular, we find little support for a pseudoknot or triplex interaction in the core FSE, although pseudoknots can form separately as a connective motif at longer sequences. We also identify sensitive residues in the upper stem and central loop that, when minimally mutated, alter the core stem loop folding. These insights into the FSE fold and structure-altering mutations can be further pursued by all-atom simulations and experimental testing to advance the mechanistic understanding and therapeutic strategies for HIV-1.
    Free, publicly-accessible full text available July 1, 2026
  4. Histone modifications play a crucial role in regulating chromatin architecture and gene expression. Here we develop a multiscale model for incorporating methylation in our nucleosome-resolution physics-based chromatin model to investigate the mechanisms by which H3K9 and H3K27 trimethylation (H3K9me3 and H3K27me3) influence chromatin structure and gene regulation. We apply three types of energy terms for this purpose: short-range potentials are derived from all-atom molecular dynamics simulations of wildtype and methylated chromatosomes, which revealed subtle local changes; medium-range potentials are derived by incorporating contacts between HP1 and nucleosomes modified by H3K9me3, to incorporate experimental results of enhanced contacts for short chromatin fibers (12 nucleosomes); for long-range interactions we identify H3K9me3- and H3K27me3-associated contacts based on Hi-C maps with a machine learning approach. These combined multiscale effects can model methylation as a first approximation in our mesoscale chromatin model, and applications to gene systems offer new insights into the epigenetic regulation of genomes mediated by H3K9me3 and H3K27me3. 
    Free, publicly-accessible full text available March 7, 2026
  5. Frameshifting is an essential mechanism employed by many viruses including coronaviruses to produce viral proteins from a compact RNA genome. It is facilitated by specific RNA folds in the frameshift element (FSE), which has emerged as an important therapeutic target. For SARS-CoV-2, a specific 3-stem pseudoknot has been identified to stimulate frameshifting. However, prior studies and our RNA-As-Graphs analysis coupled to chemical reactivity experiments revealed other folds, including a different pseudoknot. Although structural plasticity has been proposed to play a key role in frameshifting, paths between different FSE RNA folds have not yet been identified. Here, we capture atomic-level transition pathways between two key FSE pseudoknots by transition path sampling coupled to Markov State Modeling and our BOLAS free energy method. We reveal multiple transition paths within a heterogeneous, multihub conformational landscape. A shared folding mechanism involves RNA stem unpairing followed by a 5′-chain end release. Significantly, this pseudoknot transition critically tunes the tension through the RNA spacer region and places the viral RNA in the narrow ribosomal channel. Our work further explains the role of the alternative pseudoknot in ribosomal pausing and clarifies why the experimentally captured pseudoknot is preferred for frameshifting. Our capturing of this large-scale transition of RNA secondary and tertiary structure highlights the complex pathways of biomolecules and the inherent multifarious aspects that viruses developed to ensure virulence and survival. This enhanced understanding of viral frameshifting also provides insights to target key transitions for therapeutic applications. Our methods are generally applicable to other large-scale biomolecular transitions.