skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Scalable and Interpretable Identification of Minimal Undesignable RNA Structure Motifs with Rotational Invariance
RNA design aims to find a sequence that folds with the highest probability into a designated target structure. However, certain structures are undesignable, meaning no sequence can fold into the target structure under the default (Turner) RNA folding model. Understanding the specific local structures (i.e., “motifs”) that contribute to undesignability is crucial for refining RNA folding models and determining the limits of RNA designability. Despite its importance, this problem has received very little attention, and previous efforts are neither scalable nor interpretable. We develop a new theoretical framework for motif (un-)designability, and design scalable and interpretable algorithms to identify minimal undesignable motifs within a given RNA secondary structure. Our approach establishes motif undesignability by searching for rival motifs, rather than exhaustively enumerating all (partial) sequences that could potentially fold into the motif. Furthermore, we exploit rotational invariance in RNA structures to detect, group and reuse equivalent motifs and to construct a database of unique minimal undesignable motifs. To achieve that, we propose a loop-pair graph representation for motifs and a recursive graph isomorphism algorithm for motif equivalence. Our algorithms successfully identify 24 unique minimal undesignable motifs among 18 undesignable puzzles from the Eterna100 benchmark. Surprisingly, we also find over 350 unique minimal undesignable motifs and 663 undesignable native structures in the ArchiveII dataset, drawn from a diverse set of RNA families.  more » « less
Award ID(s):
2330737
PAR ID:
10673295
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
SPRINGER
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. RNA design aims to find a sequence that folds with the highest probability into a designated target structure. However, certain structures are undesignable, meaning no sequence can fold into the target structure under the default (Turner) RNA folding model. Understanding the specific local structures (i.e., “motifs”) that contribute to undesignability is crucial for refining RNA folding models and determining the limits of RNA designability. Despite its importance, this problem has received very little attention, and previous efforts are neither scalable nor interpretable. We develop a new theoretical framework for motif (un-)designability, and design scalable and interpretable algorithms to identify minimal undesignable motifs within a given RNA secondary structure. Our approach establishes motif undesignability by searching for rival motifs, rather than exhaustively enumerating all (partial) sequences that could potentially fold into the motif. Furthermore, we exploit rotational invariance in RNA structures to detect, group and reuse equivalent motifs and to construct a database of unique minimal undesignable motifs. To achieve that, we propose a loop-pair graph representation for motifs and a recursive graph isomorphism algorithm for motif equivalence. Our algorithms successfully identify 24 unique minimal undesignable motifs among 18 undesignable puzzles from the Eterna100 benchmark. Surprisingly, we also find over 350 unique minimal undesignable motifs and 663 undesignable native structures in the ArchiveII dataset, drawn from a diverse set of RNA families. 
    more » « less
  2. RNA design aims to find a sequence that folds with the highest probability into a designated target structure. However, certain structures are undesignable, meaning no sequence can fold into the target structure under the default (Turner) RNA folding model. Understanding the specific local structures (i.e., “motifs”) that contribute to undesignability is crucial for refining RNA folding models and determining the limits of RNA designability. Despite its importance, this problem has received very little attention, and previous efforts are neither scalable nor interpretable. We develop a new theoretical framework for motif (un-)designability, and design scalable and interpretable algorithms to identify minimal undesignable motifs within a given RNA secondary structure. Our approach establishes motif undesignability by searching for rival motifs, rather than exhaustively enumerating all (partial) sequences that could potentially fold into the motif. Furthermore, we exploit rotational invariance in RNA structures to detect, group and reuse equivalent motifs and to construct a database of unique minimal undesignable motifs. To achieve that, we propose a loop-pair graph representation for motifs and a recursive graph isomorphism algorithm for motif equivalence. Our algorithms successfully identify 24 unique minimal undesignable motifs among 18 undesignable puzzles from the Eterna100 benchmark. Surprisingly, we also find over 350 unique minimal undesignable motifs and 663 undesignable native structures in the ArchiveII dataset, drawn from a diverse set of RNA families. 
    more » « less
  3. RNA design aims to find a sequence that folds into a designated target structure under a specific RNA folding model, also known as the inverse problem of RNA folding. While numerous RNA design methods have been invented to search for sequences capable of folding into a target structure under the default (Turner) RNA folding model, little attention has been given to the identification of undesignable structures. This work bridges the gap between RNA design and undesignability by introducing a series of theorems and algorithms aimed at identifying both undesignable structures and their causative local structural components, which we define asminimal undesignable motifs. We first present theorems that provide sufficient conditions for recognizing undesignability structures and propose efficient, theorem-guided algorithms to verify whether an RNA structure is undesignable. While such global undesignability sheds light on the limits of RNA design, identifying the specific motifs responsible for undesignability is critical for understanding RNA folding models and advancing design methodologies. To this end, we develop a new theoretical framework for motif undesignability and propose scalable and interpretable algorithms to identify minimal undesignable motifs within a given RNA secondary structure. Our approach establishes motif undesignability by searching for rival motifs, rather than exhaustively enumerating all (partial) sequences that could potentially fold into the motif. Furthermore, we exploit rotational invariance in RNA structures to detect, group, and reuse equivalent motifs and to construct a database of unique minimal undesignable motifs. To achieve that, we propose a loop-pair graph representation for motifs and a recursive graph isomorphism algorithm for motif equivalence. Our algorithms successfully identified 24 unique minimal undesignable motifs among 18 undesignable puzzles from the Eterna100 benchmark. Surprisingly, we also find over 350 unique minimal undesignable motifs and 663 undesignable native structures in the ArchiveII dataset, drawn from a diverse set of RNA families. Last but not least, we demonstrate that our theory and algorithms can handle motifs with external loops—a critical advancement given the substantial impact of external loops on the quantity, diversity, and designability of RNA structure motifs. 
    more » « less
  4. RNA design aims to find a sequence that can fold into a given (secondary) structure, which has wide applications in science and medicine. However, it has long been known that there are “undesignable structures” for which no sequence can fold into them according to the minimum free energy (MFE) criterion under the standard RNA folding energy model. Our previous work showed that undesignable structures can be effectively and efficiently identified by searching for rival structures. We further showed that there exist “minimal undesignable motifs” within those undesignable structures, where a (structural) motif is a set of consecutive loops and helices within a secondary structure. To better illustrate our theoretical findings, we built a motif server as a user-friendly visualization tool for undesignable RNA structures and motifs, as well as an interactive demo tool where the user can input a new structure and the server will compute and visualize any undesignable motifs within it on the fly. This server maintains a database of undesignable RNA structures and unique minimal undesignable RNA motifs, allowing the users to explore, visualize, analyze, and identify undesignable motifs in existing and new RNA structures. The importance of this server is that it provides a database of motifs for nanostructure design that should not be incorporated because these motifs are unlikely to be achievable. Availability: Our server is at https://linearfold.eecs.oregonstate.edu/motifserver. 
    more » « less
  5. RNA design is the search for a sequence or set of sequences that will fold into predefined structures, also known as the inverse problem of RNA folding. While numerous RNA design methods have been invented to find sequences capable of folding into a target structure, little attention has been given to the identification of undesignable structures according to the minimum free energy ( ) criterion under the Turner model. In this paper, we address this gap by first introducing mathematical theorems outlining sufficient conditions for recognizing undesignable structures, then proposing efficient algorithms, guided by these theorems, to verify the undesignability of RNA structures. Through the application of these theorems and algorithms to the Eterna100 puzzles, we demonstrate the ability to efficiently establish that 15 of the puzzles indeed fall within the category of undesignable structures. In addition, we provide specific insights from the study of undesignability, in the hope that it will enable more understanding of RNA folding and RNA design. Availability: Our source code is available at https://github.com/shanry/RNA-Undesign. 
    more » « less