skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Domain-Based Nucleic-Acid Minimum Free Energy: Algorithmic Hardness and Parameterized Bounds
Molecular programmers and nanostructure engineers use domain-level design to abstract away messy DNA/RNA sequence, chemical and geometric details. Such domain-level abstractions are enforced by sequence design principles and provide a key principle that allows scaling up of complex multistranded DNA/RNA programs and structures. Determining the most favoured secondary structure, or Minimum Free Energy (MFE), of a set of strands, is typically studied at the sequence level but has seen limited domain-level work. We analyse the computational complexity of MFE for multistranded systems in a simple setting were we allow only 1 or 2 domains per strand. On the one hand, with 2-domain strands, we find that the MFE decision problem is NP-complete, even without pseudoknots, and requires exponential time algorithms assuming SAT does. On the other hand, in the simplest case of 1-domain strands there are efficient MFE algorithms for various binding modes. However, even in this single-domain case, MFE is P-hard for promiscuous binding, where one domain may bind to multiple as experimentally used by Nikitin [Nat Chem., 2023], which in turn implies that strands consisting of a single domain efficiently implement arbitrary Boolean circuits.  more » « less
Award ID(s):
2329918
PAR ID:
10562753
Author(s) / Creator(s):
; ; ; ; ; ; ;
Editor(s):
Seki, Shinnosuke; Stewart, Jaimie Marie
Publisher / Repository:
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Date Published:
Volume:
314
ISSN:
1868-8969
ISBN:
978-3-95977-344-7
Page Range / eLocation ID:
314-314
Subject(s) / Keyword(s):
Domain-based DNA designs minimum free energy efficient algorithms NP-hard P-hard NC fixed-parameter tractable Theory of computation → Problems, reductions and completeness Computer systems organization → Molecular computing
Format(s):
Medium: X Size: 24 pages; 1119354 bytes Other: application/pdf
Size(s):
24 pages 1119354 bytes
Right(s):
Creative Commons Attribution 4.0 International license; info:eu-repo/semantics/openAccess
Sponsoring Org:
National Science Foundation
More Like this
  1. Belazzougui, Djamal; Ouangraoua, Aïda (Ed.)
    The problem of designing an RNA sequence v that encodes for a given target protein w plays an important role in messenger RNA (mRNA) vaccine design. Due to codon degeneracy, there exist exponentially many RNA sequences for a single target protein. These candidate RNA sequences may adopt different secondary structure conformations with varying minimum free energy (MFE), affecting their thermodynamic stability and consequently mRNA half-life. In addition, species-specific codon usage bias, as measured by the codon adaptation index (CAI), also plays an essential role in translation efficiency. While previous works have focused on optimizing either MFE or CAI, more recent works have shown the merits of optimizing both objectives. Importantly, there is a trade-off between MFE and CAI, i.e. optimizing one objective is at the expense of the other. Here, we formulate the Pareto Optimal RNA Design problem, seeking the set of Pareto optimal solutions for which no other solution exists that is better in terms of both MFE and CAI. We introduce DERNA (DEsign RNA), which uses the weighted sum method to enumerate the Pareto front by optimizing convex combinations of both objectives. DERNA uses dynamic programming to solve each convex combination in O(|w|³) time and O(|w|²) space. Compared to a previous approach that only optimizes MFE, we show on a benchmark dataset that DERNA obtains solutions with identical MFE but superior CAI. Additionally, we show that DERNA matches the performance in terms of solution quality of LinearDesign, a recent approach that similarly seeks to balance MFE and CAI. Finally, we demonstrate our method’s potential for mRNA vaccine design using SARS-CoV-2 spike as the target protein. 
    more » « less
  2. Chen, Ho-Lin; Evans, Constantine G. (Ed.)
    Polynomial time dynamic programming algorithms play a crucial role in the design, analysis and engineering of nucleic acid systems including DNA computers and DNA/RNA nanostructures. However, in complex multistranded or pseudoknotted systems, computing the minimum free energy (MFE), and partition function of nucleic acid systems is NP-hard. Despite this, multistranded and/or pseudoknotted systems represent some of the most utilised and successful systems in the field. This leaves open the tempting possibility that many of the kinds of multistranded and/or pseudoknotted systems we wish to engineer actually fall into restricted classes, that do in fact have polynomial time algorithms, but we've just not found them yet. Here, we give polynomial time algorithms for MFE and partition function calculation for a restricted kind of multistranded system called the 1D scaffolded DNA computer. This model of computation thermodynamically favours correct outputs over erroneous states, simulates finite state machines in 1D and Boolean circuits in 2D, and is amenable to DNA storage applications. In an effort to begin to ask the question of whether we can naturally compare the expressivity of nucleic acid systems based on the computational complexity of prediction of their preferred energetic states, we show our MFE problem is in logspace (the complexity class L), making it perhaps one of the simplest known, natural, nucleic acid MFE problems. Finally, we provide a stochastic kinetic simulator for the 1D scaffolded DNA computer and evaluate strategies for efficiently speeding up this thermodynamically favourable system in a constant-temperature kinetic regime. 
    more » « less
  3. Estrogen receptor alpha (ERα) is a ligand-responsive transcription factor critical for sex determination and development. Recent reports challenge the canonical view of ERα function by suggesting an activity beyond binding dsDNA at estrogen-responsive promotor elements: association with RNAs in vivo. Whether these interactions are direct or indirect remains unknown, which limits the ability to understand the extent, specificity, and biological role of ERα-RNA binding. Here we demonstrate that an extended DNA-binding domain of ERα directly binds a wide range of RNAs in vitro with structural specificity. ERα binds RNAs that adopt a range of hairpin-derived structures independent of sequence, while interacting poorly with single- and double-stranded RNA. RNA affinities are only 4-fold weaker than consensus dsDNA and significantly tighter than nonconsensus dsDNA sequences. Moreover, RNA binding is competitive with DNA binding. Together, these data show that ERα utilizes an extended DNA-binding domain to achieve a high-affinity/low-specificity mode for interacting with RNA. 
    more » « less
  4. DNA nanotechnology uses oligonucleotide strands to assemble molecular structures capable of performing useful operations. Here, we assembled a multifunctional prototype DNA nanodevice, DOCTR, that recognizes a single nucleotide mutation in a cancer marker RNA. The nanodevice then cuts out a signature sequence and uses it as an activator for a "therapeutic" function, namely, the cleavage of another RNA sequence. The proposed design is a prototype for a gene therapy DNA machine that cleaves a housekeeping gene only in the presence of a cancer-causing point mutation and suppresses cancer cells exclusively with minimal side effects to normal cells. 
    more » « less
  5. Abstract The Caulobacter crescentus cell cycle-regulated DNA methyltransferase (CcrM) methylates the adenine of hemimethylated GANTC after replication. Here we present the structure of CcrM in complex with double-stranded DNA containing the recognition sequence. CcrM contains an N-terminal methyltransferase domain and a C-terminal nonspecific DNA-binding domain. CcrM is a dimer, with each monomer contacting primarily one DNA strand: the methyltransferase domain of one molecule binds the target strand, recognizes the target sequence, and catalyzes methyl transfer, while the C-terminal domain of the second molecule binds the non-target strand. The DNA contacts at the 5-base pair recognition site results in dramatic DNA distortions including bending, unwinding and base flipping. The two DNA strands are pulled apart, creating a bubble comprising four recognized base pairs. The five bases of the target strand are recognized meticulously by stacking contacts, van der Waals interactions and specific Watson–Crick polar hydrogen bonds to ensure high enzymatic specificity. 
    more » « less