Title: Metastable hybridization-based DNA information storage to allow rapid and permanent erasure
Abstract The potential of DNA as an information storage medium is rapidly growing due to advances in DNA synthesis and sequencing. However, the chemical stability of DNA makes it difficult to completely erase information encoded in DNA sequences. Here, we encode information in a DNA information solution, a mixture of true message- and false message-encoded oligonucleotides, enabling rapid and permanent erasure of information. True messages are differentiated by their hybridization to a “truth marker” oligonucleotide, and only true messages can be read; binding of the truth marker can be effectively randomized by even a brief exposure to elevated temperature. We show that 8 separate bitmap images can be stably encoded and read after storage at 25 °C for 65 days with an average of over 99% correct information recall, which extrapolates to a half-life of over 15 years at 25 °C. Heating to 95 °C for 5 minutes, however, permanently erases the message.
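The extrapolation from a 65-day observation to a multi-year half-life can be illustrated with a short calculation. The sketch below assumes simple first-order (exponential) decay of the retained fraction; that model, and the function names `half_life_days` and `half_life_years`, are assumptions made for illustration and are not taken from the paper, which may use a different kinetic model.

```python
import math

DAYS_PER_YEAR = 365.25


def half_life_days(fraction_retained: float, elapsed_days: float) -> float:
    """Half-life implied by one observation, ASSUMING first-order decay.

    With f(t) = exp(-k * t), a single measurement f(elapsed_days) gives
    k = -ln(f) / t, and the half-life is ln(2) / k.
    """
    if not 0.0 < fraction_retained < 1.0:
        raise ValueError("fraction_retained must lie strictly between 0 and 1")
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    decay_rate = -math.log(fraction_retained) / elapsed_days  # per day
    return math.log(2) / decay_rate


def half_life_years(fraction_retained: float, elapsed_days: float) -> float:
    """Same quantity as half_life_days, converted to years."""
    return half_life_days(fraction_retained, elapsed_days) / DAYS_PER_YEAR


if __name__ == "__main__":
    # Illustrative inputs only: the abstract reports "over 99%" after 65 days.
    for retained in (0.990, 0.992, 0.995):
        years = half_life_years(retained, 65)
        print(f"retained={retained:.3f} -> half-life of about {years:.1f} years")
```

Under this assumed model, a retained fraction of exactly 0.99 after 65 days gives a half-life somewhat below 15 years, so a value slightly above 0.99 is needed to exceed 15 years. This is consistent with the abstract's wording of "over 99%", but it does not confirm how the authors derived their figure.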
Award ID(s):
2019745
NSF-PAR ID:
10233582
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Nature Communications
Volume:
11
Issue:
1
ISSN:
2041-1723
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    DNA is a compelling alternative to non-volatile information storage technologies due to its information density, stability, and energy efficiency. Previous studies have used artificially synthesized DNA to store data and automated next-generation sequencing to read it back. Here, we report digital Nucleic Acid Memory (dNAM) for applications that require a limited amount of data to have high information density, redundancy, and copy number. In dNAM, data is encoded by selecting combinations of single-stranded DNA with (1) or without (0) docking-site domains. When self-assembled with scaffold DNA, staple strands form DNA origami breadboards. Information encoded into the breadboards is read by monitoring the binding of fluorescent imager probes using DNA-PAINT super-resolution microscopy. To enhance data retention, a multi-layer error correction scheme that combines fountain and bi-level parity codes is used. As a prototype, fifteen origami encoded with ‘Data is in our DNA!\n’ are analyzed. Each origami encodes unique data-droplet, index, orientation, and error-correction information. The error-correction algorithms fully recover the message when individual docking sites, or entire origami, are missing. Unlike other approaches to DNA-based data storage, reading dNAM does not require sequencing. As such, it offers an additional path to explore the advantages and disadvantages of DNA as an emerging memory material.

  2. Shared register emulations on top of message-passing systems provide the illusion of a simpler shared memory system, which can make the task of a system designer easier. Numerous shared register applications have a considerably high read-to-write ratio. Thus, having algorithms that make reads more efficient than writes is a fair trade-off. Typically, such algorithms for reads and writes are asymmetric and sacrifice the stringent consistency condition of atomicity, as it is impossible to have fast reads for multi-writer atomicity. Safety is a consistency condition that has gathered interest from both the systems and theory communities, as it is weaker than atomicity yet provides strong enough guarantees such as “strong consistency” or read-my-write consistency. One requirement assumed by many researchers is the reliable broadcast (RB) primitive, which ensures the all-or-none property during a broadcast. One drawback is that such a primitive takes 1.5 rounds to complete. This paper implements an efficient multi-writer multi-reader safe register without using a reliable broadcast primitive. Moreover, we provide fast reads, or one-shot reads: our read operation can be completed in one round of client-to-server communication. Of course, this comes at the price of requiring more servers than prior solutions that assume reliable broadcast. However, we show that this increased number of servers is indeed necessary, as we prove a tight bound on the number of servers required to implement Byzantine-fault-tolerant safe registers in a system without reliable broadcast. We extend our results to data stored using erasure coding as well. We present an emulation of a single-writer multi-reader safe register based on an MDS code. The use of an MDS code reduces storage cost and communication cost. On the negative side, we also show that to use an MDS code and achieve one-shot reads at the same time, we need even more servers.
  3. Countering misinformation can reduce belief in the moment, but corrective messages quickly fade from memory. We tested whether the longer-term impact of fact-checks depends on when people receive them. In two experiments (total N = 2,683), participants read true and false headlines taken from social media. In the treatment conditions, “true” and “false” tags appeared before, during, or after participants read each headline. Participants in a control condition received no information about veracity. One week later, participants in all conditions rated the same headlines’ accuracy. Providing fact-checks after headlines (debunking) improved subsequent truth discernment more than providing the same information during (labeling) or before (prebunking) exposure. This finding informs the cognitive science of belief revision and has practical implications for social media platform designers.
  4. Private information retrieval (PIR) allows a user to retrieve a desired message out of K possible messages from N databases without revealing the identity of the desired message. There has been significant recent progress on understanding the fundamental information-theoretic limits of PIR, and in particular the download cost of PIR for several variations. The majority of existing works, however, assume the presence of replicated databases, each storing all K messages. In this work, we consider the problem of PIR from storage constrained databases. Each database has a storage capacity of μKL bits, where K is the number of messages, L is the size of each message in bits, and μ ∈ [1/N, 1] is the normalized storage. In the storage constrained PIR problem, there are two key design questions: a) how to store content across the databases under storage constraints; and b) how to construct schemes that allow efficient PIR through storage constrained databases. The main contribution of this work is a general achievable scheme for PIR from storage constrained databases for any value of storage. In particular, for any (N, K), with normalized storage μ = t/N, where the parameter t can take integer values t ∈ {1, 2, ..., N}, we show that our proposed PIR scheme achieves a download cost of (1 + 1/t + 1/t^2 + ⋯ + 1/t^(K-1)). The extreme case when μ = 1 (i.e., t = N) corresponds to the setting of replicated databases with full storage. For this extremal setting, our scheme recovers the information-theoretically optimal download cost characterized by Sun and Jafar as (1 + 1/N + ⋯ + 1/N^(K-1)). For the other extreme, when μ = 1/N (i.e., t = 1), the proposed scheme achieves a download cost of K. The most interesting aspect of the result is that for intermediate values of storage, i.e., 1/N < μ < 1, the proposed scheme can strictly outperform memory-sharing between extreme values of storage.
  5. With expanding data storage capacity needs, DNA as an alternative archival storage medium offers potential advantages, including higher density and data retention for information storage [1,2]. However, the majority of DNA-based memory systems are write-once and read-only, although a few studies have suggested overwriting digital data on existing DNA using chemical modifications of bases [3]. Using those strategies requires constantly updating the entire data coding and iteratively synthesizing the DNA pool. Therefore, considering the complexity and cost, those methods need some amendments to become industrially scalable. Inspired by magnetic tapes [4] and multisession CDs [5], in this work we created a DNA storage system, coined the Molecular File System (MolFS), to organize, store, and edit digital information in a DNA pool. MolFS uses DNA pools that consist of multiple sessions, where each session contains data block and unique index sections to store and edit the files. We used indexes to describe the file system hierarchy, locate files along with the blocks, recognize the sessions, and identify the file versions. This approach reduces the editing cost compared to the state-of-the-art methods, and editing or adding data requires only synthesizing a new DNA pool containing the DNA session of the differential file. As proof of concept, we encoded 2.3 Kbytes of graphic and text data into 2 DNA pools. To edit the existing DNA pool, we added 8 new differential data blocks to existing pools, reaching 13.8 Kbytes of data stored from sessions 1 to 5. We performed nanopore sequencing and recovered the data from the MolFS sessions accurately and precisely.
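The download-cost expression quoted in item 4 above, 1 + 1/t + ⋯ + 1/t^(K-1) with μ = t/N, can be evaluated exactly for small parameters. The sketch below is illustrative only: the function names are hypothetical, and the memory-sharing baseline is modeled here as a linear interpolation between the two extreme storage points (μ = 1/N with cost K, and μ = 1 with the replicated-database cost), which is an assumption about how that baseline is defined rather than a formula taken from the abstract.

```python
from fractions import Fraction


def download_cost(t: int, num_messages: int) -> Fraction:
    """Evaluate 1 + 1/t + 1/t^2 + ... + 1/t^(K-1) exactly.

    t is the integer storage parameter (mu = t / N), K = num_messages.
    """
    if t < 1 or num_messages < 1:
        raise ValueError("t and num_messages must be positive integers")
    return sum((Fraction(1, t**i) for i in range(num_messages)), Fraction(0))


def memory_sharing_cost(num_databases: int, num_messages: int, t: int) -> Fraction:
    """ASSUMED baseline: time-share between mu = 1/N and mu = 1.

    The weight alpha on the mu = 1/N scheme is chosen so that the average
    storage equals t / N: alpha / N + (1 - alpha) = t / N.
    """
    n, k = num_databases, num_messages
    if n < 2 or not 1 <= t <= n:
        raise ValueError("need N >= 2 and 1 <= t <= N")
    alpha = Fraction(n - t, n - 1)
    return alpha * download_cost(1, k) + (1 - alpha) * download_cost(n, k)


if __name__ == "__main__":
    n, k = 4, 3  # illustrative values, not from the abstract
    for t in range(1, n + 1):
        print(t, download_cost(t, k), memory_sharing_cost(n, k, t))
```

For the illustrative case N = 4, K = 3, t = 2, the expression evaluates to 7/4, while the interpolated baseline gives 39/16. This agrees with the abstract's statement that the scheme can strictly outperform memory-sharing at intermediate storage values. The two quantities coincide at t = 1 and t = N by construction.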