Abstract DNA has emerged as a promising material to address growing data storage demands. We recently demonstrated a structure-based DNA data storage approach where DNA probes are spatially oriented on the surface of DNA origami and decoded using DNA-PAINT. In this approach, larger origami structures could improve the efficiency of reading and writing data. However, larger origami require long single-stranded DNA scaffolds that are not commonly available. Here, we report the engineering of a novel longer DNA scaffold designed to produce a larger rectangle origami needed to expand the origami-based digital nucleic acid memory (dNAM) approach. We confirmed that this scaffold self-assembled into the correct origami platform and correctly positioned DNA data strands using atomic force microscopy and DNA-PAINT super-resolution microscopy. This larger structure enables a 67% increase in the number of data points per origami and will support efforts to efficiently scale up origami-based dNAM.
An alternative approach to nucleic acid memory
Abstract DNA is a compelling alternative to non-volatile information storage technologies due to its information density, stability, and energy efficiency. Previous studies have used artificially synthesized DNA to store data and automated next-generation sequencing to read it back. Here, we report digital Nucleic Acid Memory (dNAM) for applications that require a limited amount of data to have high information density, redundancy, and copy number. In dNAM, data is encoded by selecting combinations of single-stranded DNA with (1) or without (0) docking-site domains. When self-assembled with scaffold DNA, staple strands form DNA origami breadboards. Information encoded into the breadboards is read by monitoring the binding of fluorescent imager probes using DNA-PAINT super-resolution microscopy. To enhance data retention, a multi-layer error correction scheme that combines fountain and bi-level parity codes is used. As a prototype, fifteen origami encoded with ‘Data is in our DNA!\n’ are analyzed. Each origami encodes unique data-droplet, index, orientation, and error-correction information. The error-correction algorithms fully recover the message when individual docking sites, or entire origami, are missing. Unlike other approaches to DNA-based data storage, reading dNAM does not require sequencing. As such, it offers an additional path to explore the advantages and disadvantages of DNA as an emerging memory material.
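The encoding idea described above (a docking site present reads as 1, absent as 0, with parity information used to recover missing sites) can be illustrated with a small sketch. This is a deliberately simplified stand-in, not the paper's scheme: it uses plain row and column parity over a small grid instead of the fountain and bi-level parity codes named in the abstract, and the grid size and the function names `bits_from_text`, `add_parity`, and `correct_single_error` are hypothetical.

```python
from typing import List

Grid = List[List[int]]  # 1 = docking site present, 0 = docking site absent


def bits_from_text(text: str) -> List[int]:
    """Turn ASCII text into a flat list of bits, most significant bit first."""
    return [int(b) for ch in text.encode("ascii") for b in format(ch, "08b")]


def add_parity(data: Grid) -> Grid:
    """Append an even-parity bit to every row, then an even-parity row."""
    rows = [row + [sum(row) % 2] for row in data]
    parity_row = [sum(col) % 2 for col in zip(*rows)]
    return rows + [parity_row]


def correct_single_error(grid: Grid) -> Grid:
    """Flip the one site where a failing row check and column check cross."""
    bad_rows = [i for i, row in enumerate(grid) if sum(row) % 2]
    bad_cols = [j for j, col in enumerate(zip(*grid)) if sum(col) % 2]
    fixed = [row[:] for row in grid]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        fixed[bad_rows[0]][bad_cols[0]] ^= 1
    return fixed


# Assumed toy layout: one ASCII character spread over a 2 x 4 data grid.
bits = bits_from_text("D")
encoded = add_parity([bits[:4], bits[4:]])

# Simulate one docking site that failed to appear in the image (1 read as 0).
damaged = [row[:] for row in encoded]
damaged[0][1] = 0
recovered = correct_single_error(damaged)
```

Simple two-dimensional parity of this kind can repair only one wrong site per grid. Recovering from entire missing origami, as the abstract reports, requires redundancy across origami, which is the role the fountain code plays in the published scheme.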
- Award ID(s):
- 1807809
- PAR ID:
- 10223239
- Publisher / Repository:
- Nature Publishing Group
- Date Published:
- Journal Name:
- Nature Communications
- Volume:
- 12
- Issue:
- 1
- ISSN:
- 2041-1723
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Background Third-generation single molecule sequencing technologies can sequence long reads, which is advancing the frontiers of genomics research. However, their high error rates prohibit accurate and efficient downstream analysis. This difficulty has motivated the development of many long read error correction tools, which tackle this problem through sampling redundancy and/or leveraging accurate short reads of the same biological samples. Existing studies to assess these tools use simulated data sets, and are not sufficiently comprehensive in the range of software covered or the diversity of evaluation measures used. Results In this paper, we present a categorization and review of long read error correction methods, and provide a comprehensive evaluation of the corresponding long read error correction tools. Leveraging recent real sequencing data, we establish benchmark data sets and set up evaluation criteria for a comparative assessment which includes quality of error correction as well as run-time and memory usage. We study how trimming and long read sequencing depth affect error correction in terms of length distribution and genome coverage post-correction, and the impact of error correction performance on an important application of long reads, genome assembly. We provide guidelines for practitioners for choosing among the available error correction tools and identify directions for future research. Conclusions Despite the high error rate of long reads, the state-of-the-art correction tools can achieve high correction quality. When short reads are available, the best hybrid methods outperform non-hybrid methods in terms of correction quality and computing resource usage. When choosing tools, practitioners should be careful with the few correction tools that discard reads, and should check the effect of error correction tools on downstream analysis.
Our evaluation code is available as open source at https://github.com/haowenz/LRECE.
-
Abstract Deoxyribonucleic acid (DNA) has emerged as a promising building block for next-generation ultra-high density storage devices. Although DNA has high durability and extremely high density in nature, its potential as the basis of storage devices is currently hindered by limitations such as expensive and complex fabrication processes and time-consuming read–write operations. In this article, we propose the use of a DNA crossbar array architecture for an electrically readable read-only memory (DNA-ROM). While information can be ‘written’ error-free to a DNA-ROM array using appropriate sequence encodings, its read accuracy can be affected by several factors such as array size, interconnect resistance, and Fermi energy deviations from HOMO levels of the DNA strands employed in the crossbar. We study the impact of array size and interconnect resistance on the bit error rate of a DNA-ROM array through extensive Monte Carlo simulations. We have also analyzed the performance of our proposed DNA crossbar array for an image storage application, as a function of array size and interconnect resistance. While we expect that future advances in bioengineering and materials science will address some of the fabrication challenges associated with DNA crossbar arrays, we believe that the comprehensive body of results we present in this paper establishes the technical viability of DNA crossbar arrays as low power, high-density storage devices. Finally, our analysis of array performance vis-à-vis interconnect resistance should provide valuable insights into aspects of the fabrication process, such as the proper choice of interconnects necessary for ensuring high read accuracies.
-
With expanding data storage capacity needs, DNA as an alternative archival storage medium offers potential advantages, including higher density and data retention for information storage [1,2]. However, the majority of DNA-based memory systems are write-once and read-only, although a few studies have suggested overwriting digital data on existing DNA using chemical modifications of bases [3]. Using those strategies requires constantly updating the entire data coding and iteratively synthesizing the DNA pool. Therefore, considering the complexity and cost, those methods need some amendments to become industrially scalable. Inspired by magnetic tapes [4] and multisession-CD [5], in this work we created a DNA storage system, coined the Molecular File System (MolFS), to organize, store, and edit digital information in a DNA pool. MolFS uses DNA pools that consist of multiple sessions, where each session contains data block and unique index sections to store and edit the files. We used indexes to describe the file system hierarchy, locate files along with the blocks, recognize the sessions, and identify the file versions. This approach reduces the editing cost compared to the state-of-the-art methods: editing or adding data requires only synthesizing a new DNA pool containing the DNA session of the differential file. As proof of concept, we encoded 2.3 Kbytes of graphic and text data into 2 DNA pools. To edit the existing DNA pool, we added 8 new differential data blocks to existing pools, reaching 13.8 Kbytes of data stored from sessions 1 to 5. We performed nanopore sequencing and recovered the data from the MolFS sessions accurately and precisely.
-
Abstract The potential of DNA as an information storage medium is rapidly growing due to advances in DNA synthesis and sequencing. However, the chemical stability of DNA challenges the complete erasure of information encoded in DNA sequences. Here, we encode information in a DNA information solution, a mixture of true message- and false message-encoded oligonucleotides, which enables rapid and permanent erasure of information. True messages are differentiated by their hybridization to a “truth marker” oligonucleotide, and only true messages can be read; binding of the truth marker can be effectively randomized even with a brief exposure to elevated temperature. We show that 8 separate bitmap images can be stably encoded and read after storage at 25 °C for 65 days with an average of over 99% correct information recall, which extrapolates to a half-life of over 15 years at 25 °C. Heating to 95 °C for 5 minutes, however, permanently erases the message.
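The link between the 65-day recall figure and the extrapolated half-life can be checked with a short calculation. The sketch below assumes simple first-order (exponential) loss of correct recall; the abstract does not say which model the authors used, so this is only an illustration, and the function names `half_life_days` and `recall_after` are hypothetical.

```python
import math

DAYS_PER_YEAR = 365.25


def half_life_days(recall_fraction: float, elapsed_days: float) -> float:
    """Half-life implied by a recall fraction, assuming exponential decay."""
    return elapsed_days * math.log(2) / -math.log(recall_fraction)


def recall_after(half_life: float, elapsed_days: float) -> float:
    """Recall fraction left after elapsed_days for a given half-life in days."""
    return 0.5 ** (elapsed_days / half_life)


# Exactly 99% recall after 65 days implies a half-life of about 12 years.
years_at_99_percent = half_life_days(0.99, 65) / DAYS_PER_YEAR

# A 15-year half-life corresponds to roughly 99.2% recall after 65 days.
recall_for_15_years = recall_after(15 * DAYS_PER_YEAR, 65)
```

Under this assumption, a half-life of over 15 years requires recall slightly above 99.18% at 65 days, which is compatible with the reported average of "over 99%".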
