This content will become publicly available on August 22, 2025
- Award ID(s): 1901324
- PAR ID: 10549874
- Publisher / Repository: Springer
- Date Published:
- Journal Name: Nature Nanotechnology
- ISSN: 1748-3387
- Subject(s) / Keyword(s): DNA storage, molecular information, transcription, data, dendricolloid, computer, computation
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract The potential of DNA as an information storage medium is rapidly growing due to advances in DNA synthesis and sequencing. However, the chemical stability of DNA challenges the complete erasure of information encoded in DNA sequences. Here, we encode information in a DNA information solution, a mixture of true message- and false message-encoded oligonucleotides, which enables rapid and permanent erasure of information. True messages are differentiated by their hybridization to a "truth marker" oligonucleotide, and only true messages can be read; binding of the truth marker can be effectively randomized even with a brief exposure to elevated temperature. We show that 8 separate bitmap images can be stably encoded and read after storage at 25 °C for 65 days with an average of over 99% correct information recall, which extrapolates to a half-life of over 15 years at 25 °C. Heating to 95 °C for 5 minutes, however, permanently erases the message.
-
Abstract Escherichia coli SSB (EcSSB) is a model single-stranded DNA (ssDNA) binding protein critical in genome maintenance. EcSSB forms homotetramers that wrap ssDNA in multiple conformations to facilitate DNA replication and repair. Here we measure the binding and wrapping of many EcSSB proteins to a single long ssDNA substrate held at fixed tensions. We show EcSSB binds in a biphasic manner, where initial wrapping events are followed by unwrapping events as ssDNA-bound protein density passes critical saturation and high free protein concentration increases the fraction of EcSSBs in less-wrapped conformations. By destabilizing EcSSB wrapping through increased substrate tension, decreased substrate length, and protein mutation, we also directly observe an unstable bound but unwrapped state in which ∼8 nucleotides of ssDNA are bound by a single domain, which could act as a transition state through which rapid reorganization of the EcSSB–ssDNA complex occurs. When ssDNA is over-saturated, stimulated dissociation rapidly removes excess EcSSB, leaving an array of stably-wrapped complexes. These results provide a mechanism through which otherwise stably bound and wrapped EcSSB tetramers are rapidly removed from ssDNA to allow for DNA maintenance and replication functions, while still fully protecting ssDNA over a wide range of protein concentrations.
-
Many scientific applications operate on data sets that span hundreds of gigabytes or even terabytes in size. Large data sets often use compression to reduce the size of the files. Yet as of today, parallel I/O libraries do not support reading and writing compressed files, necessitating either expensive sequential compression/decompression operations before/after the simulation, or omitting advanced features of parallel I/O libraries, such as collective I/O operations. This paper introduces parallel I/O on compressed data files, discusses the key challenges, requirements, and solutions for supporting compressed data files in MPI I/O, as well as limitations on some MPI I/O operations when using compressed data files. The paper details handling of individual read and write operations of compressed data files, and presents an extension to the two-phase collective I/O algorithm to support data compression. The paper further presents and evaluates an implementation based on the Snappy compression library and the OMPIO parallel I/O framework. The performance evaluation using multiple data sets demonstrates significant performance benefits when using data compression on a parallel BeeGFS file system.
-
Abstract The storage of data in DNA typically involves encoding and synthesizing data into short oligonucleotides, followed by reading with a sequencing instrument. Major challenges include the molecular consumption of synthesized DNA, basecalling errors, and limitations with scaling up read operations for individual data elements. Addressing these challenges, we describe a DNA storage system called MDRAM (Magnetic DNA-based Random Access Memory) that enables repetitive and efficient readouts of targeted files with nanopore-based sequencing. By conjugating synthesized DNA to magnetic agarose beads, we enabled repeated data readouts while preserving the original DNA analyte and maintaining data readout quality. MDRAM utilizes an efficient convolutional coding scheme that leverages soft information in raw nanopore sequencing signals to achieve information reading costs comparable to Illumina sequencing despite higher error rates. Finally, we demonstrate a proof-of-concept DNA-based proto-filesystem that enables an exponentially-scalable data address space using only small numbers of targeting primers for assembly and readout.
-
Abstract Land use change (LUC) alters the global carbon (C) stock, but our estimation of the alteration remains uncertain and is a major impediment to predicting the global C cycle. The uncertainty is partly due to the limited number and geographical bias of observations, and limited exploration of its predictors. Here we generated a comprehensive global database of 5,980 observations from 790 articles. The number of sites evaluated is at least seven times larger than in previous meta‐analyses. Our constrained estimates of different LUC's effects on soil organic C (SOC) and their variations across global climates reveal underestimation/overestimation in previous estimates. Converting forests and grasslands to croplands reduced SOC by 24.5% ± 1.53% (−11.03 ± 1.06 Mg ha−1) and 22.7% ± 1.22% (−8.09 ± 0.67 Mg ha−1), while 28.0% ± 1.56% (4.46 ± 0.42 Mg ha−1) and 33.5% ± 1.68% (5.8 ± 0.38 Mg ha−1) increases, respectively, were obtained in the reverse processes. Converting forests to grasslands decreased SOC by 2.1% ± 1.22% (−1.13 ± 0.44 Mg ha−1), while the reverse process increased SOC by 18.6% ± 1.73% (3.31 ± 0.51 Mg ha−1). Modeled relative importance of 10 drivers of LUC's impact on SOC revealed that higher initial SOC (iSOC) does not solely determine SOC loss in SOC‐negative LUC scenarios as previously proposed. Across four decades, reconverting croplands to forests and grasslands recovered only 49.5% (6.1 ± 0.51 Mg ha−1) and 75.3% (7.0 ± 0.38 Mg ha−1) of the iSOC, respectively, indicating the need for protecting C‐rich ecosystems. Our global data set advances information on LUC's effect on SOC and can be valuable to constrain Earth system models to reliably estimate global SOC stocks and plan climate change mitigation strategies.