Abstract While the archival digital memory industry approaches its physical limits, demand is increasing significantly, and alternatives are therefore emerging. Recent efforts have demonstrated DNA’s enormous potential as a digital storage medium with superior information durability, capacity, and energy consumption. However, the majority of the proposed systems require on-demand de novo DNA synthesis techniques that produce a large amount of toxic waste and are therefore neither industrially scalable nor environmentally friendly. Inspired by the architecture of semiconductor memory devices and recent developments in gene editing, we created a molecular digital data storage system called “DNA Mutational Overwriting Storage” (DMOS) that stores information by leveraging combinatorial, addressable, orthogonal, and independent in vitro CRISPR base-editing reactions to write data on a blank pool of greenly synthesized DNA tapes. As a proof of concept, this work illustrates the writing and accurate reading of both a bitmap representation of our school’s logo and the title of this study on the DNA tapes.
Multicomponent molecular memory
Abstract Multicomponent reactions enable the synthesis of large molecular libraries from relatively few inputs. This scalability has led to the broad adoption of these reactions by the pharmaceutical industry. Here, we employ the four-component Ugi reaction to demonstrate that multicomponent reactions can provide a basis for large-scale molecular data storage. Using this combinatorial chemistry we encode more than 1.8 million bits of art historical images, including a Cubist drawing by Picasso. Digital data is written using robotically synthesized libraries of Ugi products, and the files are read back using mass spectrometry. We combine sparse mixture mapping with supervised learning to achieve bit error rates as low as 0.11% for single reads, without library purification. In addition to improved scaling of non-biological molecular data storage, these demonstrations offer an information-centric perspective on the high-throughput synthesis and screening of small-molecule libraries.
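The scaling argument in the abstract can be illustrated with a minimal sketch. It assumes a simplified scheme in which each bit is written as the presence or absence of one four-component product in a mixture; the reagent labels and the `build_library`, `encode`, and `decode` helpers are hypothetical, and the sketch does not reproduce the authors' sparse mixture mapping or their supervised-learning readout.

```python
from itertools import product


def build_library(amines, aldehydes, acids, isocyanides):
    """Enumerate every four-component combination as one library member."""
    return list(product(amines, aldehydes, acids, isocyanides))


def encode(bits, library):
    """Return the set of products present in the mixture.

    Assumption: bit i == 1 means library member i is included.
    """
    if len(bits) > len(library):
        raise ValueError("more bits than library members")
    return {library[i] for i, bit in enumerate(bits) if bit}


def decode(mixture, library, n_bits):
    """Recover bits by checking which library members are present."""
    return [1 if library[i] in mixture else 0 for i in range(n_bits)]


# Hypothetical reagent labels, three candidates per component.
amines = ["amine-1", "amine-2", "amine-3"]
aldehydes = ["aldehyde-1", "aldehyde-2", "aldehyde-3"]
acids = ["acid-1", "acid-2", "acid-3"]
isocyanides = ["isocyanide-1", "isocyanide-2", "isocyanide-3"]

library = build_library(amines, aldehydes, acids, isocyanides)
# 12 input reagents give 3 * 3 * 3 * 3 distinct products.
print(len(library))

bits = [1, 0, 1, 1, 0, 0, 1, 0]
mixture = encode(bits, library)
recovered = decode(mixture, library, len(bits))
```

The point of the sketch is the multiplicative growth: the number of addressable products is the product of the four reagent counts, while the number of inputs grows only as their sum. A real readout by mass spectrometry is noisy, which is where an error-tolerant decoder would replace the exact set lookup used here.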
- Award ID(s): 1941344
- PAR ID: 10153308
- Publisher / Repository: Nature Publishing Group
- Date Published:
- Journal Name: Nature Communications
- Volume: 11
- Issue: 1
- ISSN: 2041-1723
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
DNA is an incredibly dense storage medium for digital data. However, computing on the stored information is expensive and slow, requiring rounds of sequencing, in silico computation, and DNA synthesis. Prior work on accessing and modifying data using DNA hybridization or enzymatic reactions had limited computation capabilities. Inspired by the computational power of “DNA strand displacement,” we augment DNA storage with “in-memory” molecular computation using strand displacement reactions to algorithmically modify data in a parallel manner. We show programs for binary counting and Turing universal cellular automaton Rule 110, the latter of which is, in principle, capable of implementing any computer algorithm. Information is stored in the nicks of DNA, and a secondary sequence-level encoding allows high-throughput sequencing-based readout. We conducted multiple rounds of computation on 4-bit data registers, as well as random access of data (selective access and erasure). We demonstrate that large strand displacement cascades with 244 distinct strand exchanges (sequential and in parallel) can use naturally occurring DNA sequence from M13 bacteriophage without stringent sequence design, which has the potential to improve the scale of computation and decrease cost. Our work merges DNA storage and DNA computing, setting the foundation of entirely molecular algorithms for parallel manipulation of digital information preserved in DNA.
-
Access libraries such as ROOT[1] and HDF5[2] allow users to interact with datasets using high level abstractions, like coordinate systems and associated slicing operations. Unfortunately, the implementations of access libraries are based on outdated assumptions about storage system interfaces and are generally unable to fully benefit from modern fast storage devices. For example, access libraries often implement buffering and data layout that assume that large, single-threaded sequential access patterns cause less overall latency than small parallel random access: while this is true for spinning media, it is not true for flash media. The situation is getting worse with rapidly evolving storage devices such as non-volatile memory and ever larger datasets. This project explores distributed dataset mapping infrastructures that can integrate and scale out existing access libraries using Ceph’s extensible object model, avoiding re-implementation or even modifications of these access libraries as much as possible. These programmable storage extensions coupled with our distributed dataset mapping techniques enable: 1) access library operations to be offloaded to storage system servers, 2) the independent evolution of access libraries and storage systems and 3) full use of the existing load balancing, elasticity, and failure management of distributed storage systems like Ceph. They also create more opportunities to conduct storage server-local optimizations specific to storage servers. For example, storage servers might include local key/value stores combined with chunk stores that require different optimizations than a local file system. As storage servers evolve to support new storage devices like non-volatile memory, these server-local optimizations can be implemented while minimizing disruptions to applications.
We will report progress on the means by which distributed dataset mapping can be abstracted over particular access libraries, including access libraries for ROOT data, and how we address some of the challenges revolving around data partitioning and composability of access operations.
-
Chemical mixtures can be leveraged to store large amounts of data in a highly compact form and have the potential for massive scalability owing to the use of large-scale molecular libraries. With the parallelism that comes from having many species available, chemical-based memory can also provide the physical substrate for computation with increased throughput. Here, we represent non-binary matrices in chemical solutions and perform multiple matrix multiplications and additions, in parallel, using chemical reactions. As a case study, we demonstrate image processing, in which small greyscale images are encoded in chemical mixtures and kernel-based convolutions are performed using phenol acetylation reactions. In these experiments, we use the measured concentrations of reaction products (phenyl acetates) to reconstruct the output image. In addition, we establish the chemical criteria required to realize chemical image processing and validate reaction-based multiplication. Most importantly, this work shows that fundamental arithmetic operations can be reliably carried out with chemical reactions. Our approach could serve as a basis for developing more advanced chemical computing architectures.
-
Oxygen tolerant polymerizations including Photoinduced Electron/Energy Transfer-Reversible Addition–Fragmentation Chain-Transfer (PET-RAFT) polymerization allow for high-throughput synthesis of diverse polymer architectures on the benchtop in parallel. Recent developments have further increased throughput using liquid handling robotics to automate reagent handling and dispensing into well plates thus enabling the combinatorial synthesis of large polymer libraries. Although liquid handling robotics can enable automated polymer reagent dispensing in well plates, photoinitiation and reaction monitoring require automation to provide a platform that enables the reliable and robust synthesis of various polymer compositions in high-throughput where polymers with desired molecular weights and low dispersity are obtained. Here, we describe the development of a robotic platform to fully automate PET-RAFT polymerizations and provide individual control of reactions performed in well plates. On our platform, reagents are automatically dispensed in well plates, photoinitiated in individual wells with a custom-designed lightbox until the polymerizations are complete, and monitored online in real-time by tracking fluorescence intensities on a fluorescence plate reader, with well plate transfers between instruments occurring via a robotic arm. We found that this platform enabled robust parallel polymer synthesis of both acrylate and acrylamide homopolymers and copolymers, with high monomer conversions and low dispersity. The successful polymerizations obtained on this platform make it an efficient tool for combinatorial polymer chemistry. In addition, with the inclusion of machine learning protocols to help navigate the polymer space towards specific properties of interest, this robotic platform can ultimately become a self-driving lab that can dispense, synthesize, and monitor large polymer libraries.
