A new approach to organize, edit and add data for DNA-based data storage

Jorge Eduardo Guerrero; Afsaneh Sadremomtaz; Reza Zadegan

As demand for data storage capacity grows, DNA offers potential advantages as an archival storage medium, including higher density and longer data retention [1,2]. However, most DNA-based memory systems are write-once and read-only, although a few studies have suggested overwriting digital data on existing DNA through chemical modification of bases [3]. Those strategies require repeatedly updating the entire data encoding and re-synthesizing the DNA pool. Given their complexity and cost, they need further refinement to become industrially scalable. Inspired by magnetic tapes [4] and multisession CDs [5], in this work we created a DNA storage system, termed the Molecular File System (MolFS), to organize, store, and edit digital information in a DNA pool. MolFS uses DNA pools that consist of multiple sessions, where each session contains data blocks and unique index sections used to store and edit files. The indexes describe the file system hierarchy, locate files and their blocks, recognize the sessions, and identify file versions. This approach reduces the editing cost compared to state-of-the-art methods: editing or adding data requires only synthesizing a new DNA pool containing the DNA session of the differential file. As a proof of concept, we encoded 2.3 KB of graphic and text data into 2 DNA pools. To edit the existing DNA pool, we added 8 new differential data blocks to existing pools, reaching 13.8 KB of data stored across sessions 1 to 5. We performed nanopore sequencing and accurately recovered the data from the MolFS sessions.
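The session-and-index idea described above can be illustrated with a small in-memory sketch. This is not the authors' implementation and it does not model DNA encoding, synthesis, or sequencing. The names `Session`, `ToyPool`, `BLOCK_SIZE`, and the index layout are all hypothetical, chosen only to show how appending a session that holds just the changed blocks lets a reader reconstruct any file version by replaying the indexes in order.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # bytes per data block; tiny and arbitrary, for illustration only


@dataclass
class Session:
    """One write session: an index section plus the data blocks it adds."""
    session_id: int
    index: dict = field(default_factory=dict)   # path -> {block_no: block_key}
    sizes: dict = field(default_factory=dict)   # path -> file length in bytes
    blocks: dict = field(default_factory=dict)  # block_key -> bytes


class ToyPool:
    """Append-only list of sessions (a stand-in for a multi-session pool)."""

    def __init__(self):
        self.sessions = []

    def exists(self, path):
        return any(path in s.index for s in self.sessions)

    def write(self, path, data):
        """Append a session holding only the blocks that differ from the
        latest stored version. Returns the number of new blocks."""
        old = self.read(path) if self.exists(path) else b""
        s = Session(session_id=len(self.sessions) + 1)
        s.index[path] = {}
        s.sizes[path] = len(data)
        for off in range(0, len(data), BLOCK_SIZE):
            chunk = data[off:off + BLOCK_SIZE]
            if chunk != old[off:off + BLOCK_SIZE]:
                block_no = off // BLOCK_SIZE
                key = (s.session_id, path, block_no)
                s.blocks[key] = chunk
                s.index[path][block_no] = key
        self.sessions.append(s)  # earlier sessions are never modified
        return len(s.blocks)

    def read(self, path, up_to=None):
        """Replay session indexes in order; later sessions override earlier
        blocks. `up_to` selects an older version by session number."""
        latest, size = {}, None
        for s in self.sessions:
            if up_to is not None and s.session_id > up_to:
                break
            if path in s.index:
                for block_no, key in s.index[path].items():
                    latest[block_no] = s.blocks[key]
                size = s.sizes[path]
        if size is None:
            raise KeyError(path)
        n_blocks = -(-size // BLOCK_SIZE)  # ceiling division
        return b"".join(latest[i] for i in range(n_blocks))[:size]


pool = ToyPool()
pool.write("/notes.txt", b"hello world!")  # first session: every block is new
pool.write("/notes.txt", b"hello WORLD!")  # second session: changed blocks only
print(pool.read("/notes.txt"))             # latest version
print(pool.read("/notes.txt", up_to=1))    # version as of session 1
```

In this toy model, the cost of an edit scales with the number of changed blocks rather than with the file size, which mirrors the motivation for synthesizing only the session of the differential file.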
