skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 10:00 PM to 12:00 PM ET on Tuesday, March 25 due to maintenance. We apologize for the inconvenience.


Title: Lethe: Secure Deletion by Addition
Modern data privacy regulations such as GDPR, CCPA, and CDPA stipulate that data pertaining to a user must be deleted without undue delay upon the user’s request. Existing systems are not designed to comply with these regulations and can leave traces of deleted data for indeterminate periods of time, often as long as months. We developed Lethe to address these problems by providing fine-grained secure deletion on any system and any storage medium, provided that Lethe has access to a fixed, small amount of securely-deletable storage. Lethe achieves this using keyed hash forests (KHFs), extensions of keyed hash trees (KHTs), structured to serve as efficient representations of encryption key hierarchies. By using a KHF as a regulator for data access, Lethe provides its secure deletion not by removing the KHF, but by adding a new KHF that only grants access to still-valid data. Access to the previous KHF is lost, and the data it regulated securely deleted, through the secure deletion of the single key that protected the previous KHF.  more » « less
Award ID(s):
2106263 2106259 1841545
PAR ID:
10435003
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
3rd Workshop on Challenges and Opportunities of Efficient and Performant Storage Systems (CHEOPS ’23)
Page Range / eLocation ID:
1 to 8
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Flash memory has been used extensively as external storage of smartphones, tablets, IoT devices, laptops, etc. Therefore, more and more sensitive or even mission critical data are stored in flash and, once the data turn obsolete, securely deleting them is necessary for both regulation compliance and privacy protection. Traditional secure deletion on flash memory mainly focuses on sanitizing data. However, unique nature of flash memory may cause various data ``remnants'' and, even though the data are removed, the remnants may be utilized by the adversary to recover the deleted data, compromising the secure deletion guarantee. Based on both theoretic analysis and experiments using real-world workloads, we have identified one common type of remnants in the flash memory, namely duplicates, which are caused by unique internal functions of flash storage media including garbage collection, wear leveling, bad block management. We propose RedFlash, a novel secure deletion scheme which can efficiently Remove both the data and the corresponding duplicates towards secure deletion on Flash memory. Security analysis and experimental evaluation show that RedFlash can ensure the secure deletion guarantee, at the cost of a small performance degradation, compared to a regular (non-secure) flash controller. 
    more » « less
  2. Data-intensive applications have fueled the evolution oflog-structured merge (LSM)based key-value engines that employ theout-of-placeparadigm to support high ingestion rates with low read/write interference. These benefits, however, come at the cost oftreating deletes as second-class citizens. A delete operation inserts atombstonethat invalidates older instances of the deleted key. State-of-the-art LSM-engines do not provide guarantees as to how fast a tombstone will propagate topersist the deletion. Further, LSM-engines only support deletion on the sort key. To delete on another attribute (e.g., timestamp), the entire tree is read and re-written, leading to undesired latency spikes and increasing the overall operational cost of a database. Efficient and persistent deletion is key to support: (i) streaming systems operating on a window of data, (ii) privacy with latency guarantees on data deletion, and (iii)en massecloud deployment of data systems. Further, we document that LSM-based key-value engines perform suboptimally in the presence of deletes in a workload. Tombstone-driven logical deletes, by design, are unable to purge the deleted entries in a timely manner, and retaining the invalidated entries perpetually affects the overall performance of LSM-engines in terms of space amplification, write amplification, and read performance. Moreover, the potentially unbounded latency for persistent deletes brings in critical privacy concerns in light of the data privacy protection regulations, such as theright to be forgottenin EU’s GDPR, theright to deletein California’s CCPA and CPRA, anddeletion rightin Virginia’s VCDPA. Toward this, we introduce the delete design space for LSM-trees and highlight the performance implications of the different classes of delete operations. To address these challenges, in this article, we build a new key-value storage engine,Lethe+, that uses a very small amount of additional metadata, a set of new delete-aware compaction policies, and a new physical data layout that weaves the sort and the delete key order. We show thatLethe+supports any user-defined threshold for the delete persistence latency offeringhigher read throughput(1.17× -1.4×) andlower space amplification(2.1× -9.8×), with a modest increase in write amplification (between 4% and 25%) that can be further amortized to less than 1%. In addition,Lethe+supports efficient range deletes on asecondary delete keyby dropping entire data pages without sacrificing read performance or employing a costly full tree merge. 
    more » « less
  3. This paper presents a novel framework for creating a recoverable rare disease patient identity system using blockchain and smart contracts, decentralized identifiers (DIDs), and the InterPlanetary File System (IPFS). Smart contracts are executable code that can be written into decentralized storage such as blockchains in order to enable tamper-proof transactions of data. DIDs provide a secure, decentralized, and extensible way to create, store, and manage digital identities, while IPFS provides a distributed, immutable, and secure storage system for patient identities. Utilizing these technologies with smart contracts, we created a framework to store persistent medical records of patients. Smart contracts additionally allow account recovery without the use of any centralized authority. The framework enables healthcare providers to securely access a patient's data while maintaining the patient's ownership of their data. The paper explores the advantages of using a decentralized identity system and highlights the potential of this approach to improve the security and universality of medical records for patients with rare diseases. 
    more » « less
  4. null (Ed.)
    Data-intensive applications fueled the evolution of log structured merge (LSM) based key-value engines that employ the out-of-place paradigm to support high ingestion rates with low read/write interference. These benefits, however, come at the cost of treating deletes as a second-class citizen. A delete inserts a tombstone that invalidates older instances of the deleted key. State-of-the-art LSM engines do not provide guarantees as to how fast a tombstone will propagate to persist the deletion. Further, LSM engines only support deletion on the sort key. To delete on another attribute (e.g., timestamp), the entire tree is read and re-written. We highlight that fast persistent deletion without affecting read performance is key to support: (i) streaming systems operating on a window of data, (ii) privacy with latency guarantees on the right-to-be-forgotten, and (iii) en masse cloud deployment of data systems that makes storage a precious resource. To address these challenges, in this paper, we build a new key-value storage engine, Lethe, that uses a very small amount of additional metadata, a set of new delete-aware compaction policies, and a new physical data layout that weaves the sort and the delete key order. We show that Lethe supports any user-defined threshold for the delete persistence latency offering higher read throughput (1.17-1.4x) and lower space amplification (2.1-9.8x), with a modest increase in write amplification (between 4% and 25%). In addition, Lethe supports efficient range deletes on a secondary delete key by dropping entire data pages without sacrificing read performance nor employing a costly full tree merge. 
    more » « less
  5. Due to the intrinsic properties of Solid-State Drives (SSDs), invalid data remain in SSDs before erased by a garbage collection process, which increases the risk of being attacked by adversaries. Previous studies use erase and cryptography based schemes to purposely delete target data but face extremely large overhead. In this paper, we propose a Workload-Aware Secure Deletion scheme, called WAS-Deletion, to reduce the overhead of secure deletion by three major components. First, the WASDeletion scheme efficiently splits invalid and valid data into different blocks based on workload characteristics. Second, the WAS-Deletion scheme uses a new encryption allocation scheme, making the encryption follow the same direction as the write on multiple blocks and vertically encrypts pages with the same key in one block. Finally, a new adaptive scheduling scheme can dynamically change the configurations of different regions to further reduce secure deletion overhead based on the current workload. The experimental results indicate that the newly proposed WAS-Deletion scheme can reduce the secure deletion cost by about 1.2x to 12.9x compared to previous studies. 
    more » « less