skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Practical Security and Privacy for Database Systems
Computing technology has enabled massive digital traces of our personal lives to be collected and stored. These datasets play an important role in numerous real-life applications and research analysis, such as contact tracing for COVID 19, but they contain sensitive information about individuals. When managing these datasets, privacy is usually addressed as an afterthought, engineered on top of a database system optimized for performance and usability. This has led to a plethora of unexpected privacy attacks in the news. Specialized privacy-preserving solutions usually require a group of privacy experts and they are not directly transferable to other domains. There is an urgent need for a generally trustworthy database system that offers end-to-end security and privacy guarantees. In this tutorial, we will first describe the security and privacy requirements for database systems in different settings and cover the state-of-the-art tools that achieve these requirements. We will also show challenges in integrating these techniques together and demonstrate the design principles and optimization opportunities for these security and privacy-aware database systems.  more » « less
Award ID(s):
2016240
PAR ID:
10343043
Author(s) / Creator(s):
Date Published:
Journal Name:
IGMOD '21: Proceedings of the 2021 International Conference on Management of Data
Page Range / eLocation ID:
2839–2845
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We are storing and querying datasets with the private information of individuals at an unprecedented scale in settings ranging from IoT devices in smart homes to mining enormous collections of click trails for targeted advertising. Here, the privacy of the people described in these datasets is usually addressed as an afterthought, engineered on top of a DBMS optimized for performance. At best, these systems support security or managing access to sensitive data. This status quo has brought us a plethora of data breaches in the news. In response, governments are stepping in to enact privacy regulations such as the EU’s GDPR. We posit that there is an urgent need for trustworthy database system that offer end-to-end privacy guarantees for their records with user interfaces that closely resemble that of a relational database. As we shall see, these guarantees inform everything in the database’s design from how we store data to what query results we make available to untrusted clients. In this position paper we first define trustworthy database systems and put their research challenges in the context of relevant tools and techniques from the security community. We then use this backdrop to walk through the “life of a query” in a trustworthy database system. We start with the query parsing and follow the query’s path as the system plans, optimizes, and executes it. We highlight how we will need to rethink each step to make it efficient, robust, and usable for database clients. 
    more » « less
  2. We show that it is possible to achieve information theoretic location privacy for secondary users (SUs) in database-driven cognitive radio networks (CRNs) with an end-to-end delay less than a second, which is significantly better than that of the existing alternatives offering only a computational privacy. This is achieved based on a keen observation that, by the requirement of Federal Communications Commission (FCC), all certified spectrum databases synchronize their records. Hence, the same copy of spectrum database is available through multiple (distinct) providers. We harness the synergy between multi-server private information retrieval (PIR) and database-driven CRN architecture to offer an optimal level of privacy with high efficiency by exploiting this observation. We demonstrated, analytically and experimentally with deployments on actual cloud systems that, our adaptations of multi-server PIR outperform that of the (currently) fastest single-server PIR by a magnitude of times with information-theoretic security, collusion resiliency, and fault-tolerance features. Our analysis indicates that multi-server PIR is an ideal cryptographic tool to provide location privacy in database-driven CRNs, in which the requirement of replicated databases is a natural part of the system architecture, and therefore SUs can enjoy all advantages of multi-server PIR without any additional architectural and deployment costs. 
    more » « less
  3. Data privacy requirements are a complex and quickly evolving part of the data management domain. Especially in Healthcare (e.g., United States Health Insurance Portability and Accountability Act and Veterans Affairs requirements), there has been a strong emphasis on data privacy and protection. Data storage is governed by multiple sources of policy requirements, including internal policies and legal requirements imposed by external governing organizations. Within a database, a single value can be subject to multiple requirements on how long it must be preserved and when it must be irrecoverably destroyed. This often results in a complex set of overlapping and potentially conflicting policies. Existing storage systems are lacking sufficient support functionality for these critical and evolving rules, making compliance an underdeveloped aspect of data management. As a result, many organizations must implement manual ad-hoc solutions to ensure compliance. As long as organizations depend on manual approaches, there is an increased risk of non-compliance and threat to customer data privacy. In this paper, we detail and implement an automated comprehensive data management compliance framework facilitating retention and purging compliance within a database management system. This framework can be integrated into existing databases without requiring changes to existing business processes. Our proposed implementation uses SQL to set policies and automate compliance. We validate this framework on a Postgres database, and measure the factors that contribute to our reasonable performance overhead (13% in a simulated real-world workload). 
    more » « less
  4. Transactive energy systems (TES) are emerging as a transformative solution for the problems that distribution system operators face due to an increase in the use of distributed energy resources and rapid growth in scalability of managing active distribution system (ADS). On the one hand, these changes pose a decentralized power system control problem, requiring strategic control to maintain reliability and resiliency for the community and for the utility. On the other hand, they require robust financial markets while allowing participation from diverse prosumers. To support the computing and flexibility requirements of TES while preserving privacy and security, distributed software platforms are required. In this paper, we enable the study and analysis of security concerns by developing Transactive Energy Security Simulation Testbed (TESST), a TES testbed for simulating various cyber attacks. In this work, the testbed is used for TES simulation with centralized clearing market, highlighting weaknesses in a centralized system. Additionally, we present a blockchain enabled decentralized market solution supported by distributed computing for TES, which on one hand can alleviate some of the problems that we identify, but on the other hand, may introduce newer issues. Future study of these differing paradigms is necessary and will continue as we develop our security simulation testbed. 
    more » « less
  5. The pervasive use of databases for the storage of critical and sensitive information in many organizations has led to an increase in the rate at which databases are exploited in computer crimes. While there are several techniques and tools available for database forensic analysis, such tools usually assume an apriori database preparation, such as relying on tamper-detection software to already be in place and the use of detailed logging. Further, such tools are built-in and thus can be compromised or corrupted along with the database itself. In practice, investigators need forensic and security audit tools that work on poorlyconfigured systems and make no assumptions about the extent of damage or malicious hacking in a database. In this paper, we present our database forensics methods, which are capable of examining database content from a storage (disk or RAM) image without using any log or file system metadata. We describe how these methods can be used to detect security breaches in an untrusted environment where the security threat arose from a privileged user (or someone who has obtained such privileges). Finally, we argue that a comprehensive and independent audit framework is necessary in order to detect and counteract threats in an environment where the security breach originates from an administrator (either at database or operating system level). 
    more » « less