NSF PAR Search | NSF Public Access Repository


Everything You Always Wanted to Know About Secure and Private Database Systems (but were Afraid to Ask)

Sohn, Donghyun; Li, Xiling; Rogers, Jennie (March 2024, IEEE Data Engineering Bulletin)
Xiao, Xiaokui (Ed.)
Individuals and organizations are accumulating data at an unprecedented rate owing to the advent of inexpensive cloud computing. Data owners are increasingly turning to secure and privacy-preserving collaborative analytics to maximize the value of their records. In this paper, we survey the state of the art in this growing area. We describe how researchers are bringing security and privacy-enhancing technologies, such as differential privacy, secure multiparty computation, and zero-knowledge proofs, into the query lifecycle. We also touch upon some of the challenges and opportunities associated with deploying these technologies in the field.
Full Text Available
ZKSQL: Verifiable and Efficient Query Evaluation with Zero-Knowledge Proofs

https://doi.org/10.14778/3594512.3594513

Li, Xiling; Weng, Chenkai; Xu, Yongxin; Wang, Xiao; Rogers, Jennie (April 2023, Proceedings of the VLDB Endowment)

Individuals and organizations are using databases to store personal information at an unprecedented rate. This creates a quandary for data providers. On the one hand, they are responsible for protecting the privacy of individuals described in their database. On the other hand, data providers are sometimes required to provide statistics about their data, rather than sharing it wholesale, with strong assurances that these answers are correct and complete, such as in regulatory filings for the US SEC and other government organizations. We introduce a system, ZKSQL, that provides authenticated answers to ad-hoc SQL queries with zero-knowledge proofs. Its proofs show that the answers are correct and sound with respect to the database's contents, and they do not divulge any information about its input records. This system constructs proofs over the steps in a query's evaluation, and it accelerates this process with authenticated set operations. We validate the efficiency of this approach over a suite of TPC-H queries, and our results show that ZKSQL achieves two orders of magnitude speedup over the baseline.
Full Text Available
Visualizing Privacy-Utility Trade-Offs in Differentially Private Data Releases

https://doi.org/10.2478/popets-2022-0058

Nanayakkara, Priyanka; Bater, Johes; He, Xi; Hullman, Jessica; Rogers, Jennie (March 2022, Proceedings on Privacy Enhancing Technologies)

Organizations often collect private data and release aggregate statistics for the public’s benefit. If no steps toward preserving privacy are taken, adversaries may use released statistics to deduce unauthorized information about the individuals described in the private dataset. Differentially private algorithms address this challenge by slightly perturbing underlying statistics with noise, thereby mathematically limiting the amount of information that may be deduced from each data release. Properly calibrating these algorithms, and in turn the disclosure risk for people described in the dataset, requires a data curator to choose a value for a privacy budget parameter, ɛ. However, there is little formal guidance for choosing ɛ, a task that requires reasoning about the probabilistic privacy–utility trade-off. Furthermore, choosing ɛ in the context of statistical inference requires reasoning about accuracy trade-offs in the presence of both measurement error and differential privacy (DP) noise. We present Visualizing Privacy (ViP), an interactive interface that visualizes relationships between ɛ, accuracy, and disclosure risk to support setting and splitting ɛ among queries. As a user adjusts ɛ, ViP dynamically updates visualizations depicting expected accuracy and risk. ViP also has an inference setting, allowing a user to reason about the impact of DP noise on statistical inferences. Finally, we present results of a study where 16 research practitioners with little to no DP background completed a set of tasks related to setting ɛ using both ViP and a control. We find that ViP helps participants more correctly answer questions related to judging the probability of where a DP-noised release is likely to fall and comparing between DP-noised and non-private confidence intervals.
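The trade-off this abstract describes can be illustrated with a minimal sketch. It assumes the textbook Laplace mechanism on a counting query with sensitivity 1 and an even split of the budget across queries; the abstract does not say which mechanisms ViP supports, and the function names below are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(true_count, epsilon, rng, sensitivity=1.0):
    # Assumed mechanism: Laplace noise with scale = sensitivity / epsilon.
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def split_budget(total_epsilon, num_queries):
    # Simplest possible split (an assumption): equal shares per query.
    return [total_epsilon / num_queries] * num_queries


def mean_abs_error(epsilon, trials=2000, seed=0):
    # Empirical accuracy of a noisy count at a given epsilon.
    rng = random.Random(seed)
    true_count = 100
    total = sum(abs(dp_count(true_count, epsilon, rng) - true_count)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for eps in (0.1, 0.5, 1.0):
        print(f"epsilon={eps}: mean |error| ~ {mean_abs_error(eps):.2f}")
```

Under these assumptions, a smaller ɛ gives a larger noise scale and therefore a larger expected error, and splitting a fixed ɛ across more queries leaves each query with a smaller share.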
Full Text Available
SAQE: practical privacy-preserving approximate query processing for data federations

https://doi.org/10.14778/3407790.3407854

Bater, Johes; Park, Yongjoo; He, Xi; Wang, Xiao; Rogers, Jennie (August 2020, Proceedings of the VLDB Endowment)
Full Text Available
Privacy Changes Everything

https://doi.org/10.1007/978-3-030-33752-0_7

Rogers, J.; Bater, J. (October 2019, Lecture Notes in Computer Science)

We are storing and querying datasets with the private information of individuals at an unprecedented scale in settings ranging from IoT devices in smart homes to mining enormous collections of click trails for targeted advertising. Here, the privacy of the people described in these datasets is usually addressed as an afterthought, engineered on top of a DBMS optimized for performance. At best, these systems support security or managing access to sensitive data. This status quo has brought us a plethora of data breaches in the news. In response, governments are stepping in to enact privacy regulations such as the EU’s GDPR. We posit that there is an urgent need for trustworthy database systems that offer end-to-end privacy guarantees for their records with user interfaces that closely resemble that of a relational database. As we shall see, these guarantees inform everything in the database’s design, from how we store data to what query results we make available to untrusted clients. In this position paper, we first define trustworthy database systems and put their research challenges in the context of relevant tools and techniques from the security community. We then use this backdrop to walk through the “life of a query” in a trustworthy database system. We start with query parsing and follow the query’s path as the system plans, optimizes, and executes it. We highlight how we will need to rethink each step to make it efficient, robust, and usable for database clients.
Full Text Available
Shrinkwrap: efficient SQL query processing in differentially private data federations

https://doi.org/10.14778/3291264.3291274

Bater, Johes; He, Xi; Ehrich, William; Machanavajjhala, Ashwin; Rogers, Jennie (November 2018, Proceedings of the VLDB Endowment)
A private data federation is a set of autonomous databases that share a unified query interface offering in-situ evaluation of SQL queries over the union of the sensitive data of its members. Owing to privacy concerns, these systems do not have a trusted data collector that can see all their data and their member databases cannot learn about individual records of other engines. Federations currently achieve this goal by evaluating queries obliviously using secure multiparty computation. This hides the intermediate result cardinality of each query operator by exhaustively padding it. With cascades of such operators, this padding accumulates to a blow-up in the output size of each operator and a proportional loss in query performance. Hence, existing private data federations do not scale well to complex SQL queries over large datasets. We introduce Shrinkwrap, a private data federation that offers data owners a differentially private view of the data held by others to improve their performance over oblivious query processing. Shrinkwrap uses computational differential privacy to minimize the padding of intermediate query results, achieving up to a 35X performance improvement over oblivious query processing. When the query needs differentially private output, Shrinkwrap provides a trade-off between result accuracy and query evaluation performance.
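The padding blow-up described in this abstract can be made concrete with a small arithmetic sketch. It assumes a left-deep chain of joins in which each oblivious join pads its output to the cross-product of its input sizes (a worst-case bound chosen here for illustration); the table sizes, true cardinalities, and function names are hypothetical, and the sketch does not model Shrinkwrap's own resizing mechanism.

```python
def exhaustive_join_sizes(input_sizes):
    # Padded output size after each join in a left-deep chain, assuming
    # every join is padded to the product of its two input sizes.
    sizes = []
    current = input_sizes[0]
    for n in input_sizes[1:]:
        current *= n
        sizes.append(current)
    return sizes


def padding_overhead(padded_sizes, true_sizes):
    # Ratio of padded rows to real rows at each operator in the chain.
    return [p / t for p, t in zip(padded_sizes, true_sizes)]


if __name__ == "__main__":
    # Hypothetical federation: three tables of 100 rows each.
    padded = exhaustive_join_sizes([100, 100, 100])
    # Hypothetical true intermediate cardinalities for the same two joins.
    actual = [200, 50]
    print(padded)
    print(padding_overhead(padded, actual))
```

Under these assumptions, the second join processes a million padded rows to produce a handful of real ones, which is the kind of overhead that trimming intermediate results aims to reduce.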
Full Text Available
