skip to main content


Search for: All records

Award ID contains: 1640374

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. When collecting information, local differential privacy (LDP) alleviates privacy concerns of users because their private information is randomized before being sent it to the central aggregator. LDP imposes large amount of noise as each user executes the randomization independently. To address this issue, recent work introduced an intermediate server with the assumption that this intermediate server does not collude with the aggregator. Under this assumption, less noise can be added to achieve the same privacy guarantee as LDP, thus improving utility for the data collection task. This paper investigates this multiple-party setting of LDP. We analyze the system model and identify potential adversaries. We then make two improvements: a new algorithm that achieves a better privacy-utility tradeoff; and a novel protocol that provides better protection against various attacks. Finally, we perform experiments to compare different methods and demonstrate the benefits of using our proposed method. 
    more » « less
  2. When collecting information, local differential privacy (LDP) relieves the concern of privacy leakage from users' perspective, as user's private information is randomized before sent to the aggregator. We study the problem of recovering the distribution over a numerical domain while satisfying LDP. While one can discretize a numerical domain and then apply the protocols developed for categorical domains, we show that taking advantage of the numerical nature of the domain results in better trade-off of privacy and utility. We introduce a new reporting mechanism, called the square wave (SW) mechanism, which exploits the numerical nature in reporting. We also develop an Expectation Maximization with Smoothing (EMS) algorithm, which is applied to aggregated histograms from the SW mechanism to estimate the original distributions. Extensive experiments demonstrate that our proposed approach, SW with EMS, consistently outperforms other methods in a variety of utility metrics. 
    more » « less
  3. Local Differential Privacy (LDP) protects user privacy from the data collector. LDP protocols have been increasingly deployed in the industry. A basic building block is frequency oracle (FO) protocols, which estimate frequencies of values. While several FO protocols have been proposed, the design goal does not lead to optimal results for answering many queries. In this paper, we show that adding post-processing steps to FO protocols by exploiting the knowledge that all individual frequencies should be non-negative and they sum up to one can lead to significantly better accuracy for a wide range of tasks, including frequencies of individual values, frequencies of the most frequent values, and frequencies of subsets of values. We consider 10 different methods that exploit this knowledge differently. We establish theoretical relationships between some of them and conducted extensive experimental evaluations to understand which methods should be used for different query tasks. 
    more » « less
  4. Multi-dimensional analytical (MDA) queries are often issued against a fact table with predicates on (categorical or ordinal) dimensions and aggregations on one or more measures. In this paper, we study the problem of answering MDA queries under local differential privacy (LDP). In the absence of a trusted agent, sensitive dimensions are encoded in a privacy preserving (LDP) way locally before being sent to the data collector. The data collector estimates the answers to MDA queries, based on the encoded dimensions. We propose several LDP encoders and estimation algorithms, to handle a large class of MDA queries with different types of predicates and aggregation functions. Our techniques are able to answer these queries with tight error bounds and scale well in high-dimensional settings (i.e., error is polylogarithmic in dimension sizes). We conduct experiments on real and synthetic data to verify our theoretical results, and compare our solution with marginal-estimation based solutions. 
    more » « less
  5. Marginal tables are the workhorse of capturing the correlations among a set of attributes. We consider the problem of constructing marginal tables given a set of user’s multi-dimensional data while satisfying Local Differential Privacy (LDP), a privacy notion that protects individual user’s privacy without relying on a trusted third party. Existing works on this problem perform poorly in the high-dimensional setting; even worse, some incur very expensive computational overhead. In this paper, we propose CALM, Consistent Adaptive Local Marginal, that takes advantage of the careful challenge analysis and performs consistently better than existing methods. More importantly, CALM can scale well with large data dimensions and marginal sizes. We conduct extensive experiments on several real world datasets. Experimental results demonstrate the effectiveness and efficiency of CALM over existing methods. 
    more » « less
  6. The notion of Local Differential Privacy (LDP) enables users to respond to sensitive questions while preserving their privacy. The basic LDP frequent oracle (FO) protocol enables an aggregator to estimate the frequency of any value. But when each user has a set of values, one needs an additional padding and sampling step to find the frequent values and estimate their frequencies. In this paper, we formally define such padding and sample based frequency oracles (PSFO). We further identify the privacy amplification property in PSFO. As a result, we propose SVIM, a protocol for finding frequent items in the set-valued LDP setting. Experiments show that under the same privacy guarantee and computational cost, SVIM significantly improves over existing methods. With SVIM to find frequent items, we propose SVSM to effectively find frequent itemsets, which to our knowledge has not been done before in the LDP setting. 
    more » « less
  7. Protocols satisfying Local Differential Privacy (LDP) enable parties to collect aggregate information about a population while protecting each user’s privacy, without relying on a trusted third party. LDP protocols (such as Google’s RAPPOR) have been deployed in real-world scenarios. In these protocols, a user encodes his private information and perturbs the encoded value locally before sending it to an aggregator, who combines values that users contribute to infer statistics about the population. In this paper, we introduce a framework that generalizes several LDP protocols proposed in the literature. Our framework yields a simple and fast aggregation algorithm, whose accuracy can be precisely analyzed. Our in-depth analysis enables us to choose optimal parameters, resulting in two new protocols (i.e., Optimized Unary Encoding and Optimized Local Hashing) that provide better utility than protocols previously proposed. We present precise conditions for when each proposed protocol should be used, and perform experiments that demonstrate the advantage of our proposed protocols. 
    more » « less