skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Optimally hiding object sizes with constrained padding
Among the most challenging traffic-analysis attacks to confound are those leveraging the sizes of objects downloaded over the network. In this paper we systematically analyze this problem under realistic constraints regarding the padding overhead that the object store is willing to incur. We give algorithms to compute privacy-optimal padding schemes—specifically that minimize the network observer’s information gain from a downloaded object’s padded size—in several scenarios of interest: per-object padding, in which the object store responds to each request for an object with the same padded copy; per-request padding, in which the object store pads an object anew each time it serves that object; and a scenario unlike the previous ones in that the object store is unable to leverage a known distribution over the object queries. We provide constructions for privacy-optimal padding in each case, compare them to recent contenders in the research literature, and evaluate their performance on practical datasets.  more » « less
Award ID(s):
2207214
PAR ID:
10438806
Author(s) / Creator(s):
;
Date Published:
Journal Name:
IEEE Computer Security Foundations Symposium
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The sizes of objects retrieved over the network are powerful indicators of the objects retrieved and are ingredients in numerous types of traffic analysis, such as webpage fingerprinting. We present an algorithm by which a benevolent object store computes a memoryless padding scheme to pad objects before sending them, in a way that bounds the information gain that the padded sizes provide to the network observer about the objects being retrieved. Moreover, our algorithm innovates over previous works in two critical ways. First, the computed padding scheme satisfies constraints on the padding overhead: no object is padded to more than c× its original size, for a tunable factor c > 1. Second, the privacy guarantees of the padding scheme allow for object retrievals that are not independent, as could be caused by hyperlinking. We show in empirical tests that our padding schemes improve dramatically over previous schemes for padding dependent object retrievals, providing better privacy at substantially lower padding overhead, and over known techniques for padding independent object retrievals subject to padding overhead constraints. 
    more » « less
  2. Among the most challenging traffic-analysis attacks to confound are those leveraging the sizes of objects downloaded over the network, as the size of an object is often a powerful indicator of its identity. In this dissertation, we consider this challenge in both (i) the simplified setting where successive object retrievals are assumed to be independent and (ii) the setting where sequential object retrievals are dependent on one another. Furthermore, within the dependent retrievals setting, we address the scenario where enumerating all possible sequences is impractical. For each setting, we present algorithms by which a benevolent object store computes a memoryless padding scheme to pad objects before sending them, in a way that bounds the information gain that the padded sizes provide to the network observer about the objects being retrieved. Furthermore, all of our algorithms ensure that no object is padded to more than c× its original size, for a tunable factor c > 1. We compare each algorithm to recent contenders in the research literature and evaluate their performance on practical datasets. 
    more » « less
  3. The notion of Local Differential Privacy (LDP) enables users to respond to sensitive questions while preserving their privacy. The basic LDP frequent oracle (FO) protocol enables an aggregator to estimate the frequency of any value. But when each user has a set of values, one needs an additional padding and sampling step to find the frequent values and estimate their frequencies. In this paper, we formally define such padding and sample based frequency oracles (PSFO). We further identify the privacy amplification property in PSFO. As a result, we propose SVIM, a protocol for finding frequent items in the set-valued LDP setting. Experiments show that under the same privacy guarantee and computational cost, SVIM significantly improves over existing methods. With SVIM to find frequent items, we propose SVSM to effectively find frequent itemsets, which to our knowledge has not been done before in the LDP setting. 
    more » « less
  4. This paper formulates the cache-aided multi-user Private Information Retrieval (MuPIR) problem, including K u cache-equipped users, each of which wishes to retrieve a desired message efficiently from N distributed databases with access to K independent messages. Privacy of the users’ demands requires that any individual database can not learn anything about the demands of the users. The load of this problem is defined as the average number of downloaded bits per desired message bit. The goal is to find the optimal memory-load trade-off while preserving the demand privacy. Besides the formulation of the MuPIR problem, the contribution of this paper is two-fold. First, we characterize the optimal memory-load trade-off for a system with N = 2 databases, K = 2 messages and K u = 2 users demanding distinct messages; Second, a product design with order optimality guarantee is proposed. In addition, the product design can achieve the optimal load when the cache memory is large enough. The product design embeds the well-known Sun-Jafar PIR scheme into coded caching, in order to benefit from the coded caching gain while preserving the privacy of the users’ demands. 
    more » « less
  5. null (Ed.)
    A private data federation is a set of autonomous databases that share a unified query interface offering in-situ evaluation of SQL queries over the union of the sensitive data of its members. Owing to privacy concerns, these systems do not have a trusted data collector that can see all their data and their member databases cannot learn about individual records of other engines. Federations currently achieve this goal by evaluating queries obliviously using secure multiparty computation. This hides the intermediate result cardinality of each query operator by exhaustively padding it. With cascades of such operators, this padding accumulates to a blow-up in the output size of each operator and a proportional loss in query performance. Hence, existing private data federations do not scale well to complex SQL queries over large datasets. We introduce Shrinkwrap, a private data federation that offers data owners a differentially private view of the data held by others to improve their performance over oblivious query processing. Shrinkwrap uses computational differential privacy to minimize the padding of intermediate query results, achieving up to a 35X performance improvement over oblivious query processing. When the query needs differentially private output, Shrinkwrap provides a trade-off between result accuracy and query evaluation performance. 
    more » « less