Title: Triangle Counting with Local Edge Differential Privacy
Many deployments of differential privacy in industry are in the local model, where each party releases its private information via a differentially private randomizer. We study triangle counting in the noninteractive and interactive local model with edge differential privacy (which, intuitively, requires that the outputs of the algorithm on graphs that differ in one edge be indistinguishable). In this model, each party's local view consists of the adjacency list of one vertex. In the noninteractive model, we prove that additive Ω(n²) error is necessary, where n is the number of nodes. This lower bound is our main technical contribution. It uses a reconstruction attack with a new class of linear queries and a novel mix-and-match strategy of running the local randomizers with different completions of their adjacency lists. It matches the additive error of the algorithm based on Randomized Response, proposed by Imola, Murakami and Chaudhuri (USENIX 2021) and analyzed by Imola, Murakami and Chaudhuri (CCS 2022) for constant ε. We use a different postprocessing of Randomized Response and provide tight bounds on the variance of the resulting algorithm. In the interactive setting, we prove a lower bound of Ω(n^{3/2}) on the additive error. Previously, no hardness results were known for interactive, edge-private algorithms in the local model, except for those that follow trivially from the results for the central model. Our work significantly improves on the state of the art in differentially private graph analysis in the local model.
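Below is a minimal Python sketch of the Randomized Response baseline discussed in the abstract, under illustrative assumptions: the graph is a 0/1 adjacency matrix held row-by-row, each potential edge is read from its lower-indexed endpoint's report, and the per-entry debiasing is the standard one. The helper names are hypothetical, and this is not the paper's own postprocessing, whose tight variance bounds are part of the contribution.

```python
import math
import random
from itertools import combinations

def randomize_row(adj_row, eps):
    """Local randomizer: flip each bit of one vertex's adjacency
    list independently with probability 1/(e^eps + 1)."""
    p = 1.0 / (math.exp(eps) + 1.0)
    return [bit ^ (random.random() < p) for bit in adj_row]

def triangle_estimate(noisy_rows, eps):
    """Server-side postprocessing: debias each reported entry so its
    expectation is the true edge indicator, then sum products over all
    vertex triples. Independence of the per-entry noise makes every
    product term unbiased for the corresponding triangle indicator."""
    p = 1.0 / (math.exp(eps) + 1.0)
    n = len(noisy_rows)

    def edge(i, j):
        # each pair {i, j} is read from the lower-indexed vertex's row
        return (noisy_rows[min(i, j)][max(i, j)] - p) / (1.0 - 2.0 * p)

    return sum(edge(i, j) * edge(j, k) * edge(i, k)
               for i, j, k in combinations(range(n), 3))

# toy usage: a single triangle on 3 vertices (hypothetical data)
A = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
reports = [randomize_row(row, eps=1.0) for row in A]
print(triangle_estimate(reports, eps=1.0))  # ~1 in expectation, high variance
```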
Award ID(s):
2120667; 2232694
NSF-PAR ID:
10492418
Author(s) / Creator(s):
Editor(s):
Etessami, Kousha; Feige, Uriel; Puppis, Gabriele
Publisher / Repository:
Leibniz International Proceedings in Informatics (LIPIcs): 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)
Date Published:
Subject(s) / Keyword(s):
["local differential privacy","reconstruction attacks","lower bounds","triangle counting","Theory of computation → Problems, reductions and completeness"]
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider the problem of designing and analyzing differentially private algorithms that can be implemented on discrete models of computation in strict polynomial time, motivated by known attacks on floating-point implementations of real-arithmetic differentially private algorithms (Mironov, CCS 2012) and the potential for timing attacks on expected polynomial-time algorithms. As a case study, we examine the basic problem of approximating the histogram of a categorical dataset over a possibly large data universe X. The classic Laplace Mechanism (Dwork, McSherry, Nissim, Smith, TCC 2006 and J. Privacy & Confidentiality 2017) does not satisfy our requirements, as it is based on real arithmetic, and natural discrete analogues, such as the Geometric Mechanism (Ghosh, Roughgarden, Sundararajan, STOC 2009 and SICOMP 2012), take time at least linear in |X|, which can be exponential in the bit length of the input. In this paper, we provide strict polynomial-time discrete algorithms for approximate histograms whose simultaneous accuracy (the maximum error over all bins) matches that of the Laplace Mechanism up to constant factors, while retaining the same (pure) differential privacy guarantee. One of our algorithms produces a sparse histogram as output. Its "per-bin accuracy" (the error on individual bins) is worse than that of the Laplace Mechanism by a factor of log |X|, but we prove a lower bound showing that this is necessary for any algorithm that produces a sparse histogram. A second algorithm avoids this lower bound, and matches the per-bin accuracy of the Laplace Mechanism, by producing a compact and efficiently computable representation of a dense histogram; it is based on an (n+1)-wise independent implementation of an appropriately clamped version of the Discrete Geometric Mechanism.
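As a point of reference for the item above, here is a hedged Python sketch of the classic two-sided Geometric Mechanism applied bin-by-bin to a histogram. The function names and the optional clamping parameter are illustrative assumptions, and this naive version runs in time linear in the number of bins |X| — exactly the cost the paper's strict polynomial-time construction is designed to avoid.

```python
import math
import random

def two_sided_geometric(eps):
    """Sample Z with Pr[Z = z] proportional to exp(-eps * |z|),
    as the difference of two i.i.d. geometric variables."""
    q = math.exp(-eps)
    def geom():
        # geometric on {0, 1, 2, ...} with success probability 1 - q
        return int(math.log(1.0 - random.random()) / math.log(q))
    return geom() - geom()

def geometric_histogram(counts, eps, clamp=None):
    """Baseline Geometric Mechanism: add independent two-sided
    geometric noise to every bin (time linear in |X|).
    Optionally clamp each noisy count to [0, clamp]."""
    noisy = [c + two_sided_geometric(eps) for c in counts]
    if clamp is not None:
        noisy = [min(max(v, 0), clamp) for v in noisy]
    return noisy

# hypothetical usage on a 4-bin histogram
print(geometric_histogram([5, 0, 12, 3], eps=1.0, clamp=20))
```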
  2. Many problems on data streams have been studied at two extremes of difficulty: either allowing randomized algorithms, in the static setting (where they should err with bounded probability on the worst-case stream); or when only deterministic and infallible algorithms are required. Some recent works have considered the adversarial setting, in which a randomized streaming algorithm must succeed even on data streams provided by an adaptive adversary that can see the intermediate outputs of the algorithm. In order to better understand the differences between these models, we study a streaming task called “Missing Item Finding”. In this problem, for r < n, one is given a data stream a_1, …, a_r of elements in [n] (possibly with repetitions), and must output some x ∈ [n] which does not equal any of the a_i. We prove that, for r = n^{Θ(1)} and δ = 1/poly(n), the space required for randomized algorithms that solve this problem in the static setting with error δ is Θ(polylog(n)); for algorithms in the adversarial setting with error δ, Θ((1 + r²/n) · polylog(n)); and for deterministic algorithms, Θ(r/polylog(n)). Because our adversarially robust algorithm relies on free access to a string of O(r log n) random bits, we investigate a “random start” model of streaming algorithms where all random bits used are included in the space cost. Here we find a conditional lower bound on the space usage, which depends on the space that would be needed for a pseudo-deterministic algorithm to solve the problem. We also prove an Ω(r/polylog(n)) lower bound for the space needed by a streaming algorithm with error < 1/2^{polylog(n)} against “white-box” adversaries that can see the internal state of the algorithm, but not predict its future random decisions.
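To give a feel for why polylog(n) space suffices in the static setting with polynomially small error, here is a hedged Python sketch of an illustrative folklore approach, not the paper's algorithm: draw a few uniform candidates from [n] before the stream arrives and output one that never appears. The analysis in the comments assumes r ≤ n/2.

```python
import math
import random

def missing_item_find(stream, n, delta=0.01):
    """Static-setting sketch: draw k = O(log(1/delta)) uniform
    candidates from [n] = {1, ..., n} before seeing the stream,
    discard any that appear, and output a survivor. If r <= n/2,
    each candidate survives independently with probability >= 1/2,
    so all k fail with probability <= delta.
    Space: O(log(1/delta) * log n) bits for the candidate set."""
    k = max(1, math.ceil(math.log2(1.0 / delta)))
    candidates = {random.randrange(1, n + 1) for _ in range(k)}
    for a in stream:
        candidates.discard(a)
        if not candidates:
            return None  # failure; probability <= delta when r <= n/2
    return next(iter(candidates))

# hypothetical usage: find an element of [100] missing from the stream
print(missing_item_find([1, 2, 3, 2, 1, 7] * 5, n=100))
```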
  3. There has been a flurry of recent literature studying streaming algorithms for which the input stream is chosen adaptively by a black-box adversary who observes the output of the streaming algorithm at each time step. However, these algorithms fail when the adversary has access to the internal state of the algorithm, rather than just the output of the algorithm. We study streaming algorithms in the white-box adversarial model, where the stream is chosen adaptively by an adversary who observes the entire internal state of the algorithm at each time step. We show that nontrivial algorithms are still possible. We first give a randomized algorithm for the L1-heavy hitters problem that outperforms the optimal deterministic Misra-Gries algorithm on long streams. If the white-box adversary is computationally bounded, we use cryptographic techniques to reduce the memory of our L1-heavy hitters algorithm even further and to design a number of additional algorithms for graph, string, and linear algebra problems. The existence of such algorithms is surprising, as the streaming algorithm does not even have a secret key in this model, i.e., its state is entirely known to the adversary. One algorithm we design is for estimating the number of distinct elements in a stream with insertions and deletions achieving a multiplicative approximation and sublinear space; such an algorithm is impossible for deterministic algorithms. We also give a general technique that translates any two-player deterministic communication lower bound to a lower bound for randomized algorithms robust to a white-box adversary. In particular, our results show that for all p ≥ 0, there exists a constant C_p > 1 such that any C_p-approximation algorithm for F_p moment estimation in insertion-only streams with a white-box adversary requires Ω(n) space for a universe of size n. Similarly, there is a constant C > 1 such that any C-approximation algorithm in an insertion-only stream for matrix rank requires Ω(n) space with a white-box adversary. These results do not contradict our upper bounds since they assume the adversary has unbounded computational power. Our algorithmic results based on cryptography thus show a separation between computationally bounded and unbounded adversaries. Finally, we prove a lower bound of Ω(log n) bits for the fundamental problem of deterministic approximate counting in a stream of 0’s and 1’s, which holds even if we know how many total stream updates we have seen so far at each point in the stream. Such a lower bound for approximate counting with additional information was previously unknown, and in our context, it shows a separation between multiplayer deterministic maximum communication and the white-box space complexity of a streaming algorithm.
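For reference, the deterministic Misra-Gries baseline that the item above compares against can be sketched in a few lines of Python; the counter budget k is an assumed parameter, and the paper's randomized white-box algorithms are not reproduced here.

```python
def misra_gries(stream, k):
    """Deterministic Misra-Gries summary with at most k-1 counters:
    every element with true frequency above m/k (m = stream length)
    is retained, and each stored count undercounts the true
    frequency by at most m/k."""
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # no free counter: decrement all, dropping those that hit zero
            counters = {y: c - 1 for y, c in counters.items() if c > 1}
    return counters

# hypothetical usage: 'a' dominates the stream, so it must survive
print(misra_gries("abacadaaeaafaa", k=3))
```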
  4. Local differential privacy is a widely studied restriction on distributed algorithms that collect aggregates about sensitive user data, and is now deployed in several large systems. We initiate a systematic study of a fundamental limitation of locally differentially private protocols: they are highly vulnerable to adversarial manipulation. While any algorithm can be manipulated by adversaries who lie about their inputs, we show that any noninteractive locally differentially private protocol can be manipulated to a much greater extent: when the privacy level is high, or the domain size is large, a small fraction of users in the protocol can completely obscure the distribution of the honest users' input. We also construct protocols that are optimally robust to manipulation for a variety of common tasks in local differential privacy. Finally, we give simple experiments validating our theoretical results, and demonstrating that protocols that are optimal without manipulation can have dramatically different levels of robustness to manipulation. Our results suggest caution when deploying local differential privacy and reinforce the importance of efficient cryptographic techniques for the distributed emulation of centrally differentially private mechanisms.
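A small simulation makes the amplification effect described above concrete. The following hedged Python sketch uses toy assumptions (binary randomized response, honest inputs all 0, 1% of users manipulating): the debiasing factor that removes the randomizer's noise also scales the manipulators' shift by roughly e^ε/(e^ε − 1), which blows up as ε shrinks.

```python
import math
import random

def rr_report(bit, eps):
    """Binary randomized response: report truthfully with
    probability e^eps / (e^eps + 1), otherwise flip."""
    keep = math.exp(eps) / (math.exp(eps) + 1.0)
    return bit if random.random() < keep else 1 - bit

def estimate_frequency(reports, eps):
    """Debias the mean of the reports. The 1/(1 - 2p) factor that
    undoes the noise also amplifies any adversarial shift."""
    p = 1.0 / (math.exp(eps) + 1.0)
    mean = sum(reports) / len(reports)
    return (mean - p) / (1.0 - 2.0 * p)

# hypothetical demo: all honest inputs are 0, so the true frequency is 0,
# yet 1% manipulators who always report 1 shift the estimate by
# roughly (m/n) * e^eps / (e^eps - 1), well above m/n for small eps
n, m, eps = 10_000, 100, 0.5
reports = [rr_report(0, eps) for _ in range(n - m)] + [1] * m
print(estimate_frequency(reports, eps))  # ~0.025 rather than ~0.01
```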