We consider the problem of spaceefficiently estimating the number of simplices in a hypergraph stream. This is the most natural hypergraph generalization of the highlystudied problem of estimating the number of triangles in a graph stream. Our input is a kuniform hypergraph H with n vertices and m hyperedges, each hyperedge being a ksized subset of vertices. A ksimplex in H is a subhypergraph on k+1 vertices X such that all k+1 possible hyperedges among X exist in H. The goal is to process the hyperedges of H, which arrive in an arbitrary order as a data stream, and compute a good estimate of T_k(H), the number of ksimplices in H. We design a suite of algorithms for this problem. As with trianglecounting in graphs (which is the special case k = 2), sublinear space is achievable but only under a promise of the form T_k(H) ≥ T. Under such a promise, our algorithms use at most four passes and together imply a space bound of O(ε^{2} log δ^{1} polylog n ⋅ min{(m^{1+1/k})/T, m/(T^{2/(k+1)})}) for each fixed k ≥ 3, in order to guarantee an estimate within (1±ε)T_k(H) with probability ≥ 1δ. We also give a simpler 1pass algorithm thatmore »
Privately Answering Counting Queries with Generalized Gaussian Mechanisms
We give the first closedform privacy guarantees for the Generalized Gaussian mechanism (the mechanism that adds noise x to a vector with probability proportional to exp((x_p/σ)^p) for some σ, p), in the setting of answering k counting (i.e. sensitivity1) queries about a database with (ε, δ)differential privacy (in particular, with low 𝓁_∞error). Just using Generalized Gaussian noise, we obtain a mechanism such that if the true answers to the queries are the vector d, the mechanism outputs answers d̃ with the 𝓁_∞error guarantee: 𝔼[d̃  d_∞] = O(√{k log log k log(1/δ)}/ε). This matches the error bound of [Steinke and Ullman, 2017], but using a much simpler mechanism. By composing this mechanism with the sparse vector mechanism (generalizing a technique of [Steinke and Ullman, 2017]), we obtain a mechanism improving the √{k log log k} dependence on k to √{k log log log k}, Our main technical contribution is showing that certain powers of Generalized Gaussians, which follow a Generalized Gamma distribution, are subgamma. In subsequent work, the optimal 𝓁_∞error bound of O(√{k log (1/δ)}/ε) has been achieved by [Yuval Dagan and Gil Kur, 2020] and [Badih Ghazi et al., 2020] independently. However, the Generalized Gaussian mechanism has some qualitative more »
 Editors:
 Ligett, Katrina; Gupta, Swati
 Award ID(s):
 1816861
 Publication Date:
 NSFPAR ID:
 10253485
 Journal Name:
 Leibniz international proceedings in informatics
 Volume:
 192
 ISSN:
 18688969
 Sponsoring Org:
 National Science Foundation
More Like this


We introduce synchronization strings , which provide a novel way to efficiently deal with synchronization errors , i.e., insertions and deletions. Synchronization errors are strictly more general and much harder to cope with than more commonly considered Hammingtype errors , i.e., symbol substitutions and erasures. For every ε > 0, synchronization strings allow us to index a sequence with an ε O(1) size alphabet, such that one can efficiently transform k synchronization errors into (1 + ε)k Hammingtype errors . This powerful new technique has many applications. In this article, we focus on designing insdel codes , i.e., error correcting block codes (ECCs) for insertiondeletion channels. While ECCs for both Hammingtype errors and synchronization errors have been intensely studied, the latter has largely resisted progress. As Mitzenmacher puts it in his 2009 survey [30]: “ Channels with synchronization errors...are simply not adequately understood by current theory. Given the nearcomplete knowledge, we have for channels with erasures and errors...our lack of understanding about channels with synchronization errors is truly remarkable. ” Indeed, it took until 1999 for the first insdel codes with constant rate, constant distance, and constant alphabet size to be constructed and only since 2016 are there constructions ofmore »

We introduce synchronization strings, which provide a novel way of efficiently dealing with synchronization errors, i.e., insertions and deletions. Synchronization errors are strictly more general and much harder to deal with than more commonly considered halferrors, i.e., symbol corruptions and erasures. For every ε > 0, synchronization strings allow to index a sequence with an εO(1) size alphabet such that one can efficiently transform k synchronization errors into (1 + ε)k halferrors. This powerful new technique has many applications. In this paper we focus on designing insdel codes, i.e., error correcting block codes (ECCs) for insertion deletion channels. While ECCs for both halferrors and synchronization errors have been intensely studied, the later has largely resisted progress. As Mitzenmacher puts it in his 2009 survey: "Channels with synchronization errors ... are simply not adequately understood by current theory. Given the nearcomplete knowledge we have for channels with erasures and errors ... our lack of understanding about channels with synchronization errors is truly remarkable." Indeed, it took until 1999 for the first insdel codes with constant rate, constant distance, and constant alphabet size to be constructed and only since 2016 are there constructions of constant rate indel codes for asymptotically large noisemore »

In practice, differentially private data releases are designed to support a variety of applications. A data release is fit for use if it meets target accuracy requirements for each application. In this paper, we consider the problem of answering linear queries under differential privacy subject to perquery accuracy constraints. Existing practical frameworks like the matrix mechanism do not provide such finegrained control (they optimize total error, which allows some query answers to be more accurate than necessary, at the expense of other queries that become no longer useful). Thus, we design a fitnessforuse strategy that adds privacypreserving Gaussian noise to query answers. The covariance structure of the noise is optimized to meet the finegrained accuracy requirements while minimizing the cost to privacy.

We design a nonadaptive algorithm that, given a Boolean function f: {0, 1}^n → {0, 1} which is αfar from monotone, makes poly(n, 1/α) queries and returns an estimate that, with high probability, is an Otilde(\sqrt{n})approximation to the distance of f to monotonicity. Furthermore, we show that for any constant k > 0, approximating the distance to monotonicity up to n^(1/2−k)factor requires 2^{n^k} nonadaptive queries, thereby ruling out a poly(n, 1/α)query nonadaptive algorithm for such approximations. This answers a question of Seshadhri (Property Testing Review, 2014) for the case of nonadaptive algorithms. Approximating the distance to a property is closely related to tolerantly testing that property. Our lower bound stands in contrast to standard (nontolerant) testing of monotonicity that can be done nonadaptively with Otilde(n/ε^2) queries. We obtain our lower bound by proving an analogous bound for erasureresilient testers. An αerasureresilient tester for a desired property gets oracle access to a function that has at most an α fraction of values erased. The tester has to accept (with probability at least 2/3) if the erasures can be filled in to ensure that the resulting function has the property and to reject (with probability at least 2/3) if every completion ofmore »