There has been a flurry of recent literature studying streaming algorithms for which the input stream is chosen adaptively by a blackbox adversary who observes the output of the streaming algorithm at each time step. However, these algorithms fail when the adversary has access to the internal state of the algorithm, rather than just the output of the algorithm. We study streaming algorithms in the whitebox adversarial model, where the stream is chosen adaptively by an adversary who observes the entire internal state of the algorithm at each time step. We show that nontrivial algorithms are still possible. We first give a randomized algorithm for the L1heavy hitters problem that outperforms the optimal deterministic MisraGries algorithm on long streams. If the whitebox adversary is computationally bounded, we use cryptographic techniques to reduce the memory of our L1heavy hitters algorithm even further and to design a number of additional algorithms for graph, string, and linear algebra problems. The existence of such algorithms is surprising, as the streaming algorithm does not even have a secret key in this model, i.e., its state is entirely known to the adversary. One algorithm we design is for estimating the number of distinct elements in amore »
This content will become publicly available on January 1, 2023
Privately Estimating Graph Parameters in Sublinear Time
We initiate a systematic study of algorithms that are both differentiallyprivate and run in sublinear time for several problems in which the goal is to estimate natural graph parameters.
Our main result is a differentiallyprivate $(1+\rho)$approximation algorithm for the problem of computing the average degree of a graph, for every $\rho>0$. The running time of the algorithm is roughly the same (for sparse graphs) as its nonprivate version proposed by Goldreich and Ron (Sublinear Algorithms, 2005).
We also obtain the first differentiallyprivate sublineartime approximation algorithms for the maximum matching size and the minimum vertex cover size of a graph.
An overarching technique we employ is the notion of \emph{coupled global sensitivity} of randomized algorithms. Related variants of this notion of sensitivity have been used in the literature in adhoc ways. Here we formalize the notion and develop it as a unifying framework for privacy analysis of randomized approximation algorithms.
 Editors:
 Mikołaj Boja´nczyk, Emanuela Merelli
 Publication Date:
 NSFPAR ID:
 10322751
 Journal Name:
 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
 Sponsoring Org:
 National Science Foundation
More Like this


We consider the problem of designing sublinear time algorithms for estimating the cost of minimum] metric traveling salesman (TSP) tour. Specifically, given access to a n × n distance matrix D that specifies pairwise distances between n points, the goal is to estimate the TSP cost by performing only sublinear (in the size of D) queries. For the closely related problem of estimating the weight of a metric minimum spanning tree (MST), it is known that for any epsilon > 0, there exists an O^~(n/epsilon^O(1))time algorithm that returns a (1+epsilon)approximate estimate of the MST cost. This result immediately implies an O^~(n/epsilon^O(1)) time algorithm to estimate the TSP cost to within a (2 + epsilon) factor for any epsilon > 0. However, no o(n^2)time algorithms are known to approximate metric TSP to a factor that is strictly better than 2. On the other hand, there were also no known barriers that rule out existence of (1 + epsilon)approximate estimation algorithms for metric TSP with O^~ (n) time for any fixed epsilon > 0. In this paper, we make progress on both algorithms and lower bounds for estimating metric TSP cost. On the algorithmic side, we first consider the graphic TSP problemmore »

We present an $\tilde O(m+n^{1.5})$time randomized algorithm for maximum cardinality bipartite matching and related problems (e.g. transshipment, negativeweight shortest paths, and optimal transport) on $m$edge, $n$node graphs. For maximum cardinality bipartite matching on moderately dense graphs, i.e. $m = \Omega(n^{1.5})$, our algorithm runs in time nearly linear in the input size and constitutes the first improvement over the classic $O(m\sqrt{n})$time [Dinic 1970; HopcroftKarp 1971; Karzanov 1973] and $\tilde O(n^\omega)$time algorithms [IbarraMoran 1981] (where currently $\omega\approx 2.373$). On sparser graphs, i.e. when $m = n^{9/8 + \delta}$ for any constant $\delta>0$, our result improves upon the recent advances of [Madry 2013] and [LiuSidford 2020b, 2020a] which achieve an $\tilde O(m^{4/3+o(1)})$ runtime. We obtain these results by combining and advancing recent lines of research in interior point methods (IPMs) and dynamic graph algorithms. First, we simplify and improve the IPM of [v.d.BrandLeeSidfordSong 2020], providing a general primaldual IPM framework and new samplingbased techniques for handling infeasibility induced by approximate linear system solvers. Second, we provide a simple sublineartime algorithm for detecting and sampling highenergy edges in electric flows on expanders and show that when combined with recent advances in dynamic expander decompositions, this yields efficient data structures for maintaining the iterates ofmore »

Current flow closeness centrality (CFCC) has a better discriminating ability than the ordinary closeness centrality based on shortest paths. In this paper, we extend this notion to a group of vertices in a weighted graph, and then study the problem of finding a subset S of k vertices to maximize its CFCC C(S), both theoretically and experimentally. We show that the problem is NPhard, but propose two greedy algorithms for minimizing the reciprocal of C(S) with provable guarantees using the monotoncity and supermodularity. The first is a deterministic algorithm with an approximation factor (1−kk−1⋅1e) and cubic running time; while the second is a randomized algorithm with a (1−kk−1⋅1e−ϵ)approximation and nearlylinear running time for any ϵ>0. Extensive experiments on model and real networks demonstrate that our algorithms are effective and efficient, with the second algorithm being scalable to massive networks with more than a million vertices.

We study the communication cost (or message complexity) of fundamental distributed symmetry breaking problems, namely, coloring and MIS. While significant progress has been made in understanding and improving the running time of such problems, much less is known about the message complexity of these problems. In fact, all known algorithms need at least Ω(m) communication for these problems, where m is the number of edges in the graph. We addressthe following question in this paper: can we solve problems such as coloring and MIS using sublinear, i.e., o(m) communication, and if sounder what conditions? In a classical result, Awerbuch, Goldreich, Peleg, and Vainish [JACM 1990] showed that fundamental global problems such asbroadcast and spanning tree construction require at least o(m) messages in the KT1 Congest model (i.e., Congest model in which nodes have initial knowledge of the neighbors' ID's) when algorithms are restricted to be comparisonbased (i.e., algorithms inwhich node ID's can only be compared). Thirty five years after this result, King, Kutten, and Thorup [PODC 2015] showed that onecan solve the above problems using Õ(n) messages (n is the number of nodes in the graph) in Õ(n) rounds in the KT1 Congest model if noncomparisonbased algorithms are permitted. Anmore »