There has been a flurry of recent literature studying streaming algorithms for which the input stream is chosen adaptively by a blackbox adversary who observes the output of the streaming algorithm at each time step. However, these algorithms fail when the adversary has access to the internal state of the algorithm, rather than just the output of the algorithm. We study streaming algorithms in the whitebox adversarial model, where the stream is
chosen adaptively by an adversary who observes the entire internal state of the algorithm at each time step. We show that nontrivial algorithms are still possible. We first give a randomized algorithm for the L1heavy hitters problem that outperforms the optimal deterministic MisraGries algorithm on long streams. If the whitebox adversary is computationally bounded, we use cryptographic techniques to reduce the memory of our L1heavy hitters algorithm even further
and to design a number of additional algorithms for graph, string, and linear algebra problems. The existence of such algorithms is surprising, as the streaming algorithm does not even have a secret key in this model, i.e., its state is entirely known to the adversary. One algorithm we design is for estimating the number of distinct elements in a stream with insertions and deletions achieving a multiplicative approximation and sublinear space; such an algorithm is impossible for deterministic algorithms. We also give a general technique that translates any twoplayer deterministic communication lower bound to a lower bound for randomized algorithms robust to a whitebox adversary. In particular, our results show that for all p ≥ 0, there exists a constant Cp > 1 such that any
Cpapproximation algorithm for Fp moment estimation in insertiononly streams with a whitebox adversary requires Ω(n) space for a universe of size n. Similarly, there is a constant C > 1 such that any Capproximation algorithm in an insertiononly stream for matrix rank requires Ω(n) space with a whitebox adversary. These results do not contradict our upper bounds since they assume the adversary has unbounded computational power. Our algorithmic results based on cryptography thus show a separation between computationally bounded and unbounded adversaries. Finally, we prove a lower bound of Ω(log n) bits for the fundamental problem of deterministic approximate counting in a stream of 0’s and 1’s, which holds even if we know how many total stream updates we have seen so far at each point in the stream. Such a lower bound for approximate counting with additional information was previously unknown, and in our context, it shows a separation between multiplayer deterministic maximum communication and the whitebox space complexity of a streaming algorithm
more »
« less
Privately Estimating Graph Parameters in Sublinear Time
We initiate a systematic study of algorithms that are both differentiallyprivate and run in sublinear time for several problems in which the goal is to estimate natural graph parameters.
Our main result is a differentiallyprivate $(1+\rho)$approximation algorithm for the problem of computing the average degree of a graph, for every $\rho>0$. The running time of the algorithm is roughly the same (for sparse graphs) as its nonprivate version proposed by Goldreich and Ron (Sublinear Algorithms, 2005).
We also obtain the first differentiallyprivate sublineartime approximation algorithms for the maximum matching size and the minimum vertex cover size of a graph.
An overarching technique we employ is the notion of \emph{coupled global sensitivity} of randomized algorithms. Related variants of this notion of sensitivity have been used in the literature in adhoc ways. Here we formalize the notion and develop it as a unifying framework for privacy analysis of randomized approximation algorithms.
more »
« less
 NSFPAR ID:
 10322751
 Editor(s):
 Mikołaj Boja´nczyk, Emanuela Merelli
 Date Published:
 Journal Name:
 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


We consider the problem of designing sublinear time algorithms for estimating the cost of minimum] metric traveling salesman (TSP) tour. Specifically, given access to a n × n distance matrix D that specifies pairwise distances between n points, the goal is to estimate the TSP cost by performing only sublinear (in the size of D) queries. For the closely related problem of estimating the weight of a metric minimum spanning tree (MST), it is known that for any epsilon > 0, there exists an O^~(n/epsilon^O(1))time algorithm that returns a (1+epsilon)approximate estimate of the MST cost. This result immediately implies an O^~(n/epsilon^O(1)) time algorithm to estimate the TSP cost to within a (2 + epsilon) factor for any epsilon > 0. However, no o(n^2)time algorithms are known to approximate metric TSP to a factor that is strictly better than 2. On the other hand, there were also no known barriers that rule out existence of (1 + epsilon)approximate estimation algorithms for metric TSP with O^~ (n) time for any fixed epsilon > 0. In this paper, we make progress on both algorithms and lower bounds for estimating metric TSP cost. On the algorithmic side, we first consider the graphic TSP problem where the metric D corresponds to shortest path distances in a connected unweighted undirected graph. We show that there exists an O^~(n) time algorithm that estimates the cost of graphic TSP to within a factor of (2 − epsilon_0) for some epsilon_0 > 0. This is the first sublinear cost estimation algorithm for graphic TSP that achieves an approximation factor less than 2. We also consider another wellstudied special case of metric TSP, namely, (1, 2)TSP where all distances are either 1 or 2, and give an O^~(n ^ 1.5) time algorithm to estimate optimal cost to within a factor of 1.625. Our estimation algorithms for graphic TSP as well as for (1, 2)TSP naturally lend themselves to O^~(n) space streaming algorithms that give an 11/6approximation for graphic TSP and a 1.625approximation for (1, 2)TSP. These results motivate the natural question if analogously to metric MST, for any epsilon > 0, (1 + epsilon)approximate estimates can be obtained for graphic TSP and (1, 2)TSP using O^~ (n) queries. We answer this question in the negative – there exists an epsilon_0 > 0, such that any algorithm that estimates the cost of graphic TSP ((1, 2)TSP) to within a (1 + epsilon_0)factor, necessarily requires (n^2) queries. This lower bound result highlights a sharp separation between the metric MST and metric TSP problems. Similarly to many classical approximation algorithms for TSP, our sublinear time estimation algorithms utilize subroutines for estimating the size of a maximum matching in the underlying graph. We show that this is not merely an artifact of our approach, and that for any epsilon > 0, any algorithm that estimates the cost of graphic TSP or (1, 2)TSP to within a (1 + epsilon)factor, can also be used to estimate the size of a maximum matching in a bipartite graph to within an epsilon n additive error. This connection allows us to translate known lower bounds for matching size estimation in various models to similar lower bounds for metric TSP cost estimation.more » « less

Woodruff, David P. (Ed.)We give improved algorithms for maintaining edgeorientations of a fullydynamic graph, such that the maximum outdegree is bounded. On one hand, we show how to orient the edges such that maximum outdegree is proportional to the arboricity $\alpha$ of the graph, in, either, an amortised update time of $O(\log^2 n \log \alpha)$, or a worstcase update time of $O(\log^3 n \log \alpha)$. On the other hand, motivated by applications including dynamic maximal matching, we obtain a different tradeoff. Namely, the improved update time of either $O(\log n \log \alpha)$, amortised, or $O(\log ^2 n \log \alpha)$, worstcase, for the problem of maintaining an edgeorientation with at most $O(\alpha + \log n)$ outedges per vertex. Finally, all of our algorithms naturally limit the recourse to be polylogarithmic in $n$ and $\alpha$. Our algorithms adapt to the current arboricity of the graph, and yield improvements over previous work: Firstly, we obtain deterministic algorithms for maintaining a $(1+\varepsilon)$ approximation of the maximum subgraph density, $\rho$, of the dynamic graph. Our algorithms have update times of $O(\varepsilon^{6}\log^3 n \log \rho)$ worstcase, and $O(\varepsilon^{4}\log^2 n \log \rho)$ amortised, respectively. We may output a subgraph $H$ of the input graph where its density is a $(1+\varepsilon)$ approximation of the maximum subgraph density in time linear in the size of the subgraph. These algorithms have improved update time compared to the $O(\varepsilon^{6}\log ^4 n)$ algorithm by Sawlani and Wang from STOC 2020. Secondly, we obtain an $O(\varepsilon^{6}\log^3 n \log \alpha)$ worstcase update time algorithm for maintaining a $(1~+~\varepsilon)\textnormal{OPT} + 2$ approximation of the optimal outorientation of a graph with adaptive arboricity $\alpha$, improving the $O(\varepsilon^{6}\alpha^2 \log^3 n)$ algorithm by Christiansen and Rotenberg from ICALP 2022. This yields the first worstcase polylogarithmic dynamic algorithm for decomposing into $O(\alpha)$ forests. Thirdly, we obtain arboricityadaptive fullydynamic deterministic algorithms for a variety of problems including maximal matching, $\Delta+1$ colouring, and matrix vector multiplication. All update times are worstcase $O(\alpha+\log^2n \log \alpha)$, where $\alpha$ is the current arboricity of the graph. For the maximal matching problem, the stateoftheart deterministic algorithms by Kopelowitz, Krauthgamer, Porat, and Solomon from ICALP 2014 runs in time $O(\alpha^2 + \log^2 n)$, and by Neiman and Solomon from STOC 2013 runs in time $O(\sqrt{m})$. We give improved running times whenever the arboricity $\alpha \in \omega( \log n\sqrt{\log\log n})$.more » « less

Etessami, Kousha ; Feige, Uriel ; Puppis, Gabriele (Ed.)Many deployments of differential privacy in industry are in the local model, where each party releases its private information via a differentially private randomizer. We study triangle counting in the noninteractive and interactive local model with edge differential privacy (that, intuitively, requires that the outputs of the algorithm on graphs that differ in one edge be indistinguishable). In this model, each party’s local view consists of the adjacency list of one vertex. In the noninteractive model, we prove that additive Ω(n²) error is necessary, where n is the number of nodes. This lower bound is our main technical contribution. It uses a reconstruction attack with a new class of linear queries and a novel mixandmatch strategy of running the local randomizers with different completions of their adjacency lists. It matches the additive error of the algorithm based on Randomized Response, proposed by Imola, Murakami and Chaudhuri (USENIX2021) and analyzed by Imola, Murakami and Chaudhuri (CCS2022) for constant ε. We use a different postprocessing of Randomized Response and provide tight bounds on the variance of the resulting algorithm. In the interactive setting, we prove a lower bound of Ω(n^{3/2}) on the additive error. Previously, no hardness results were known for interactive, edgeprivate algorithms in the local model, except for those that follow trivially from the results for the central model. Our work significantly improves on the state of the art in differentially private graph analysis in the local model.more » « less

Many deployments of differential privacy in industry are in the local model, where each party releases its private information via a differentially private randomizer. We study triangle counting in the noninteractive and interactive local model with edge differential privacy (that, intuitively, requires that the outputs of the algorithm on graphs that differ in one edge be indistinguishable). In this model, each party’s local view consists of the adjacency list of one vertex. In the noninteractive model, we prove that additive Ω(n^2) error is necessary, where n is the number of nodes. This lower bound is our main technical contribution. It uses a reconstruction attack with a new class of linear queries and a novel mixandmatch strategy of running the local randomizers with different completions of their adjacency lists. It matches the additive error of the algorithm based on Randomized Response, proposed by Imola, Murakami and Chaudhuri (USENIX2021) and analyzed by Imola, Murakami and Chaudhuri (CCS2022) for constant ε. We use a different postprocessing of Randomized Response and provide tight bounds on the variance of the resulting algorithm. In the interactive setting, we prove a lower bound of Ω(n3/2) on the additive error. Previously, no hardness results were known for interactive, edgeprivate algorithms in the local model, except for those that follow trivially from the results for the central model. Our work significantly improves on the state of the art in differentially private graph analysis in the local model.more » « less