Title: Set Cover in Sub-linear Time
We study the classic set cover problem from the perspective of sub-linear algorithms. Given access to a collection of m sets over n elements in the query model, we show that sub-linear algorithms derived from existing techniques have almost tight query complexities. On the one hand, we first show an adaptation of the streaming algorithm presented in [17] to the sub-linear query model that returns an α-approximate cover using Õ(m(n/k)^{1/(α−1)} + nk) queries to the input, where k denotes the size of a minimum set cover. We then complement this upper bound by proving that for lower values of k, the required number of queries is , even for estimating the optimal cover size. Moreover, we prove that even checking whether a given collection of sets covers all the elements requires Ω(nk) queries. These two lower bounds provide strong evidence that the upper bound is almost tight for certain values of the parameter k. On the other hand, we show that this bound is not optimal for larger values of k, as there exists a (1 + ε)-approximation algorithm with Õ(mn/(kε^2)) queries. We show that this bound is essentially tight for sufficiently small constant ε, by establishing a lower bound of query complexity. Our lower-bound results follow by carefully designing two distributions of instances that are hard to distinguish. In particular, our first lower bound involves a probabilistic construction of a certain set system with a minimum set cover of size αk, with the key property that a small number of “almost uniformly distributed” modifications can reduce the minimum set cover size to k. Thus, these modifications are not detectable unless a large number of queries are asked. We believe that our probabilistic construction technique might find applications to lower bounds for other combinatorial optimization problems.
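To make the query model concrete, here is a minimal sketch of verifying a candidate cover while counting queries. The oracle interface (`SetSystemOracle`, `elt_of`) is a hypothetical illustration and is not taken from the abstract; it assumes the only access to the input is a query returning the j-th element of set i. Reading k chosen sets in full costs at most k(n + 1) queries, which is the same order as the Ω(nk) verification lower bound stated above.

```python
from typing import Optional, Sequence


class SetSystemOracle:
    """Hypothetical oracle: the sets can only be read through elt_of()."""

    def __init__(self, sets: Sequence[Sequence[int]], n: int) -> None:
        self._sets = [list(s) for s in sets]
        self.n = n        # elements are 0 .. n-1
        self.queries = 0  # number of queries issued so far

    def elt_of(self, i: int, j: int) -> Optional[int]:
        """Return the j-th element of set i, or None if j is past its end."""
        self.queries += 1
        s = self._sets[i]
        return s[j] if j < len(s) else None


def is_cover(oracle: SetSystemOracle, chosen: Sequence[int]) -> bool:
    """Check whether the chosen sets cover all n elements.

    Scans each chosen set element by element and stops early once
    everything is covered, so it uses at most len(chosen) * (n + 1) queries.
    """
    covered = [False] * oracle.n
    remaining = oracle.n
    for i in chosen:
        j = 0
        while remaining > 0:
            e = oracle.elt_of(i, j)
            if e is None:  # reached the end of set i
                break
            if not covered[e]:
                covered[e] = True
                remaining -= 1
            j += 1
    return remaining == 0
```

The early exit only helps on instances that are covered quickly; a collection that misses a single element still forces a full scan of every chosen set, which is consistent with the lower bound applying to verification.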
Award ID(s):
1740751 1535851 1447476 1741137 1650733
PAR ID:
10065216
Journal Name:
Annual ACM-SIAM Symposium on Discrete Algorithms
Page Range / eLocation ID:
2467-2486
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we consider two fundamental cut approximation problems on large graphs. We prove new lower bounds for both problems that are optimal up to logarithmic factors.

    The first problem is approximating cuts in balanced directed graphs. In this problem, we want to build a data structure that can provide a (1 ± ε)-approximation of cut values on a graph with n vertices. For arbitrary directed graphs, such a data structure requires Ω(n^2) bits even for constant ε. To circumvent this, recent works study β-balanced graphs, meaning that for every directed cut, the total weight of edges in one direction is at most β times the total weight in the other direction. We consider the for-each model, where the goal is to approximate each cut with constant probability, and the for-all model, where all cuts must be preserved simultaneously. We improve the previous Ω(n√(β/ε)) lower bound in the for-each model to Ω̃(n√β/ε), and we improve the previous Ω(nβ/ε) lower bound in the for-all model to Ω(nβ/ε^2). This resolves the main open questions of (Cen et al., ICALP 2021).

    The second problem is approximating the global minimum cut in a local query model, where we can only access the graph via degree, edge, and adjacency queries. We prove an Ω(min{m, m/(ε^2 k)}) lower bound for this problem, which improves the previous Ω(m/k) lower bound, where m is the number of edges, k is the minimum cut size, and we seek a (1+ε)-approximation. In addition, we show that existing upper bounds with minor modifications match our lower bound up to logarithmic factors.

     
  2. Megow, Nicole ; Smith, Adam (Ed.)
    We provide new approximation algorithms for the Red-Blue Set Cover and Circuit Minimum Monotone Satisfying Assignment (MMSA) problems. Our algorithm for Red-Blue Set Cover achieves an Õ(m^{1/3})-approximation, improving on the Õ(m^{1/2})-approximation due to Elkin and Peleg (where m is the number of sets). Our approximation algorithm for MMSA_t (for circuits of depth t) gives an Õ(N^{1-δ}) approximation for δ = (1/3)·2^{3-⌈t/2⌉}, where N is the number of gates and variables. No non-trivial approximation algorithms for MMSA_t with t ≥ 4 were previously known. We complement these results with lower bounds for these problems: for Red-Blue Set Cover, we provide a nearly approximation-preserving reduction from Min k-Union that gives an Ω(m^{1/4 - ε}) hardness under the Dense-vs-Random conjecture, while for MMSA we sketch a proof that an SDP relaxation strengthened by Sherali-Adams has an integrality gap of N^{1-ε}, where ε → 0 as the circuit depth t → ∞.
  3. Guruswami, Venkatesan (Ed.)
    Inspired by the classic problem of Boolean function monotonicity testing, we investigate the testability of other well-studied properties of combinatorial finite set systems, specifically intersecting families and union-closed families. A function f: {0,1}ⁿ → {0,1} is intersecting (respectively, union-closed) if its set of satisfying assignments corresponds to an intersecting family (respectively, a union-closed family) of subsets of [n]. Our main results are that - in sharp contrast with the property of being a monotone set system - the property of being an intersecting set system, and the property of being a union-closed set system, both turn out to be information-theoretically difficult to test. We show that: - For ε ≥ Ω(1/√n), any non-adaptive two-sided ε-tester for intersectingness must make 2^{Ω(n^{1/4}/√{ε})} queries. We also give a 2^{Ω(√{n log(1/ε)})}-query lower bound for non-adaptive one-sided ε-testers for intersectingness. - For ε ≥ 1/2^{Ω(n^{0.49})}, any non-adaptive two-sided ε-tester for union-closedness must make n^{Ω(log(1/ε))} queries. Thus, neither intersectingness nor union-closedness shares the poly(n,1/ε)-query non-adaptive testability that is enjoyed by monotonicity. To complement our lower bounds, we also give a simple poly(n^{√{nlog(1/ε)}},1/ε)-query, one-sided, non-adaptive algorithm for ε-testing each of these properties (intersectingness and union-closedness). We thus achieve nearly tight upper and lower bounds for two-sided testing of intersectingness when ε = Θ(1/√n), and for one-sided testing of intersectingness when ε = Θ(1). 
  4. We design a nonadaptive algorithm that, given a Boolean function f: {0, 1}^n → {0, 1} which is α-far from monotone, makes poly(n, 1/α) queries and returns an estimate that, with high probability, is an Õ(√n)-approximation to the distance of f to monotonicity. Furthermore, we show that for any constant k > 0, approximating the distance to monotonicity up to an n^{1/2−k} factor requires 2^{n^k} nonadaptive queries, thereby ruling out a poly(n, 1/α)-query nonadaptive algorithm for such approximations. This answers a question of Seshadhri (Property Testing Review, 2014) for the case of nonadaptive algorithms. Approximating the distance to a property is closely related to tolerantly testing that property. Our lower bound stands in contrast to standard (non-tolerant) testing of monotonicity, which can be done nonadaptively with Õ(n/ε^2) queries. We obtain our lower bound by proving an analogous bound for erasure-resilient testers. An α-erasure-resilient tester for a desired property gets oracle access to a function that has at most an α fraction of values erased. The tester has to accept (with probability at least 2/3) if the erasures can be filled in to ensure that the resulting function has the property and to reject (with probability at least 2/3) if every completion of erasures results in a function that is ε-far from having the property. Our method yields the same lower bounds for unateness and being a k-junta. These lower bounds improve exponentially on the existing lower bounds for these properties.
  5. Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where k ≥ 2 machines jointly perform computations on graphs with n nodes (typically, n >> k). The input graph is assumed to be initially randomly partitioned among the k machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation. Our main contribution is the General Lower Bound Theorem, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations. This result is established via an information-theoretic approach that relates the round complexity to the minimal amount of information required by machines to solve the problem. Our approach is generic, and this theorem can be used in a “cookbook” fashion to show distributed lower bounds for several problems, including non-graph problems. We present two applications by showing (almost) tight lower bounds on the round complexity of two fundamental graph problems, namely, PageRank computation and triangle enumeration. These applications show that our approach can yield lower bounds for problems where the application of communication complexity techniques seems not obvious or gives weak bounds, including and especially under a stochastic partition of the input. We then present distributed algorithms for PageRank and triangle enumeration with a round complexity that (almost) matches the respective lower bounds; these algorithms exhibit a round complexity that scales superlinearly in k, improving significantly over previous results [Klauck et al., SODA 2015].
    Specifically, we show the following results. PageRank: We show a lower bound of Ω̃(n/k^2) rounds and present a distributed algorithm that computes an approximation of the PageRank of all the nodes of a graph in Õ(n/k^2) rounds. Triangle enumeration: We show that there exist graphs with m edges where any distributed algorithm requires Ω̃(m/k^{5/3}) rounds. This result also implies the first non-trivial lower bound of Ω̃(n^{1/3}) rounds for the congested clique model, which is tight up to logarithmic factors. We then present a distributed algorithm that enumerates all the triangles of a graph in Õ(m/k^{5/3} + n/k^{4/3}) rounds.