
Title: Algorithms for covering multiple submodular constraints and applications

We consider the problem of covering multiple submodular constraints. Given a finite ground set $$N$$, a weight function $$w: N \rightarrow \mathbb{R}_+$$, $$r$$ monotone submodular functions $$f_1,f_2,\ldots,f_r$$ over $$N$$ and requirements $$k_1,k_2,\ldots,k_r$$, the goal is to find a minimum weight subset $$S \subseteq N$$ such that $$f_i(S) \ge k_i$$ for $$1 \le i \le r$$. We refer to this problem as Multi-Submod-Cover; it was recently considered by Har-Peled and Jones (Few cuts meet many point sets. CoRR, arXiv:1808.03260, 2018), who were motivated by an application in geometry. Even with $$r=1$$, Multi-Submod-Cover generalizes the well-known Submodular Set Cover problem (Submod-SC), and it can also be easily reduced to Submod-SC. A simple greedy algorithm gives an $$O(\log(kr))$$ approximation, where $$k = \sum_i k_i$$, and this ratio cannot be improved in the general case. In this paper, motivated by several concrete applications, we consider two ways to improve upon the approximation given by the greedy algorithm. First, we give a bicriteria approximation algorithm for Multi-Submod-Cover that covers each constraint to within a factor of $$(1-1/e-\varepsilon)$$ while incurring an approximation of $$O(\frac{1}{\varepsilon}\log r)$$ in the cost. Second, we consider the special case when each $$f_i$$ is obtained from a truncated coverage function and obtain an algorithm that generalizes previous work on partial set cover (Partial-SC), covering integer programs (CIPs) and multiple vertex cover constraints (Bera et al., Theoret Comput Sci 555:2–8, 2014). Both these algorithms are based on mathematical programming relaxations that avoid the limitations of the greedy algorithm. We demonstrate the implications of our algorithms and related ideas to several applications ranging from geometric covering problems to clustering with outliers.
Our work highlights the utility of the high-level model and the lens of submodularity in addressing this class of covering problems.
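
The greedy baseline mentioned in the abstract can be illustrated with a short sketch. It assumes the standard reduction to a single constraint, $$g(S) = \sum_i \min(f_i(S), k_i)$$ with target $$\sum_i k_i$$, and repeatedly picks the element with the best marginal gain per unit weight. The function names, the toy instance, and the integer-valued coverage functions are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Sequence

SetFn = Callable[[FrozenSet[Hashable]], float]


def greedy_multi_submod_cover(
    ground: Sequence[Hashable],
    weight: Dict[Hashable, float],
    fs: Sequence[SetFn],
    ks: Sequence[float],
) -> FrozenSet[Hashable]:
    """Greedy sketch: cover all constraints f_i(S) >= k_i at low weight."""

    def g(S: FrozenSet[Hashable]) -> float:
        # Truncated sum: g(S) reaches sum(ks) exactly when every f_i(S) >= k_i.
        return sum(min(f(S), k) for f, k in zip(fs, ks))

    target = sum(ks)
    chosen: FrozenSet[Hashable] = frozenset()
    while g(chosen) < target:
        base = g(chosen)
        best, best_ratio = None, 0.0
        for e in ground:
            if e in chosen:
                continue
            gain = g(chosen | {e}) - base
            if gain <= 0:
                continue
            # Zero-weight elements with positive gain are always worth taking.
            ratio = float("inf") if weight[e] == 0 else gain / weight[e]
            if ratio > best_ratio:
                best, best_ratio = e, ratio
        if best is None:
            raise ValueError("requirements cannot be met by the ground set")
        chosen = chosen | {best}
    return chosen


# Toy instance (hypothetical): four sets covering points 1..4.
covers = {"A": {1, 2}, "B": {2, 3}, "C": {4}, "D": {1, 2, 3, 4}}
weight = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 5.0}


def coverage(points: set) -> SetFn:
    """Coverage function: number of `points` covered by the chosen sets."""
    def f(S: FrozenSet[Hashable]) -> float:
        covered = set().union(*(covers[name] for name in S))
        return len(covered & points)
    return f


fs = [coverage({1, 2, 3}), coverage({4})]  # two monotone submodular constraints
ks = [2, 1]                                # cover 2 of {1,2,3} and point 4
solution = greedy_multi_submod_cover(list(covers), weight, fs, ks)
```

On this instance the greedy rule avoids the expensive set `D` and meets both requirements with two unit-weight sets. The sketch only illustrates the baseline; the bicriteria and truncated-coverage algorithms in the paper rely on mathematical programming relaxations and are not reproduced here.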

Journal Name:
Journal of Combinatorial Optimization
Page Range or eLocation-ID:
p. 979-1010
Springer Science + Business Media
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    We continue the program of proving circuit lower bounds via circuit satisfiability algorithms. So far, this program has yielded several concrete results, proving that functions in $\mathsf{Quasi}\text{-}\mathsf{NP} = \mathsf{NTIME}[n^{(\log n)^{O(1)}}]$ and other complexity classes do not have small circuits (in the worst case and/or on average) from various circuit classes $\mathcal{C}$, by showing that $\mathcal{C}$ admits non-trivial satisfiability and/or #SAT algorithms which beat exhaustive search by a minor amount. In this paper, we present a new strong lower bound consequence of having a non-trivial #SAT algorithm for a circuit class $\mathcal{C}$. Say that a symmetric Boolean function $f(x_1,\ldots,x_n)$ is sparse if it outputs 1 on $O(1)$ values of $\sum_i x_i$. We show that for every sparse $f$, and for all “typical” $\mathcal{C}$, faster #SAT algorithms for $\mathcal{C}$ circuits imply lower bounds against the circuit class $f \circ \mathcal{C}$, which may be stronger than $\mathcal{C}$ itself. In particular:

    #SAT algorithms for $n^k$-size $\mathcal{C}$-circuits running in $2^n/n^k$ time (for all $k$) imply NEXP does not have $(f \circ \mathcal{C})$-circuits of polynomial size.

    #SAT algorithms for $2^{n^{\varepsilon}}$-size $\mathcal{C}$-circuits running in $2^{n-n^{\varepsilon}}$ time (for some $\varepsilon > 0$) imply Quasi-NP does not have $(f \circ \mathcal{C})$-circuits of polynomial size.

    Applying #SAT algorithms from the literature, one immediate corollary of our results is that Quasi-NP does not have $\mathsf{EMAJ} \circ \mathsf{ACC}^0 \circ \mathsf{THR}$ circuits of polynomial size, where EMAJ is the “exact majority” function, improving previous lower bounds against $\mathsf{ACC}^0$ [Williams JACM’14] and $\mathsf{ACC}^0 \circ \mathsf{THR}$ [Williams STOC’14], [Murray-Williams STOC’18]. This is the first nontrivial lower bound against such a circuit class.

  2. Abstract

    It has been recently established in David and Mayboroda (Approximation of Green functions and domains with uniformly rectifiable boundaries of all dimensions. arXiv:2010.09793) that on uniformly rectifiable sets the Green function is almost affine in the weak sense, and moreover, in some scenarios such Green function estimates are equivalent to the uniform rectifiability of a set. The present paper tackles a strong analogue of these results, starting with the “flagship” degenerate operators on sets with lower dimensional boundaries. We consider the elliptic operators $$L_{\beta,\gamma} = -{\text{div}} D^{d+1+\gamma-n} \nabla$$ associated to a domain $$\Omega \subset {\mathbb{R}}^n$$ with a uniformly rectifiable boundary $$\Gamma$$ of dimension $$d < n-1$$, the now usual distance to the boundary $$D = D_\beta$$ given by $$D_\beta(X)^{-\beta} = \int_{\Gamma} |X-y|^{-d-\beta} d\sigma(y)$$ for $$X \in \Omega$$, where $$\beta > 0$$ and $$\gamma \in (-1,1)$$. In this paper we show that the Green function $$G$$ for $$L_{\beta,\gamma}$$, with pole at infinity, is well approximated by multiples of $$D^{1-\gamma}$$, in the sense that the function $$\big| D\nabla \big(\ln \big( \frac{G}{D^{1-\gamma}} \big)\big)\big|^2$$ satisfies a Carleson measure estimate on $$\Omega$$. We underline that the strong and the weak results are different in nature and, of course, at the level of the proofs: the latter extensively used compactness arguments, while the present paper relies on some intricate integration by parts and the properties of the “magical” distance function from David et al. (Duke Math J, to appear).

  3. Abstract

    Let us fix a prime $$p$$ and a homogeneous system of $$m$$ linear equations $$a_{j,1}x_1+\dots+a_{j,k}x_k=0$$ for $$j=1,\dots,m$$ with coefficients $$a_{j,i}\in \mathbb{F}_p$$. Suppose that $$k\ge 3m$$, that $$a_{j,1}+\dots+a_{j,k}=0$$ for $$j=1,\dots,m$$ and that every $$m\times m$$ minor of the $$m\times k$$ matrix $$(a_{j,i})_{j,i}$$ is non-singular. Then we prove that for any (large) $$n$$, any subset $$A\subseteq \mathbb{F}_p^n$$ of size $$|A|> C\cdot \Gamma^n$$ contains a solution $$(x_1,\dots,x_k)\in A^k$$ to the given system of equations such that the vectors $$x_1,\dots,x_k\in A$$ are all distinct. Here, $$C$$ and $$\Gamma$$ are constants only depending on $$p$$, $$m$$ and $$k$$ such that $$\Gamma < p$$. The crucial point here is the condition for the vectors $$x_1,\dots,x_k$$ in the solution $$(x_1,\dots,x_k)\in A^k$$ to be distinct. If we relax this condition and only demand that $$x_1,\dots,x_k$$ are not all equal, then the statement would follow easily from Tao’s slice rank polynomial method. However, handling the distinctness condition is much harder, and requires a new approach. While all previous combinatorial applications of the slice rank polynomial method have relied on the slice rank of diagonal tensors, we use a slice rank argument for a non-diagonal tensor in combination with combinatorial and probabilistic arguments.

  4. Abstract

    Sequence mappability is an important task in genome resequencing. In the $$(k,m)$$-mappability problem, for a given sequence $$T$$ of length $$n$$, the goal is to compute a table whose $$i$$th entry is the number of indices $$j \ne i$$ such that the length-$$m$$ substrings of $$T$$ starting at positions $$i$$ and $$j$$ have at most $$k$$ mismatches. Previous works on this problem focused on heuristics computing a rough approximation of the result or on the case of $$k=1$$. We present several efficient algorithms for the general case of the problem. Our main result is an algorithm that, for $$k=O(1)$$, works in $$O(n)$$ space and, with high probability, in $$O(n \cdot \min\{m^k,\log^k n\})$$ time. Our algorithm requires a careful adaptation of the $$k$$-errata trees of Cole et al. [STOC 2004] to avoid multiple counting of pairs of substrings. Our technique can also be applied to solve the all-pairs Hamming distance problem introduced by Crochemore et al. [WABI 2017]. We further develop $$O(n^2)$$-time algorithms to compute all $$(k,m)$$-mappability tables for a fixed $$m$$ and all $$k\in \{0,\ldots,m\}$$ or a fixed $$k$$ and all $$m\in \{k,\ldots,n\}$$. Finally, we show that, for $$k,m = \Theta(\log n)$$, the $$(k,m)$$-mappability problem cannot be solved in strongly subquadratic time unless the Strong Exponential Time Hypothesis fails. This is an improved and extended version of a paper presented at SPIRE 2018.

  5. Abstract

    We present the KODIAQ-Z survey aimed to characterize the cool, photoionized gas at 2.2 ≲ z ≲ 3.6 in 202 H I-selected absorbers with 14.6 ≤ log N_HI < 20 that probe the interface between galaxies and the intergalactic medium (IGM). We find that gas with 14.6 ≤ log N_HI < 20 at 2.2 ≲ z ≲ 3.6 can be metal-rich (−1.6 ≲ [X/H] ≲ −0.2) as seen in damped Lyα absorbers (DLAs); it can also be very metal-poor ([X/H] < −2.4) or even pristine ([X/H] < −3.8), which is not observed in DLAs but is common in the IGM. For 16 < log N_HI < 20 absorbers, the frequency of pristine absorbers is about 1%–10%, while for 14.6 ≤ log N_HI ≤ 16 absorbers it is 10%–20%, similar to the diffuse IGM. Supersolar gas is extremely rare (<1%) at these redshifts. The factor of several thousand spread from the lowest to highest metallicities and large metallicity variations (a factor of a few to >100) between absorbers separated by less than Δv < 500 km s⁻¹ imply that the metals are poorly mixed in 14.6 ≤ log N_HI < 20 gas. We show that these photoionized absorbers contribute to about 14% of the cosmic baryons and 45% of the cosmic metals at 2.2 ≲ z ≲ 3.6. We find that the mean metallicity increases with N_HI, consistent with what is found in z < 1 gas. The metallicity of gas in this column density regime has increased by a factor ∼8 from 2.2 ≲ z ≲ 3.6 to z < 1, but the contribution of the 14.6 ≤ log N_HI < 19 absorbers to the total metal budget of the universe at z < 1 is a quarter of that at 2.2 ≲ z ≲ 3.6. We show that FOGGIE cosmological zoom-in simulations have a similar evolution of [X/H] with N_HI, which is not observed in lower-resolution simulations. In these simulations, very metal-poor absorbers with [X/H] < −2.4 at z ∼ 2–3 are tracers of inflows, while higher-metallicity absorbers are a mixture of inflows and outflows.

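
For the $$(k,m)$$-mappability problem described in item 4 above, the table definition can be made concrete with a brute-force sketch. It compares every pair of length-$$m$$ substrings directly, so it takes roughly $$O(n^2 m)$$ time and is only a reference for the definition, not one of the efficient algorithms from that paper. The function names and the 0-based indexing are assumptions made for this illustration.

```python
from typing import List


def within_k_mismatches(text: str, i: int, j: int, m: int, k: int) -> bool:
    """True if text[i:i+m] and text[j:j+m] differ in at most k positions."""
    mismatches = 0
    for offset in range(m):
        if text[i + offset] != text[j + offset]:
            mismatches += 1
            if mismatches > k:
                return False  # early exit once the budget is exceeded
    return True


def mappability_table(text: str, m: int, k: int) -> List[int]:
    """Brute-force (k, m)-mappability table, indexed by 0-based start position.

    Entry i counts the start positions j != i whose length-m substring is
    within Hamming distance k of the length-m substring starting at i.
    """
    num_starts = len(text) - m + 1
    table = [0] * max(num_starts, 0)
    for i in range(num_starts):
        for j in range(i + 1, num_starts):
            # Hamming distance is symmetric, so each pair is checked once
            # and credited to both entries.
            if within_k_mismatches(text, i, j, m, k):
                table[i] += 1
                table[j] += 1
    return table
```

For example, with the text `"abab"` and `m = 2`, the substrings are `ab`, `ba`, `ab`; with `k = 0` only the two copies of `ab` match each other, while with `k = 2` every pair counts.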