skip to main content


Title: An Efficient Framework for Balancing Submodularity and Cost
In the classical selection problem, the input consists of a collection of elements and the goal is to pick a subset of elements from the collection such that some objective function f is maximized. This problem has been studied extensively in the data-mining community and it has multiple applications including influence maximization in social networks, team formation and recommender systems. A particularly popular formulation that captures the needs of many such applications is one where the objective function f is a monotone and non-negative submodular function. In these cases, the corresponding computational problem can be solved using a simple greedy (1- 1/e)-approximation algorithm. In this paper, we consider a generalization of the above formulation where the goal is to optimize a function that maximizes the submodular function f minus a linear cost function c. This formulation appears as a more natural one, particularly when one needs to strike a balance between the value of the objective function and the cost being paid in order to pick the selected elements. We address variants of this problem both in an offline setting, where the collection is known apriori, as well as in online settings, where the elements of the collection arrive in an online fashion. We demonstrate that by using simple variants of the standard greedy algorithm (used for submodular optimization) we can design algorithms that have provable approximation guarantees, are extremely efficient and work very well in practice.  more » « less
Award ID(s):
1908510 1750333
NSF-PAR ID:
10293358
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
KDD
ISSN:
2154-817X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In the classical selection problem, the input consists of a collection of elements and the goal is to pick a subset of elements from the collection such that some objective function $f$ is maximized. This problem has been studied extensively in the data-mining community and it has multiple applications including influence maximization in social networks, team formation and recommender systems. A particularly popular formulation that captures the needs of many such applications is one where the objective function $f$ is a monotone and non-negative submodular function. In these cases, the corresponding computational problem can be solved using a simple greedy $(1-1/e)$-approximation algorithm. In this paper, we consider a generalization of the above formulation where the goal is to optimize a function that maximizes the submodular function $f$ minus a linear cost function $c$. This formulation appears as a more natural one, particularly when one needs to strike a balance between the value of the objective function and the cost being paid in order to pick the selected elements. We address variants of this problem both in an offline setting, where the collection is known apriori, as well as in online settings, where the elements of the collection arrive in an online fashion. We demonstrate that by using simple variants of the standard greedy algorithm (used for submodular optimization) we can design algorithms that have provable approximation guarantees, are extremely efficient and work very well in practice. 
    more » « less
  2. Abstract

    We consider the problem of covering multiple submodular constraints. Given a finite ground setN, a weight function$$w: N \rightarrow \mathbb {R}_+$$w:NR+,rmonotone submodular functions$$f_1,f_2,\ldots ,f_r$$f1,f2,,froverNand requirements$$k_1,k_2,\ldots ,k_r$$k1,k2,,krthe goal is to find a minimum weight subset$$S \subseteq N$$SNsuch that$$f_i(S) \ge k_i$$fi(S)kifor$$1 \le i \le r$$1ir. We refer to this problem asMulti-Submod-Coverand it was recently considered by Har-Peled and Jones (Few cuts meet many point sets. CoRR.arxiv:abs1808.03260Har-Peled and Jones 2018) who were motivated by an application in geometry. Even with$$r=1$$r=1Multi-Submod-Covergeneralizes the well-known Submodular Set Cover problem (Submod-SC), and it can also be easily reduced toSubmod-SC. A simple greedy algorithm gives an$$O(\log (kr))$$O(log(kr))approximation where$$k = \sum _i k_i$$k=ikiand this ratio cannot be improved in the general case. In this paper, motivated by several concrete applications, we consider two ways to improve upon the approximation given by the greedy algorithm. First, we give a bicriteria approximation algorithm forMulti-Submod-Coverthat covers each constraint to within a factor of$$(1-1/e-\varepsilon )$$(1-1/e-ε)while incurring an approximation of$$O(\frac{1}{\epsilon }\log r)$$O(1ϵlogr)in the cost. Second, we consider the special case when each$$f_i$$fiis a obtained from a truncated coverage function and obtain an algorithm that generalizes previous work on partial set cover (Partial-SC), covering integer programs (CIPs) and multiple vertex cover constraints Bera et al. (Theoret Comput Sci 555:2–8 Bera et al. 2014). Both these algorithms are based on mathematical programming relaxations that avoid the limitations of the greedy algorithm. We demonstrate the implications of our algorithms and related ideas to several applications ranging from geometric covering problems to clustering with outliers. Our work highlights the utility of the high-level model and the lens of submodularity in addressing this class of covering problems.

     
    more » « less
  3. The densest subgraph problem in a graph (\dsg), in the simplest form, is the following. Given an undirected graph $G=(V,E)$ find a subset $S \subseteq V$ of vertices that maximizes the ratio $|E(S)|/|S|$ where $E(S)$ is the set of edges with both endpoints in $S$. \dsg and several of its variants are well-studied in theory and practice and have many applications in data mining and network analysis. In this paper we study fast algorithms and structural aspects of \dsg via the lens of \emph{supermodularity}. For this we consider the densest supermodular subset problem (\dssp): given a non-negative supermodular function $f: 2^V \rightarrow \mathbb{R}_+$, maximize $f(S)/|S|$. For \dsg we describe a simple flow-based algorithm that outputs a $(1-\eps)$-approximation in deterministic $\tilde{O}(m/\eps)$ time where $m$ is the number of edges. Our algorithm is the first to have a near-linear dependence on $m$ and $1/\eps$ and improves previous methods based on an LP relaxation. It generalizes to hypergraphs, and also yields a faster algorithm for directed \dsg. Greedy peeling algorithms have been very popular for \dsg and several variants due to their efficiency, empirical performance, and worst-case approximation guarantees. We describe a simple peeling algorithm for \dssp and analyze its approximation guarantee in a fashion that unifies several existing results. Boob et al.\ \cite{bgpstww-20} developed an \emph{iterative} peeling algorithm for \dsg which appears to work very well in practice, and made a conjecture about its convergence to optimality. We affirmatively answer their conjecture, and in fact prove that a natural generalization of their algorithm converges to a $(1-\eps)$-approximation for \emph{any} supermodular function $f$; the key to our proof is to consider an LP formulation that is derived via the \Lovasz extension of a supermodular function. For \dsg the bound on the number of iterations we prove is $O(\frac{\Delta \ln |V|}{\lambda^*}\cdot \frac{1}{\eps^2})$ where $\Delta$ is the maximum degree and $\lambda^*$ is the optimum value. Our work suggests that iterative peeling can be an effective heuristic for several objectives considered in the literature. Finally, we show that the $2$-approximation for densest-at-least-$k$ subgraph \cite{ks-09} extends to the supermodular setting. We also give a unified analysis of the peeling algorithm for this problem, and via this analysis derive an approximation guarantee for a generalization of \dssp to maximize $f(S)/g(|S|)$ for a concave function $g$. 
    more » « less
  4. Megow, Nicole ; Smith, Adam (Ed.)
    Maximum weight independent set (MWIS) admits a 1/k-approximation in inductively k-independent graphs [Karhan Akcoglu et al., 2002; Ye and Borodin, 2012] and a 1/(2k)-approximation in k-perfectly orientable graphs [Kammer and Tholey, 2014]. These are a parameterized class of graphs that generalize k-degenerate graphs, chordal graphs, and intersection graphs of various geometric shapes such as intervals, pseudo-disks, and several others [Ye and Borodin, 2012; Kammer and Tholey, 2014]. We consider a generalization of MWIS to a submodular objective. Given a graph G = (V,E) and a non-negative submodular function f: 2^V → ℝ_+, the goal is to approximately solve max_{S ∈ ℐ_G} f(S) where ℐ_G is the set of independent sets of G. We obtain an Ω(1/k)-approximation for this problem in the two mentioned graph classes. The first approach is via the multilinear relaxation framework and a simple contention resolution scheme, and this results in a randomized algorithm with approximation ratio at least 1/e(k+1). This approach also yields parallel (or low-adaptivity) approximations. Motivated by the goal of designing efficient and deterministic algorithms, we describe two other algorithms for inductively k-independent graphs that are inspired by work on streaming algorithms: a preemptive greedy algorithm and a primal-dual algorithm. In addition to being simpler and faster, these algorithms, in the monotone submodular case, yield the first deterministic constant factor approximations for various special cases that have been previously considered such as intersection graphs of intervals, disks and pseudo-disks. 
    more » « less
  5. Given a graph G = (V, E) and a subset T ⊆ V of terminals, a Steiner tree of G is a tree that spans T. In the vertex-weighted Steiner tree (VST) problem, each vertex is assigned a non-negative weight, and the goal is to compute a minimum weight Steiner tree of G. Vertex-weighted problems have applications in network design and routing, where there are different costs for installing or maintaining facilities at different vertices. We study a natural generalization of the VST problem motivated by multi-level graph construction, the vertex-weighted grade-of-service Steiner tree problem (V-GSST), which can be stated as follows: given a graph G and terminals T, where each terminal v ∈ T requires a facility of a minimum grade of service R(v) ∈ {1, 2, . . . `}, compute a Steiner tree G0 by installing facilities on a subset of vertices, such that any two vertices requiring a certain grade of service are connected by a path in G 0 with the minimum grade of service or better. Facilities of higher grade are more costly than facilities of lower grade. Multi-level variants such as this one can be useful in network design problems where vertices may require facilities of varying priority. While similar problems have been studied in the edge-weighted case, they have not been studied as well in the more general vertex-weighted case. We first describe a simple heuristic for the V-GSST problem whose approximation ratio depends on `, the number of grades of service. We then generalize the greedy algorithm of [Klein & Ravi, 1995] to show that the V-GSST problem admits a (2 ln |T|)-approximation, where T is the set of terminals requiring some facility. This result is surprising, as it shows that the (seemingly harder) multi-grade problem can be approximated as well as the VST problem, and that the approximation ratio does not depend on the number of grades of service. Finally, we show that this problem is a special case of the directed Steiner tree problem and provide an integer linear programming (ILP) formulation for the V-GSST problem. 
    more » « less