skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: An Efficient Framework for Balancing Submodularity and Cost
In the classical selection problem, the input consists of a collection of elements and the goal is to pick a subset of elements from the collection such that some objective function $$f$$ is maximized. This problem has been studied extensively in the data-mining community and it has multiple applications including influence maximization in social networks, team formation and recommender systems. A particularly popular formulation that captures the needs of many such applications is one where the objective function $$f$$ is a monotone and non-negative submodular function. In these cases, the corresponding computational problem can be solved using a simple greedy $(1-1/e)$-approximation algorithm. In this paper, we consider a generalization of the above formulation where the goal is to optimize a function that maximizes the submodular function $$f$$ minus a linear cost function $$c$$. This formulation appears as a more natural one, particularly when one needs to strike a balance between the value of the objective function and the cost being paid in order to pick the selected elements. We address variants of this problem both in an offline setting, where the collection is known apriori, as well as in online settings, where the elements of the collection arrive in an online fashion. We demonstrate that by using simple variants of the standard greedy algorithm (used for submodular optimization) we can design algorithms that have provable approximation guarantees, are extremely efficient and work very well in practice.  more » « less
Award ID(s):
1750333 1908510
PAR ID:
10315636
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
KDD
ISSN:
2154-817X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    In the classical selection problem, the input consists of a collection of elements and the goal is to pick a subset of elements from the collection such that some objective function f is maximized. This problem has been studied extensively in the data-mining community and it has multiple applications including influence maximization in social networks, team formation and recommender systems. A particularly popular formulation that captures the needs of many such applications is one where the objective function f is a monotone and non-negative submodular function. In these cases, the corresponding computational problem can be solved using a simple greedy (1- 1/e)-approximation algorithm. In this paper, we consider a generalization of the above formulation where the goal is to optimize a function that maximizes the submodular function f minus a linear cost function c. This formulation appears as a more natural one, particularly when one needs to strike a balance between the value of the objective function and the cost being paid in order to pick the selected elements. We address variants of this problem both in an offline setting, where the collection is known apriori, as well as in online settings, where the elements of the collection arrive in an online fashion. We demonstrate that by using simple variants of the standard greedy algorithm (used for submodular optimization) we can design algorithms that have provable approximation guarantees, are extremely efficient and work very well in practice. 
    more » « less
  2. Megow, Nicole; Smith, Adam (Ed.)
    Maximum weight independent set (MWIS) admits a 1/k-approximation in inductively k-independent graphs [Karhan Akcoglu et al., 2002; Ye and Borodin, 2012] and a 1/(2k)-approximation in k-perfectly orientable graphs [Kammer and Tholey, 2014]. These are a parameterized class of graphs that generalize k-degenerate graphs, chordal graphs, and intersection graphs of various geometric shapes such as intervals, pseudo-disks, and several others [Ye and Borodin, 2012; Kammer and Tholey, 2014]. We consider a generalization of MWIS to a submodular objective. Given a graph G = (V,E) and a non-negative submodular function f: 2^V → ℝ_+, the goal is to approximately solve max_{S ∈ ℐ_G} f(S) where ℐ_G is the set of independent sets of G. We obtain an Ω(1/k)-approximation for this problem in the two mentioned graph classes. The first approach is via the multilinear relaxation framework and a simple contention resolution scheme, and this results in a randomized algorithm with approximation ratio at least 1/e(k+1). This approach also yields parallel (or low-adaptivity) approximations. Motivated by the goal of designing efficient and deterministic algorithms, we describe two other algorithms for inductively k-independent graphs that are inspired by work on streaming algorithms: a preemptive greedy algorithm and a primal-dual algorithm. In addition to being simpler and faster, these algorithms, in the monotone submodular case, yield the first deterministic constant factor approximations for various special cases that have been previously considered such as intersection graphs of intervals, disks and pseudo-disks. 
    more » « less
  3. Evans, Robin; Shpitser, Ilya (Ed.)
    We consider the problem of maximizing submodular functions under submodular constraints by formulating the problem in two ways: \SCSKC and \DiffC. Given two submodular functions $$f$$ and $$g$$ where $$f$$ is monotone, the objective of \SCSKC problem is to find a set $$S$$ of size at most $$k$$ that maximizes $f(S)$ under the constraint that $$g(S)\leq \theta$$, for a given value of $$\theta$$. The problem of \DiffC focuses on finding a set $$S$$ of size at most $$k$$ such that $h(S) = f(S)-g(S)$$ is maximized. It is known that these problems are highly inapproximable and do not admit any constant factor multiplicative approximation algorithms unless NP is easy. Known approximation algorithms involve data-dependent approximation factors that are not efficiently computable. We initiate a study of the design of approximation algorithms where the approximation factors are efficiently computable. For the problem of \SCSKC, we prove that the greedy algorithm produces a solution whose value is at least $$(1-1/e)f(\OPT) - A$, where $$A$$ is the data-dependent additive error. For the \DiffC problem, we design an algorithm that uses the \SCSKC greedy algorithm as a subroutine. This algorithm produces a solution whose value is at least $$(1-1/e)h(\OPT)-B$, where $$B$$ is also a data-dependent additive error. A salient feature of our approach is that the additive error terms can be computed efficiently, thus enabling us to ascertain the quality of the solutions produced. 
    more » « less
  4. Abstract We consider the problem of covering multiple submodular constraints. Given a finite ground set N , a weight function $$w: N \rightarrow \mathbb {R}_+$$ w : N → R + , r monotone submodular functions $$f_1,f_2,\ldots ,f_r$$ f 1 , f 2 , … , f r over N and requirements $$k_1,k_2,\ldots ,k_r$$ k 1 , k 2 , … , k r the goal is to find a minimum weight subset $$S \subseteq N$$ S ⊆ N such that $$f_i(S) \ge k_i$$ f i ( S ) ≥ k i for $$1 \le i \le r$$ 1 ≤ i ≤ r . We refer to this problem as Multi-Submod-Cover and it was recently considered by Har-Peled and Jones (Few cuts meet many point sets. CoRR. arxiv:abs1808.03260 Har-Peled and Jones 2018) who were motivated by an application in geometry. Even with $$r=1$$ r = 1 Multi-Submod-Cover generalizes the well-known Submodular Set Cover problem ( Submod-SC ), and it can also be easily reduced to Submod-SC . A simple greedy algorithm gives an $$O(\log (kr))$$ O ( log ( k r ) ) approximation where $$k = \sum _i k_i$$ k = ∑ i k i and this ratio cannot be improved in the general case. In this paper, motivated by several concrete applications, we consider two ways to improve upon the approximation given by the greedy algorithm. First, we give a bicriteria approximation algorithm for Multi-Submod-Cover that covers each constraint to within a factor of $$(1-1/e-\varepsilon )$$ ( 1 - 1 / e - ε ) while incurring an approximation of $$O(\frac{1}{\epsilon }\log r)$$ O ( 1 ϵ log r ) in the cost. Second, we consider the special case when each $$f_i$$ f i is a obtained from a truncated coverage function and obtain an algorithm that generalizes previous work on partial set cover ( Partial-SC ), covering integer programs ( CIPs ) and multiple vertex cover constraints Bera et al. (Theoret Comput Sci 555:2–8 Bera et al. 2014). Both these algorithms are based on mathematical programming relaxations that avoid the limitations of the greedy algorithm. We demonstrate the implications of our algorithms and related ideas to several applications ranging from geometric covering problems to clustering with outliers. Our work highlights the utility of the high-level model and the lens of submodularity in addressing this class of covering problems. 
    more » « less
  5. We propose subsampling as a unified algorithmic technique for submodular maximization in centralized and online settings. The idea is simple: independently sample elements from the ground set and use simple combinatorial techniques (such as greedy or local search) on these sampled elements. We show that this approach leads to optimal/state-of-the-art results despite being much simpler than existing methods. In the usual off-line setting, we present SampleGreedy, which obtains a [Formula: see text]-approximation for maximizing a submodular function subject to a p-extendible system using [Formula: see text] evaluation and feasibility queries, where k is the size of the largest feasible set. The approximation ratio improves to p + 1 and p for monotone submodular and linear objectives, respectively. In the streaming setting, we present Sample-Streaming, which obtains a [Formula: see text]-approximation for maximizing a submodular function subject to a p-matchoid using O(k) memory and [Formula: see text] evaluation and feasibility queries per element, and m is the number of matroids defining the p-matchoid. The approximation ratio improves to 4p for monotone submodular objectives. We empirically demonstrate the effectiveness of our algorithms on video summarization, location summarization, and movie recommendation tasks. 
    more » « less