We investigate the problems of maximizing k-submodular functions subject to a total size constraint and subject to individual size constraints. k-submodularity generalizes submodularity: rather than merely picking items from a ground set, the optimizer also assigns one of k types to each chosen item. In sensor selection problems, for instance, this makes it possible to model which type of sensor to place at a location, not simply whether to place a sensor. We propose and analyze threshold-greedy algorithms for both types of constraints. We prove that our proposed algorithms achieve the best-known approximation ratios for both constraint types, up to a user-chosen parameter that balances computational complexity against the approximation ratio, while using a number of function evaluations that depends only linearly (up to poly-logarithmic terms) on the number of elements n, the number of types k, and the inverse of the user-chosen parameter. Other algorithms that achieve the best-known deterministic approximation ratios require a number of function evaluations that depends linearly on the budget B, whereas ours do not. We empirically demonstrate our algorithms' performance in applications of sensor placement with k types and influence maximization with k topics.
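The following is a minimal sketch of the threshold-greedy idea for the total-size-constraint case, assuming oracle access to f on partial assignments; the function name, the parameter `eps`, and the exact threshold schedule are our illustrative choices, not the paper's exact procedure.

```python
def threshold_greedy_k_submodular(f, n, k, B, eps):
    """Threshold-greedy sketch for maximizing a k-submodular function f
    under a total size constraint |support(x)| <= B.

    f maps a dict {element: type} to a real value, with f({}) == 0.
    Illustrative sketch only, not the paper's exact algorithm.
    """
    x = {}  # current partial solution: element -> type in {0, ..., k-1}
    # The largest single-assignment value sets the initial threshold.
    d = max(f({e: t}) for e in range(n) for t in range(k))
    tau = d
    # Lower the threshold geometrically; eps trades accuracy for speed.
    while tau >= eps * d / B and len(x) < B:
        for e in range(n):
            if e in x or len(x) >= B:
                continue
            base = f(x)
            # Best type for element e given the current solution.
            gain, t = max((f({**x, e: t}) - base, t) for t in range(k))
            if gain >= tau:  # accept any assignment meeting the threshold
                x[e] = t
        tau *= 1 - eps
    return x
```

In this sketch the number of threshold levels grows only logarithmically with B, which mirrors the paper's claim that the query count avoids a linear dependence on the budget.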
Batch greedy maximization of non-submodular functions: Guarantees and applications to experimental design
We propose and analyze batch greedy heuristics for cardinality-constrained maximization of non-submodular, non-decreasing set functions. We consider the standard greedy paradigm, along with its distributed greedy and stochastic greedy variants. Our theoretical guarantees are characterized by the combination of submodularity and supermodularity ratios. We show how these parameters define tight modular bounds based on incremental gains, and provide a novel reinterpretation of the classical greedy algorithm using the minorize-maximize (MM) principle. Based on that analogy, we propose a new class of methods exploiting any plausible modular bound. In the context of optimal experimental design for linear Bayesian inverse problems, we bound the submodularity and supermodularity ratios when the underlying objective is based on mutual information. We also develop novel modular bounds for the mutual information in this setting, and describe certain connections to polyhedral combinatorics. We discuss how algorithms using these modular bounds relate to established statistical notions such as leverage scores, and to more recent efforts such as volume sampling. We demonstrate our theoretical findings on synthetic problems and on a real-world climate monitoring example.
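As a concrete reference point, here is a minimal sketch of the batch greedy step for the cardinality-constrained case; the names and the plain top-q scoring rule are our illustrative choices, not necessarily the paper's exact variant.

```python
def batch_greedy(f, ground_set, budget, batch_size):
    """Batch greedy sketch for cardinality-constrained maximization of a
    non-decreasing set function f (given as an oracle on Python sets).

    Each round scores the remaining elements by their individual marginal
    gains and adds the top `batch_size` of them at once; batch_size = 1
    recovers the classical greedy algorithm.
    """
    S = set()
    while len(S) < budget:
        base = f(S)
        remaining = [e for e in ground_set if e not in S]
        # Individual (not joint) marginal gains -- the modular surrogate
        # that a minorize-maximize reading treats as a modular bound.
        scored = sorted(remaining, key=lambda e: f(S | {e}) - base,
                        reverse=True)
        take = min(batch_size, budget - len(S))
        S |= set(scored[:take])
    return S
```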
- Award ID(s): 1723011
- PAR ID: 10548505
- Publisher / Repository: Journal of Machine Learning Research
- Date Published:
- Journal Name: Journal of Machine Learning Research
- Volume: 22
- Issue: 252
- ISSN: 1533-7928
- Page Range / eLocation ID: 1-62
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Chaudhuri, K (Ed.) How can we efficiently gather information to optimize an unknown function when presented with multiple, mutually dependent information sources of different costs? For example, when optimizing a physical system, intelligently trading off computer simulations and real-world tests can lead to significant savings. Existing multi-fidelity Bayesian optimization methods, such as multi-fidelity GP-UCB or Entropy Search-based approaches, either make simplistic assumptions on the interaction among different fidelities or use simple heuristics that lack theoretical guarantees. In this paper, we study multi-fidelity Bayesian optimization with complex structural dependencies among multiple outputs, and propose MF-MI-Greedy, a principled algorithmic framework for addressing this problem. In particular, we model different fidelities using additive Gaussian processes based on shared latent relationships with the target function. Then we use cost-sensitive mutual information gain for efficient Bayesian optimization. We propose a simple notion of regret which incorporates the varying cost of different fidelities, and prove that MF-MI-Greedy achieves low regret. We demonstrate the strong empirical performance of our algorithm on both synthetic and real-world datasets. (A simplified sketch of the cost-sensitive selection rule appears after this list.)
-
Network sampling is the task of selecting a subset of nodes and links from a network in a way that preserves its topological properties and other user requirements. This paper investigates the problem of generating an unbiased network sample that contains a balanced proportion of nodes from different groups. Creating such a representative sample requires handling the trade-off between structural preservability and group representativity of the selected nodes. We present a novel max-min subgraph fairness measure that serves as a unifying framework combining both criteria. A greedy algorithm is then proposed to generate a fair and representative sample from an initial set of target nodes. A theoretical approximation guarantee, based on submodularity and curvature ratios, is also presented for the output of the proposed greedy algorithm. Experimental results on real-world datasets show that the proposed method generates fairer and more representative samples than existing network sampling methods. (A greedy max-min selection sketch appears after this list.)
-
We derive information-theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI). In contrast to other CMI bounds, which are black-box bounds that do not exploit the structure of the problem and may be hard to evaluate in practice, our loo-CMI bounds can be computed easily and can be interpreted in connection with other notions such as classical leave-one-out cross-validation, stability of the optimization algorithm, and the geometry of the loss landscape. The bounds apply both to the outputs of training algorithms and to their predictions. We empirically validate the quality of the bound by evaluating its predicted generalization gap in deep-learning scenarios. In particular, our bounds are non-vacuous on large-scale image-classification tasks.
-
Submodular function minimization is well studied, and existing algorithms solve it exactly or to within arbitrary accuracy. However, in many applications, such as structured sparse learning or batch Bayesian optimization, the objective function is not exactly submodular, but close. In this case, no theoretical guarantees exist. Indeed, submodular minimization algorithms rely on intricate connections between submodularity and convexity. We show how these relations can be extended to obtain approximation guarantees for minimizing non-submodular functions, characterized by how close the function is to submodular. We also extend this result to noisy function evaluations. Our approximation results are the first for minimizing non-submodular functions, and they are optimal, as established by our matching lower bound. (A sketch of the submodularity-convexity connection appears after this list.)
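For the multi-fidelity record above, a heavily simplified sketch of the benefit-cost idea behind cost-sensitive information gain: a single GP rather than the paper's additive multi-fidelity model, and all names (`cost_sensitive_pick`, `rbf`) are ours.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """Squared-exponential kernel between row-stacked inputs A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def cost_sensitive_pick(X_obs, candidates, costs, noise=1e-2):
    """Pick the candidate with the largest GP information gain per unit
    cost -- the benefit-cost rule that cost-sensitive mutual-information
    approaches build on (simplified, illustrative sketch only).
    """
    K = rbf(X_obs, X_obs) + noise * np.eye(len(X_obs))
    K_inv = np.linalg.inv(K)
    best, best_score = None, -np.inf
    for x, c in zip(candidates, costs):
        k_star = rbf(X_obs, x[None, :])
        var = rbf(x[None, :], x[None, :])[0, 0] \
            - (k_star.T @ K_inv @ k_star)[0, 0]
        info_gain = 0.5 * np.log(1.0 + max(var, 0.0) / noise)  # in nats
        if info_gain / c > best_score:
            best, best_score = x, info_gain / c
    return best
```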
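For the network-sampling record, a minimal greedy sketch in the max-min fairness spirit; `coverage` is a hypothetical stand-in for a per-group representativity score, not the paper's measure.

```python
def fair_greedy_sample(candidates, groups, coverage, budget):
    """Greedy sketch in the spirit of max-min subgraph fairness: at each
    step, add the node that maximizes the minimum per-group score.

    coverage(S, g) scores how well sample S represents group g; the
    measure and all names here are illustrative assumptions.
    """
    S = set()
    while len(S) < budget:
        def min_score(v):
            return min(coverage(S | {v}, g) for g in groups)
        # The bottleneck group drives each selection.
        best = max((v for v in candidates if v not in S), key=min_score)
        S.add(best)
    return S
```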
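For the non-submodular minimization record, the submodularity-convexity connection it mentions runs through the Lovász extension; below is the standard sort-based evaluation of that extension (names ours, oracle access to F assumed).

```python
def lovasz_extension(F, x):
    """Evaluate the Lovasz extension of a set function F at x in [0,1]^n.

    Standard sort-based formula: visit coordinates in decreasing order
    of x and accumulate a telescoping sum of F over the prefix sets.
    F is an oracle on frozensets of indices with F(frozenset()) == 0.
    """
    order = sorted(range(len(x)), key=lambda i: -x[i])
    S, total, val_prev = set(), 0.0, 0.0
    for i in order:
        S.add(i)
        val = F(frozenset(S))
        total += x[i] * (val - val_prev)
        val_prev = val
    return total
```

For submodular F this extension is convex, so minimizing it over the unit cube (e.g., by subgradient descent) yields a set minimizer of F; the record's results concern how much of this convexity picture survives when F is only approximately submodular.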