This content will become publicly available on January 1, 2024
- Award ID(s):
- 2149617
- Publication Date:
- NSF-PAR ID:
- 10397870
- Journal Name:
- Proceedings of the International Workshop on Artificial Intelligence and Statistics
- ISSN:
- 1525-531X
- Sponsoring Org:
- National Science Foundation
More Like this
-
We investigate the problem of unconstrained combinatorial multi-armed bandits with fullbandit feedback and stochastic rewards for submodular maximization. Previous works investigate the same problem assuming a submodular and monotone reward function. In this work, we study a more general problem, i.e., when the reward function is not necessarily monotone, and the submodularity is assumed only in expectation. We propose Randomized Greedy Learning (RGL) algorithm and theoretically prove that it achieves a 1 2 -regret upper bound of O˜(nT 2 3 ) for horizon T and number of arms n. We also show in experiments that RGL empirically outperforms other full-bandit variants in submodular and non-submodular settings.
-
Cussens, James ; Zhang, Kun (Ed.)We investigate the problem of combinatorial multi-armed bandits with stochastic submodular (in expectation) rewards and full-bandit feedback, where no extra information other than the reward of selected action at each time step $t$ is observed. We propose a simple algorithm, Explore-Then-Commit Greedy (ETCG) and prove that it achieves a $(1-1/e)$-regret upper bound of $\mathcal{O}(n^\frac{1}{3}k^\frac{4}{3}T^\frac{2}{3}\log(T)^\frac{1}{2})$ for a horizon $T$, number of base elements $n$, and cardinality constraint $k$. We also show in experiments with synthetic and real-world data that the ETCG empirically outperforms other full-bandit methods.
-
In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a set of N agents, who can only communicate their local data with neighbors described by a connected graph G. Each agent makes a sequence of decisions on selecting an arm from M candidates, yet they only have access to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions while converging to a no-regret strategy requires a collection of distributed data. Motivated by the proposal of federated learning, we aim for a solution with which agents will never share their local observations with a central entity, and will be allowed to only share a private copy of his/her own information with their neighbors. We first propose a decentralized bandit algorithm \textttGossip\_UCB, which is a coupling of variants of both the classical gossiping algorithm and the celebrated Upper Confidence Bound (UCB) bandit algorithm. We show that \textttGossip\_UCB successfully adapts local bandit learning into a global gossiping process for sharing information among connected agents, and achieves guaranteed regret at the order of O(\max\ \textttpoly (N,M) łog T, \textttpoly (N,M)łog_łambda_2^-1 N\ ) for all N agents, wheremore »
-
Abstract We consider the problem of covering multiple submodular constraints. Given a finite ground set
N , a weight function ,$$w: N \rightarrow \mathbb {R}_+$$ r monotone submodular functions over$$f_1,f_2,\ldots ,f_r$$ N and requirements the goal is to find a minimum weight subset$$k_1,k_2,\ldots ,k_r$$ such that$$S \subseteq N$$ for$$f_i(S) \ge k_i$$ . We refer to this problem as$$1 \le i \le r$$ Multi-Submod-Cover and it was recently considered by Har-Peled and Jones (Few cuts meet many point sets. CoRR.arxiv:abs1808.03260 Har-Peled and Jones 2018) who were motivated by an application in geometry. Even with$$r=1$$ Multi-Submod-Cover generalizes the well-known Submodular Set Cover problem (Submod-SC ), and it can also be easily reduced toSubmod-SC . A simple greedy algorithm gives an approximation where$$O(\log (kr))$$ and this ratio cannot be improved in the general case. In this paper, motivated by several concrete applications, we consider two ways to improve upon the approximation given by the greedy algorithm. First, we give a bicriteria approximation algorithm for$$k = \sum _i k_i$$ Multi-Submod-Cover that covers each constraint to within a factor of while incurring an approximation of$$(1-1/e-\varepsilon )$$ in the cost. Second, we consider the special case when each$$O(\frac{1}{\epsilon }\log r)$$ is a obtained from a truncated coverage function and obtain an algorithm that generalizes previous work on partial set cover ($$f_i$$ Partial-SC ), covering integer programs (CIPs ) and multiple vertex cover constraintsmore » -
We study the problem of maximizing a non-monotone submodular function subject to a cardinality constraint in the streaming model. Our main contribution is a single-pass (semi-)streaming algorithm that uses roughly $O(k / \varepsilon^2)$ memory, where $k$ is the size constraint. At the end of the stream, our algorithm post-processes its data structure using any offline algorithm for submodular maximization, and obtains a solution whose approximation guarantee is $\frac{\alpha}{1+\alpha}-\varepsilon$, where $\alpha$ is the approximation of the offline algorithm. If we use an exact (exponential time) post-processing algorithm, this leads to $\frac{1}{2}-\varepsilon$ approximation (which is nearly optimal). If we post-process with the algorithm of \cite{buchbinder2019constrained}, that achieves the state-of-the-art offline approximation guarantee of $\alpha=0.385$, we obtain $0.2779$-approximation in polynomial time, improving over the previously best polynomial-time approximation of $0.1715$ due to \cite{feldman2018less}. It is also worth mentioning that our algorithm is combinatorial and deterministic, which is rare for an algorithm for non-monotone submodular maximization, and enjoys a fast update time of $O(\frac{\log k + \log (1/\alpha {\varepsilon^2})$ per element.