We investigate the problem of unconstrained combinatorial multi-armed bandits with fullbandit feedback and stochastic rewards for submodular maximization. Previous works investigate the same problem assuming a submodular and monotone reward function. In this work, we study a more general problem, i.e., when the reward function is not necessarily monotone, and the submodularity is assumed only in expectation. We propose Randomized Greedy Learning (RGL) algorithm and theoretically prove that it achieves a 1 2 -regret upper bound of O˜(nT 2 3 ) for horizon T and number of arms n. We also show in experiments that RGL empirically outperforms other full-bandit variants in submodular and non-submodular settings. 
                        more » 
                        « less   
                    
                            
                            On the Unreasonable Effectiveness of the Greedy Algorithm: Greedy Adapts to Sharpness
                        
                    
    
            It is well known that the standard greedy algorithm guarantees a worst-case approximation factor of 1 − 1/e when maximizing a monotone submodular function under a cardinality constraint. However, empirical studies show that its performance is substantially better in practice. This raises a natural question of explaining this improved performance of the greedy algorithm. In this work, we define sharpness for submodular functions as a candidate explanation for this phenomenon. We show that the greedy algorithm provably performs better as the sharpness of the submodular function increases. This improvement ties in closely with the faster convergence rates of first order methods for sharp functions in convex optimization. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 1717947
- PAR ID:
- 10178884
- Date Published:
- Journal Name:
- Thirty-seventh International Conference on Machine Learning
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            We investigate the problem of unconstrained combinatorial multi-armed bandits with full-bandit feedback and stochastic rewards for submodular maximization. Previous works investigate the same problem assuming a submodular and monotone reward function. In this work, we study a more general problem, i.e., when the reward function is not necessarily monotone, and the submodularity is assumed only in expectation. We propose Randomized Greedy Learning (RGL) algorithm and theoretically prove that it achieves a $$\frac{1}{2}$$-regret upper bound of $$\Tilde{\mathcal{O}}(n T^{\frac{2}{3}})$$ for horizon $$T$$ and number of arms $$n$$. We also show in experiments that RGL empirically outperforms other full-bandit variants in submodular and non-submodular settings.more » « less
- 
            Bae, Sang Won; Park, Heejin (Ed.)We consider the problem of solving the Min-Sum Submodular Cover problem using local search. The Min-Sum Submodular Cover problem generalizes the NP-complete Min-Sum Set Cover problem, replacing the input set cover instance with a monotone submodular set function. A simple greedy algorithm achieves an approximation factor of 4, which is tight unless P=NP [Streeter and Golovin, NeurIPS, 2008]. We complement the greedy algorithm with analysis of a local search algorithm. Building on work of Munagala et al. [ICDT, 2005], we show that, using simple initialization, a straightforward local search algorithm achieves a (4+ε)-approximate solution in time O(n³log(n/ε)), provided that the monotone submodular set function is also second-order supermodular. Second-order supermodularity has been shown to hold for a number of submodular functions of practical interest, including functions associated with set cover, matching, and facility location. We present experiments on two special cases of Min-Sum Submodular Cover and find that the local search algorithm can outperform the greedy algorithm on small data sets.more » « less
- 
            We propose a novel combinatorial stochastic-greedy bandit (SGB) algorithm for combinatorial multi-armed bandit problems when no extra information other than the joint reward of the selected set of n arms at each time step t in [T] is observed. SGB adopts an optimized stochastic-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms. Unlike existing methods that explore the entire set of unselected base arms during each selection step, our SGB algorithm samples only an optimized proportion of unselected arms and selects actions from this subset. We prove that our algorithm achieves a (1-1/e)-regret bound of O(n^(1/3) k^(2/3) T^(2/3) log(T)^(2/3)) for monotone stochastic submodular rewards, which outperforms the state-of-the-art in terms of the cardinality constraint k. Furthermore, we empirically evaluate the performance of our algorithm in the context of online constrained social influence maximization. Our results demonstrate that our proposed approach consistently outperforms the other algorithms, increasing the performance gap as k grows.more » « less
- 
            We propose and analyze batch greedy heuristics for cardinality constrained maximization of non-submodular non-decreasing set functions. We consider the standard greedy paradigm, along with its distributed greedy and stochastic greedy variants. Our theoretical guarantees are characterized by the combination of submodularity and supermodularity ratios. We argue how these parameters define tight modular bounds based on incremental gains, and provide a novel reinterpretation of the classical greedy algorithm using the minorize maximize (MM) principle. Based on that analogy, we propose a new class of methods exploiting any plausible modular bound. In the context of optimal experimental design for linear Bayesian inverse problems, we bound the submodularity and supermodularity ratios when the underlying objective is based on mutual information. We also develop novel modular bounds for the mutual information in this setting, and describe certain connections to polyhedral combinatorics. We discuss how algorithms using these modular bounds relate to established statistical notions such as leverage scores and to more recent efforts such as volume sampling. We demonstrate our theoretical findings on synthetic problems and on a real-world climate monitoring example.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    