Title: Online Convex Optimization with Stochastic Constraints
This paper considers online convex optimization (OCO) with stochastic constraints, which generalizes Zinkevich's OCO over a known simple fixed set by introducing multiple stochastic functional constraints that are i.i.d. generated at each round and are disclosed to the decision maker only after the decision is made. This formulation arises naturally when decisions are restricted by stochastic environments or by deterministic environments with noisy observations. It also includes many important problems as special cases, such as OCO with long-term constraints, stochastic constrained convex optimization, and deterministic constrained convex optimization. To solve this problem, this paper proposes a new algorithm that achieves O(√T) expected regret and constraint violations and O(√T log(T)) high-probability regret and constraint violations. Experiments on a real-world data center scheduling problem further verify the performance of the new algorithm.
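For intuition, the drift-plus-penalty / virtual-queue idea behind algorithms of this type can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the projection set, step size, and queue dynamics here are simplifying assumptions, and the known simple set X is stood in for by a Euclidean ball.

import numpy as np

def project_ball(x, radius=1.0):
    # Euclidean projection onto a ball; a stand-in for the known simple set X.
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def virtual_queue_step(x, Q, grad_f, g_vals, grad_g, eta=0.1):
    # One round: descend on the revealed loss gradient plus queue-weighted
    # constraint gradients (the queues act as running Lagrange multipliers),
    # then accumulate the observed constraint violations in the queues.
    direction = grad_f + grad_g.T @ Q
    x_next = project_ball(x - eta * direction)
    Q_next = np.maximum(0.0, Q + g_vals)   # queues never go negative
    return x_next, Q_next

Large queue values push subsequent decisions back toward feasibility, which is how regret and cumulative constraint violations can both be kept at O(√T).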
Award ID(s):
1718477
PAR ID:
10074981
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper considers online convex optimization over a complicated constraint set, which typically consists of multiple functional constraints and a set constraint. The conventional online projection algorithm (Zinkevich, 2003) can be difficult to implement due to the potentially high computational complexity of the projection operation. In this paper, we relax the functional constraints by allowing them to be violated at each round but still requiring them to be satisfied in the long term. This type of relaxed online convex optimization (with long-term constraints) was first considered in Mahdavi et al. (2012). That prior work proposes an algorithm to achieve O(sqrt(T)) regret and O(T^(3/4)) constraint violations for general problems, and another algorithm to achieve an O(T^(2/3)) bound for both regret and constraint violations when the constraint set can be described by a finite number of linear constraints. A recent extension in Jenatton et al. (2016) can achieve O(T^(max(theta, 1-theta))) regret and O(T^(1-theta/2)) constraint violations, where theta is in (0,1). The current paper proposes a new simple algorithm that yields improved performance in comparison to prior works. The new algorithm achieves an O(sqrt(T)) regret bound with O(1) constraint violations.
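    For reference, the two quantities being traded off are standardly defined as follows (textbook definitions, with g_i the functional constraints; the notation is not taken verbatim from the paper):

    \mathrm{Regret}_T \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{X},\; g_i(x) \le 0 \;\forall i} \; \sum_{t=1}^{T} f_t(x),
    \qquad
    \mathrm{Violation}_T(i) \;=\; \sum_{t=1}^{T} g_i(x_t).

    The result above says both can be controlled simultaneously: Regret_T = O(sqrt(T)) while every Violation_T(i) stays O(1).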
  2. We study the problem of safe online convex optimization, where the action at each time step must satisfy a set of linear safety constraints. The goal is to select a sequence of actions that minimizes the regret without violating the safety constraints at any time step (with high probability). The parameters that specify the linear safety constraints are unknown to the algorithm, which has access only to noisy observations of the constraints for the chosen actions. We propose an algorithm, called the Safe Online Projected Gradient Descent (SO-PGD) algorithm, to address this problem. We show that, under the assumption that a safe baseline action is available, the SO-PGD algorithm achieves a regret of O(T^(2/3)). While many algorithms for online convex optimization (OCO) problems with safety constraints are available in the literature, they allow constraint violations during learning/optimization, and the focus has been on characterizing the cumulative constraint violations. To the best of our knowledge, ours is the first work that provides an algorithm with provable guarantees on the regret without violating the linear safety constraints (with high probability) at any time step.
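    This is not the SO-PGD algorithm itself, but the core idea (take gradient steps inside a conservatively tightened estimate of the safe set) can be sketched for a single linear constraint a·x <= b, where the projection is closed-form. The estimate a_hat and the confidence margin are assumptions supplied by the caller:

    import numpy as np

    def project_to_halfspace(x, a, b):
        # Closed-form Euclidean projection onto the halfspace {x : a.x <= b}.
        slack = a @ x - b
        return x if slack <= 0 else x - (slack / (a @ a)) * a

    def safe_pgd_step(x, grad_f, a_hat, b, margin, eta=0.05):
        # Gradient step, then projection onto the *tightened* estimated
        # halfspace a_hat.x <= b - margin, so the true (unknown) constraint
        # still holds with high probability despite estimation error in a_hat.
        return project_to_halfspace(x - eta * grad_f, a_hat, b - margin)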
  3. This paper considers online convex optimization (OCO) problems where decisions are constrained by available energy resources. A key scenario is optimal power control for an energy harvesting device with a finite-capacity battery. The goal is to minimize a time-average loss function while keeping the energy used less than what is available. In this setup, the distribution of the randomly arriving harvestable energy (which is assumed to be i.i.d.) is unknown, the current loss function is unknown, and the controller is informed only by the history of past observations. A prior algorithm is known to achieve O(√T) regret by using a battery with O(√T) capacity. This paper develops a new algorithm that maintains this asymptotic trade-off with the number of time steps T while improving the dependency on the dimension of the decision vector from O(√n) to O(√log(n)). The proposed algorithm introduces a separation of the decision vector into amplitude and direction components. It uses two distinct types of Bregman divergence, together with energy queue information, to make decisions for each component.
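    The improvement from O(√n) to O(√log(n)) is characteristic of entropic (negative-entropy) Bregman regularization. As a purely illustrative sketch, and not the paper's actual two-component update, a mirror-descent step for a direction-like component on the probability simplex looks like this:

    import numpy as np

    def entropic_mirror_step(w, grad, eta=0.1):
        # Exponentiated-gradient update: mirror descent with the
        # negative-entropy Bregman divergence on the simplex; its regret
        # scales with sqrt(log(n)) rather than sqrt(n).
        w_next = w * np.exp(-eta * grad)
        return w_next / w_next.sum()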
  4. We consider the problem of unconstrained online convex optimization (OCO) with sub-exponential noise, a strictly more general problem than standard OCO. In this setting, the learner receives a subgradient of the loss functions corrupted by sub-exponential noise and strives to achieve an optimal regret guarantee without knowledge of the competitor norm, i.e., in a parameter-free way. Recently, Cutkosky and Boahen (COLT 2017) proved that, given unbounded subgradients, it is impossible to guarantee sublinear regret due to an exponential penalty. This paper shows that it is possible to circumvent the lower bound by allowing the observed subgradients to be unbounded via stochastic noise. However, the presence of unbounded noise in unconstrained OCO is challenging; existing algorithms do not provide near-optimal regret bounds or fail to have a guarantee. So, we design a novel parameter-free OCO algorithm for Banach spaces, which we call BANCO, via a reduction to betting on noisy coins. We show that BANCO achieves the optimal regret rate in our problem. Finally, we show the application of our results to obtain a parameter-free, locally private stochastic subgradient descent algorithm, and the connection to the law of the iterated logarithm.
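    The flavor of reductions from online learning to coin betting, which BANCO builds on, can be seen in the classical Krichevsky-Trofimov bettor below. This is a sketch of the standard noiseless strategy, not BANCO itself:

    class KTBettor:
        # One-dimensional parameter-free learner via coin betting: each round
        # it bets a fraction of its wealth on the next 'coin' c_t in [-1, 1]
        # (e.g., a negated subgradient); no learning rate or competitor-norm
        # knowledge is required.
        def __init__(self, initial_wealth=1.0):
            self.wealth = initial_wealth
            self.coin_sum = 0.0   # running sum of observed coins
            self.t = 0            # number of coins seen so far

        def bet(self):
            beta = self.coin_sum / (self.t + 1)   # Krichevsky-Trofimov fraction
            return beta * self.wealth             # the prediction/bet x_t

        def update(self, coin):
            x = self.bet()                 # the bet placed this round
            self.wealth += coin * x        # win or lose proportionally
            self.coin_sum += coin
            self.t += 1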