
Title: Uniform Loss Algorithms for Online Stochastic Decision-Making With Applications to Bin Packing
We consider a general class of finite-horizon online decision-making problems in which, in each period, a controller is presented with a stochastic arrival and must choose an action from a set of permissible actions, and the final objective depends only on the aggregate type-action counts. Such a framework encapsulates many online stochastic variants of common optimization problems, including bin packing, generalized assignment, and network revenue management. In such settings, we study a natural model-predictive control algorithm that, in each period, acts greedily based on an updated certainty-equivalent optimization problem. We introduce a simple yet general condition under which this algorithm obtains uniform additive loss (independent of the horizon) compared to an optimal solution with full knowledge of arrivals. Our condition is fulfilled by the above-mentioned problems, as well as by more general settings involving piecewise-linear objectives and offline index policies, including an airline overbooking problem.
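To make the re-solving idea concrete, here is a minimal sketch (not the paper's algorithm or analysis) of a certainty-equivalent, model-predictive controller on a toy single-resource, two-fare revenue management instance; the horizon, prices, probabilities, and function names below are all illustrative assumptions.

```python
import random

# Hypothetical toy instance of certainty-equivalent re-solving: a single
# resource, two fare classes, and i.i.d. arrivals. Everything below (names,
# horizon, prices, probabilities) is an illustrative assumption.

T = 1000                  # horizon length
CAPACITY = 300            # initial units of the single resource
P_HIGH = 0.25             # per-period probability of a high-fare arrival
FARES = {"high": 2.0, "low": 1.0}

def certainty_equivalent_accept(t, remaining_capacity, arrival):
    """Greedy decision implied by the certainty-equivalent (deterministic) LP.

    With one resource and two fares, the re-solved LP reduces to a protection
    level: always accept high-fare requests while capacity remains, and accept
    a low-fare request only if remaining capacity exceeds the expected number
    of future high-fare requests.
    """
    if remaining_capacity <= 0:
        return False
    if arrival == "high":
        return True
    expected_future_high = P_HIGH * (T - t - 1)
    return remaining_capacity > expected_future_high

def run_policy(seed=0):
    rng = random.Random(seed)
    capacity, revenue = CAPACITY, 0.0
    for t in range(T):
        arrival = "high" if rng.random() < P_HIGH else "low"
        if certainty_equivalent_accept(t, capacity, arrival):
            capacity -= 1
            revenue += FARES[arrival]
    return revenue

if __name__ == "__main__":
    print(f"certainty-equivalent policy revenue: {run_policy():.1f}")
```

In this two-fare special case the certainty-equivalent optimization collapses to a protection-level rule; in richer instances one would instead re-solve a linear program with expected remaining arrivals in each period and act greedily on its solution.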
Authors:
Award ID(s): 1847393, 1955997, 1839346
Publication Date:
NSF-PAR ID: 10240309
Journal Name: ACM SIGMETRICS Performance Evaluation Review
Volume: 48
Issue: 1
Page Range or eLocation-ID: 1 to 2
ISSN: 0163-5999
Sponsoring Org: National Science Foundation
More Like this
  1. We consider an online optimization problem in which the reward functions are DR-submodular, and in addition to maximizing the total reward, the sequence of decisions must satisfy some convex constraints on average. Specifically, at each round t, upon committing to an action x_t, a DR-submodular utility function f_t and a convex constraint function g_t are revealed, and the goal is to maximize the overall utility while ensuring the average of the constraint functions is non-positive (so constraints are satisfied on average). Such cumulative constraints arise naturally in applications where the average resource consumption is required to remain below a specified threshold. We study this problem under an adversarial model and a stochastic model for the convex constraints, where the functions g_t can vary arbitrarily or according to an i.i.d. process over time. We propose a single algorithm which achieves sub-linear regret (with respect to the time horizon T) as well as sub-linear constraint violation bounds in both settings, without prior knowledge of the regime. Prior works have studied this problem in the special case of linear constraint functions. Our results not only improve upon the existing bounds under linear cumulative constraints, but also give the first sub-linear bounds for general convex long-term constraints.
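A minimal primal-dual sketch of the kind of procedure described above, assuming for illustration that f_t is a smooth, monotone DR-submodular reward and g_t a linear budget constraint; the dimensions, step sizes, and update rules are assumptions, not the paper's algorithm.

```python
import numpy as np

# Minimal primal-dual sketch (not the paper's algorithm) for online maximization
# of DR-submodular rewards f_t subject to long-term convex constraints
# g_t(x) <= 0 on average. Here f_t(x) = sum_i log(1 + a_i * x_i) (monotone,
# DR-submodular) and g_t(x) = <c, x> - budget; all parameters are assumptions.

rng = np.random.default_rng(0)
d, T = 5, 2000
eta_x, eta_lam = 0.05, 0.05
budget = 1.5                        # per-round budget in the linear constraint

x = np.full(d, 0.5)                 # decision kept in the box [0, 1]^d
lam = 0.0                           # dual variable for the long-term constraint
total_reward = total_violation = 0.0

for t in range(T):
    a = rng.uniform(0.5, 1.5, d)    # reward weights revealed after committing to x
    c = rng.uniform(0.0, 1.0, d)    # cost weights of this round's constraint

    total_reward += float(np.sum(np.log1p(a * x)))
    violation = float(c @ x - budget)
    total_violation += violation

    grad_f = a / (1.0 + a * x)      # gradient of f_t at x
    grad_g = c                      # gradient of g_t at x

    # Projected gradient ascent on the Lagrangian f_t(x) - lam * g_t(x).
    x = np.clip(x + eta_x * (grad_f - lam * grad_g), 0.0, 1.0)
    # Dual ascent: accumulate constraint violation, clipped at zero.
    lam = max(0.0, lam + eta_lam * violation)

print(f"avg reward {total_reward / T:.3f}, avg constraint value {total_violation / T:.3f}")
```

The dual variable accumulates constraint violation, so rounds that overspend the budget push subsequent primal decisions back toward feasibility on average.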
  2. Minimax optimal convergence rates for numerous classes of stochastic convex optimization problems are well characterized, where the majority of results utilize iterate averaged stochastic gradient descent (SGD) with polynomially decaying step sizes. In contrast, the behavior of SGD's final iterate has received much less attention despite its widespread use in practice. Motivated by this observation, this work provides a detailed study of the following question: what rate is achievable using the final iterate of SGD for the streaming least squares regression problem, with and without strong convexity? First, this work shows that even if the time horizon T (i.e. the number of iterations that SGD is run for) is known in advance, the behavior of SGD's final iterate with any polynomially decaying learning rate scheme is highly suboptimal compared to the statistical minimax rate (by a condition number factor in the strongly convex case and a factor of √T in the non-strongly convex case). In contrast, this paper shows that Step Decay schedules, which cut the learning rate by a constant factor every constant number of epochs (i.e., the learning rate decays geometrically), offer significant improvements over any polynomially decaying step size schedule. In particular, the behavior of the final iterate with step decay schedules is off from the statistical minimax rate by only log factors (in the condition number for the strongly convex case, and in T in the non-strongly convex case). Finally, in stark contrast to the known horizon case, this paper shows that the anytime (i.e. the limiting) behavior of SGD's final iterate is poor (in that it queries iterates with highly sub-optimal function value infinitely often, i.e. in a limsup sense) irrespective of the step size scheme employed. These results demonstrate the subtlety in establishing optimal learning rate schedules (for the final iterate) for stochastic gradient procedures in fixed time horizon settings.
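As a toy illustration of the two schedule families being compared (not a reproduction of the paper's analysis or rates), the following runs SGD on streaming least squares and reports the final-iterate error under a 1/t step size and under a geometric step-decay schedule; the dimension, horizon, and constants are arbitrary assumptions.

```python
import numpy as np

# Toy streaming least-squares run (illustrative only): compare the final
# iterate of SGD under a 1/t polynomial step size against a step-decay
# schedule that halves the learning rate in each of ~log2(T) phases.

d, T = 10, 2**14
rng_model = np.random.default_rng(0)
w_star = rng_model.normal(size=d)            # ground-truth regression vector
noise_std = 0.5

def final_iterate_error(lr_schedule, seed=1):
    rng = np.random.default_rng(seed)        # same data stream for both schedules
    w = np.zeros(d)
    for t in range(1, T + 1):
        x = rng.normal(size=d)
        y = x @ w_star + noise_std * rng.normal()
        grad = (x @ w - y) * x               # stochastic gradient of 0.5*(x@w - y)^2
        w -= lr_schedule(t) * grad
    return float(np.sum((w - w_star) ** 2))  # parameter error of the final iterate

lr0 = 0.05
poly_decay = lambda t: lr0 / t                           # polynomially decaying step size
phase_len = T // int(np.log2(T))                         # ~log2(T) equal phases
step_decay = lambda t: lr0 * (0.5 ** (t // phase_len))   # geometric (step) decay

print("final-iterate error, 1/t decay  :", final_iterate_error(poly_decay))
print("final-iterate error, step decay :", final_iterate_error(step_decay))
```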
  3. Motivated by connected and automated vehicle (CAV) technologies, this paper proposes a data-driven optimization-based Model Predictive Control (MPC) modeling framework for the Cooperative Adaptive Cruise Control (CACC) of a string of CAVs under uncertain traffic conditions. The proposed data-driven optimization-based MPC modeling framework aims to improve the stability, robustness, and safety of longitudinal cooperative automated driving involving a string of CAVs under uncertain traffic conditions using Vehicle-to-Vehicle (V2V) data. Based on an online learning-based driving dynamics prediction model, we predict the uncertain driving states of the vehicles preceding the controlled CAVs. With the predicted driving states of the preceding vehicles, we solve a constrained Finite-Horizon Optimal Control problem to predict the uncertain driving states of the controlled CAVs. To obtain the optimal acceleration or deceleration commands for the CAVs under uncertainties, we formulate a Distributionally Robust Stochastic Optimization (DRSO) model (i.e. a special case of data-driven optimization models under moment bounds) with a Distributionally Robust Chance Constraint (DRCC). The predicted uncertain driving states of the immediately preceding vehicles and the controlled CAVs are utilized in the safety constraint and the reference driving states of the DRSO-DRCC model. To solve the minimax program of the DRSO-DRCC model, we reformulate its relaxed dual problem as a Semidefinite Program (SDP) based on strong duality and the semidefinite relaxation technique. In addition, we propose two methods for solving the relaxed SDP problem. We use Next Generation Simulation (NGSIM) data to demonstrate the proposed model in numerical experiments. The experimental results and analyses demonstrate that the proposed model can obtain string-stable, robust, and safe longitudinal cooperative automated driving control of CAVs with proper settings, including the driving-dynamics prediction model, prediction horizon lengths, and time headways. Computational analyses are conducted to validate the efficiency of the proposed methods for solving the DRSO-DRCC model for real-time automated driving applications within proper settings.
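For readers unfamiliar with receding-horizon control, the following bare-bones car-following loop illustrates the basic MPC pattern of re-solving a short-horizon problem at every step; it is deterministic and omits the paper's learned driving-dynamics model, distributionally robust chance constraints, and SDP reformulation, and every constant is an illustrative assumption.

```python
import itertools

# Bare-bones receding-horizon car-following loop (illustrative only). It keeps
# everything deterministic: the learned driving-dynamics prediction, the
# distributionally robust chance constraints, and the SDP reformulation from
# the paper are not reproduced. All constants are assumptions.

DT, H = 0.5, 4                     # time step [s], prediction horizon [steps]
ACTIONS = (-2.0, -1.0, 0.0, 1.0)   # candidate accelerations [m/s^2]
DESIRED_GAP, MIN_GAP = 20.0, 5.0   # target and minimum spacing [m]

def predict_leader(x, v, steps):
    """Constant-speed stand-in for the preceding vehicle's predicted trajectory."""
    return [(x + v * DT * (k + 1), v) for k in range(steps)]

def mpc_step(ego_x, ego_v, lead_x, lead_v):
    """Enumerate short acceleration sequences; return the first action of the best one."""
    leader_pred = predict_leader(lead_x, lead_v, H)
    best_cost, best_first = float("inf"), 0.0
    for seq in itertools.product(ACTIONS, repeat=H):
        x, v, cost = ego_x, ego_v, 0.0
        for k, a in enumerate(seq):
            v = max(0.0, v + a * DT)
            x += v * DT
            gap = leader_pred[k][0] - x
            if gap < MIN_GAP:               # hard safety (spacing) constraint
                cost = float("inf")
                break
            cost += (gap - DESIRED_GAP) ** 2 + 0.1 * a ** 2
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first

# Closed loop: re-solve with the newest measurements at every step.
ego_x, ego_v, lead_x, lead_v = 0.0, 10.0, 30.0, 12.0
for _ in range(40):
    a = mpc_step(ego_x, ego_v, lead_x, lead_v)
    ego_v = max(0.0, ego_v + a * DT)
    ego_x += ego_v * DT
    lead_x += lead_v * DT
print(f"final gap: {lead_x - ego_x:.1f} m")
```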