skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Exploring the Effect of Clustering Algorithms on Sample Average Approximation
Stochastic Programming (SP) is used in disaster management, supply chain design, and other complex problems. Many of the real-world problems that SP is applied to produce large-size models. It is important but challenging that they are optimized quickly and efficiently. Existing optimization algorithms are limited in capability of solving these larger problems. Sample Average Approximation (SAA) method is a common approach for solving large scale SP problems by using the Monte Carlo simulation. This paper focuses on applying clustering algorithms to the data before the random sample is selected for the SAA algorithm. Once clustered, a sample is randomly selected from each of the clusters instead of from the entire dataset. This project looks to analyze five clustering techniques compared to each other and compared to the original SAA algorithm in order to see if clustering improves both the speed and the optimal solution of the SAA method for solving stochastic optimization problems.  more » « less
Award ID(s):
1948159
PAR ID:
10230742
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2021 Institute of Industrial and Systems Engineers (IISE) Annual Conference & Expo
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Stochastic decomposition (SD) has been a computationally effective approach to solve large-scale stochastic programming (SP) problems arising in practical applications. By using incremental sampling, this approach is designed to discover an appropriate sample size for a given SP instance, thus precluding the need for either scenario reduction or arbitrary sample sizes to create sample average approximations (SAA). When compared with the solutions obtained using the SAA procedure, SD provides solutions of similar quality in far less computational time using ordinarily available computational resources. However, previous versions of SD were not applicable to problems with randomness in second-stage cost coefficients. In this paper, we extend its capabilities by relaxing this assumption on cost coefficients in the second stage. In addition to the algorithmic enhancements necessary to achieve this, we also present the details of implementing these extensions, which preserve the computational edge of SD. Finally, we illustrate the computational results obtained from the latest implementation of SD on a variety of test instances generated for problems from the literature. We compare these results with those obtained from the regularized L-shaped method applied to the SAA function of these problems with different sample sizes. 
    more » « less
  2. null (Ed.)
    Managing large-scale systems often involves simultaneously solving thousands of unrelated stochastic optimization problems, each with limited data. Intuition suggests that one can decouple these unrelated problems and solve them separately without loss of generality. We propose a novel data-pooling algorithm called Shrunken-SAA that disproves this intuition. In particular, we prove that combining data across problems can outperform decoupling, even when there is no a priori structure linking the problems and data are drawn independently. Our approach does not require strong distributional assumptions and applies to constrained, possibly nonconvex, nonsmooth optimization problems such as vehicle-routing, economic lot-sizing, or facility location. We compare and contrast our results to a similar phenomenon in statistics (Stein’s phenomenon), highlighting unique features that arise in the optimization setting that are not present in estimation. We further prove that, as the number of problems grows large, Shrunken-SAA learns if pooling can improve upon decoupling and the optimal amount to pool, even if the average amount of data per problem is fixed and bounded. Importantly, we highlight a simple intuition based on stability that highlights when and why data pooling offers a benefit, elucidating this perhaps surprising phenomenon. This intuition further suggests that data pooling offers the most benefits when there are many problems, each of which has a small amount of relevant data. Finally, we demonstrate the practical benefits of data pooling using real data from a chain of retail drug stores in the context of inventory management. This paper was accepted by Chung Piaw Teo, Special Issue on Data-Driven Prescriptive Analytics. 
    more » « less
  3. Decision-making in multi-player games can be extremely challenging, particularly under uncertainty. In this work, we propose a new sample-based approximation to a class of stochastic, general-sum, pure Nash games, where each player has an expected-value objective and a set of chance constraints. This new approximation scheme inherits the accuracy of objective approximation from the established sample average approximation (SAA) method and enjoys a feasibility guarantee derived from the scenario optimization literature. We characterize the sample complexity of this new game-theoretic approximation scheme, and observe that high accuracy usually requires a large number of samples, which results in a large number of sampled constraints. To accommodate this, we decompose the approximated game into a set of smaller games with few constraints for each sampled scenario, and propose a decentralized, consensus-based ADMM algorithm to efficiently compute a generalized Nash equilibrium (GNE) of the approximated game. We prove the convergence of our algorithm to a GNE and empirically demonstrate superior performance relative to a recent baseline algorithm based on ADMM and interior point method. 
    more » « less
  4. Safe control designs for robotic systems remain challenging because of the difficulties of explicitly solving optimal control with nonlinear dynamics perturbed by stochastic noise. However, recent technological advances in computing devices enable online optimization or sampling-based methods to solve control problems. For example, Control Barrier Functions (CBFs) have been proposed to numerically solve convex optimization problems that ensure the control input to stay in the safe set. Model Predictive Path Integral (MPPI) control uses forward sampling of stochastic differential equations to solve optimal control problems online. Both control algorithms are widely used for nonlinear systems because they avoid calculating the derivatives of the nonlinear dynamic functions. In this paper, we use Stochastic Control Barrier Functions (SCBFs) constraints to limit sample regions in the samplingbased algorithm, ensuring safety in a probabilistic sense and improving sample efficiency with a stochastic differential equation. We also show that our algorithm needs fewer samples than the original MPPI algorithm does by providing a sampling complexity analysis. 
    more » « less
  5. N/A (Ed.)
    Abstract This work unifies the analysis of various randomized methods for solving linear and nonlinear inverse problems with Gaussian priors by framing the problem in a stochastic optimization setting. By doing so, we show that many randomized methods are variants of a sample average approximation (SAA). More importantly, we are able to prove a single theoretical result that guarantees the asymptotic convergence for a variety of randomized methods. Additionally, viewing randomized methods as an SAA enables us to prove, for the first time, a single non-asymptotic error result that holds for randomized methods under consideration. Another important consequence of our unified framework is that it allows us to discover new randomization methods. We present various numerical results for linear, nonlinear, algebraic, and PDE-constrained inverse problems that verify the theoretical convergence results and provide a discussion on the apparently different convergence rates and the behavior for various randomized methods. 
    more » « less