Title: Data-Driven Ranking and Selection Under Input Uncertainty
We consider a simulation-based ranking and selection (R&S) problem with input uncertainty, in which unknown input distributions can be estimated using input data arriving in batches of varying sizes over time. Each time a batch arrives, additional simulations can be run using updated input distribution estimates. The goal is to confidently identify the best design after collecting as few batches as possible. We first introduce a moving average estimator for aggregating simulation outputs generated under heterogeneous input distributions. Then, based on a sequential elimination framework, we devise two major R&S procedures by establishing exact and asymptotic confidence bands for the estimator. We also extend our procedures to the indifference zone setting, which saves simulation effort in practice. Numerical results show the effectiveness and necessity of our procedures in controlling error from input uncertainty. Moreover, efficiency can be further boosted by optimizing the moving average estimator's "drop rate" parameter, i.e., the proportion of past simulation outputs to discard.
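As a rough illustration of the estimator described in the abstract, the sketch below aggregates outputs across batches and discards the oldest fraction given by a drop rate. The function name, the flat (unweighted) averaging, and the drop-rate parameterization are assumptions made for illustration; the paper's exact estimator and its confidence bands are not reproduced here.

```python
# A minimal sketch of a moving average estimator with a "drop rate",
# assuming a flat average over the retained outputs (an illustration,
# not the paper's exact weighting scheme).
import numpy as np

def moving_average_estimate(batched_outputs, drop_rate=0.5):
    """Aggregate simulation outputs across batches, discarding the oldest
    fraction `drop_rate` of outputs before averaging.

    batched_outputs: list of 1-D arrays, one per input-data batch, each
        holding outputs simulated under that batch's input estimate.
    """
    outputs = np.concatenate(batched_outputs)   # chronological order
    n_drop = int(drop_rate * len(outputs))      # oldest outputs to discard
    return outputs[n_drop:].mean()              # average the most recent

# Example: three batches of outputs for one design, where the input
# estimates (and hence the output means) drift as data accumulate.
rng = np.random.default_rng(0)
batches = [rng.normal(1.0 + 0.1 * b, 1.0, size=50) for b in range(3)]
print(moving_average_estimate(batches, drop_rate=0.3))
```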
Award ID(s):
2053489
PAR ID:
10421918
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Operations Research
ISSN:
0030-364X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. Uncertainty analysis in the outcomes of model predictions is a key element of decision-based material design, used to establish confidence in models and evaluate their fidelity. Uncertainty propagation (UP) is a technique for determining model output uncertainties from the uncertainty in the model's input variables. The most common and simplest approach to propagating uncertainty from a model's inputs to its outputs is to feed a large number of samples to the model, known as Monte Carlo (MC) simulation, which requires exhaustive sampling from the input variable distributions. However, MC simulations are impractical when models are computationally expensive. In this work, we investigate the hypothesis that while all samples are useful on average, some samples must be more useful than others. Thus, reordering MC samples so that the more useful ones are propagated first can make statistics of interest converge earlier, reducing the computational burden of the UP process. Here, we introduce a methodology to adaptively reorder MC samples and show how it reduces the computational expense of UP.
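As a toy illustration of the reordering idea above, the sketch below compares a plain MC ordering with a stratified round-robin ordering, so that every short prefix of the reordered stream covers the input distribution's deciles. The quadratic toy model and the decile-based heuristic are illustrative assumptions, not the paper's adaptive criterion.

```python
# Toy Monte Carlo uncertainty propagation: compare the running mean under
# the original sample order vs. a stratified round-robin reordering.
import numpy as np

def model(x):
    # Stand-in for an expensive simulation model.
    return x ** 2 + 0.5 * x

rng = np.random.default_rng(1)
samples = rng.normal(0.0, 1.0, size=1000)   # true output mean is 1.0

# Reorder: sort, split into 10 decile strata, then visit strata round-robin,
# so each block of 10 consecutive samples spans the whole input range.
strata = np.array_split(np.sort(samples), 10)
reordered = np.ravel(np.column_stack(strata))

for label, xs in [("original order", samples), ("stratified order", reordered)]:
    running_mean = np.cumsum(model(xs)) / np.arange(1, len(xs) + 1)
    print(f"{label}: mean after 50 samples = {running_mean[49]:.3f}")
```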
  2. Feng, B.; Pedrielli, G.; Peng, Y.; Shashaani, S.; Song, E.; Corlu, C.; Lee, L.; Chew, E.; Roeder, T.; Lendermann, P. (Eds.)
    Ranking & selection (R&S) procedures are simulation-optimization algorithms for making one-time decisions among a finite set of alternative system designs or feasible solutions with a statistical assurance of a good selection. R&S with covariates (R&S+C) extends the paradigm to allow the optimal selection to depend on contextual information that is obtained just prior to the need for a decision. The dominant approach for solving such problems is to employ offline simulation to create metamodels that predict the performance of each system or feasible solution as a function of the covariate. This paper introduces a fundamentally different approach that solves individual R&S problems offline for various values of the covariate, and then treats the real-time decision as a classification problem: given the covariate information, which system is a good solution? Our approach exploits the availability of efficient R&S procedures, requires milder assumptions than the metamodeling paradigm to provide strong guarantees, and can be more efficient. 
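A minimal sketch of the offline-plus-classification idea described above: solve a selection problem (here by brute-force simulation) at a grid of covariate values offline, then fit a classifier that maps an incoming covariate to a recommended system. The toy simulation model, the grid, and the use of scikit-learn's k-NN classifier are all illustrative assumptions, not the paper's procedure.

```python
# Offline phase: pick the best system at each covariate grid point.
# Online phase: classify a new covariate value to a recommended system.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)

def simulate(system, covariate, n_reps=200):
    # Toy performance model: the true mean depends on system and covariate.
    true_mean = (system + 1) * covariate - system ** 2
    return true_mean + rng.normal(0, 1, size=n_reps)

# Offline: select the best of 3 systems at each covariate grid point.
grid = np.linspace(0, 4, 41)
labels = [int(np.argmax([simulate(s, c).mean() for s in range(3)]))
          for c in grid]

clf = KNeighborsClassifier(n_neighbors=3).fit(grid.reshape(-1, 1), labels)

# Online: given the covariate observed just before the decision,
# return a recommended system with no further simulation.
print("recommended system for covariate 2.7:", clf.predict([[2.7]])[0])
```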
  3. Input uncertainty is an aspect of simulation model risk that arises when the driving input distributions are derived or “fit” to real-world, historical data. Although there has been significant progress on quantifying and hedging against input uncertainty, there has been no direct attempt to reduce it via better input modeling. The meaning of “better” depends on the context and the objective: Our context is when (a) there are one or more families of parametric distributions that are plausible choices; (b) the real-world historical data are not expected to perfectly conform to any of them; and (c) our primary goal is to obtain higher-fidelity simulation output rather than to discover the “true” distribution. In this paper, we show that frequentist model averaging can be an effective way to create input models that better represent the true, unknown input distribution, thereby reducing model risk. Input model averaging builds on standard input modeling practice, is not computationally burdensome, requires no change in how the simulation is executed and no follow-up experiments, and is available on the Comprehensive R Archive Network (CRAN). We provide theoretical and empirical support for our approach.
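A rough sketch of the input model averaging idea, under stated assumptions: fit several candidate parametric families to the same data, weight the fits, and generate input variates from the weighted mixture. The AIC-based weights below are a simple stand-in, not the frequentist model-averaging weights developed in the paper, and the choice of gamma and lognormal families is arbitrary.

```python
# Fit multiple parametric families to the same input data and sample from
# a weighted mixture of the fits. AIC weights are a stand-in for the
# paper's frequentist model-averaging weights.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.weibull(1.5, size=200) * 2.0   # "real-world" data, family unknown

families = [stats.gamma, stats.lognorm]
fits, aics = [], []
for fam in families:
    params = fam.fit(data, floc=0)                # MLE fit, location fixed
    ll = fam.logpdf(data, *params).sum()
    fits.append((fam, params))
    aics.append(2 * len(params) - 2 * ll)         # rough AIC (constant k shift
                                                  # cancels across families)
aics = np.array(aics)
weights = np.exp(-(aics - aics.min()) / 2)
weights /= weights.sum()

# Generate an input variate from the averaged model: pick a family by
# weight, then sample from its fitted distribution.
fam, params = fits[rng.choice(len(fits), p=weights)]
print("averaged-model variate:", fam.rvs(*params, random_state=rng))
```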
  4. Discrete-event simulation models generate random variates from input distributions and compute outputs according to the simulation logic. The input distributions are typically fitted to finite real-world data and thus are subject to estimation errors that can propagate to the simulation outputs: an issue commonly known as input uncertainty (IU). This paper investigates quantifying IU using the output confidence intervals (CIs) computed from bootstrap quantile estimators. The standard direct bootstrap method has overcoverage due to convolution of the simulation error and IU; however, the brute-force way of washing away the former is computationally demanding. We present two new bootstrap methods to enhance direct resampling in both statistical and computational efficiencies using shrinkage strategies to down-scale the variabilities encapsulated in the CIs. Our asymptotic analysis shows how both approaches produce tight CIs accounting for IU under limited input data and simulation effort, along with the simulation sample-size requirements relative to the input data size. We demonstrate the performance of the shrinkage strategies with several numerical experiments and investigate the conditions under which each method performs well. We also show advantages of nonparametric approaches over the parametric bootstrap when the distribution family is misspecified and over metamodel approaches when the dimension of the distribution parameters is high. History: Accepted by Bruno Tuffin, Area Editor for Simulation. Funding: This work was supported by the National Science Foundation [CAREER CMMI-1834710, CAREER CMMI-2045400, DMS-1854659, and IIS-1849280]. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2022.0044 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2022.0044 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
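The plain "direct" bootstrap that the paper starts from can be sketched as follows: resample the input data, re-estimate the input distribution, re-run the (noisy) simulation, and take percentiles of the bootstrap outputs. The exponential input model and the sample sizes are illustrative assumptions; the shrinkage corrections the paper proposes to fix this CI's overcoverage are not shown.

```python
# Direct bootstrap CI for input uncertainty: resample input data, refit
# the input model, re-simulate, and take output quantiles. This plain
# version overcovers because simulation noise is convolved with IU.
import numpy as np

rng = np.random.default_rng(4)
input_data = rng.exponential(2.0, size=100)   # finite real-world data

def simulate_mean(scale, n_reps=100):
    # Toy simulation output driven by the fitted input distribution.
    return rng.exponential(scale, size=n_reps).mean()

B = 500
boot_outputs = []
for _ in range(B):
    resample = rng.choice(input_data, size=len(input_data), replace=True)
    scale_hat = resample.mean()               # re-fit the input model
    boot_outputs.append(simulate_mean(scale_hat))

lo, hi = np.percentile(boot_outputs, [2.5, 97.5])
print(f"95% bootstrap CI for the output mean: [{lo:.3f}, {hi:.3f}]")
```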
  5. Sequential ranking-and-selection procedures deliver Bayesian guarantees by repeatedly computing a posterior quantity of interest—for example, the posterior probability of good selection or the posterior expected opportunity cost—and terminating when this quantity crosses some threshold. Computing these posterior quantities entails nontrivial numerical computation. Thus, rather than exactly check such posterior-based stopping rules, it is common practice to use cheaply computable bounds on the posterior quantity of interest, for example, those based on Bonferroni’s or Slepian’s inequalities. The result is a conservative procedure that samples more simulation replications than are necessary. We explore how the time spent simulating these additional replications might be better spent computing the posterior quantity of interest via numerical integration, with the potential for terminating the procedure sooner. To this end, we develop several methods for improving the computational efficiency of exactly checking the stopping rules. Simulation experiments demonstrate that the proposed methods can, in some instances, significantly reduce a procedure’s total sample size. We further show these savings can be attained with little added computational effort by making effective use of a Monte Carlo estimate of the posterior quantity of interest. Summary of Contribution: The widespread use of commercial simulation software in industry has made ranking-and-selection (R&S) algorithms an accessible simulation-optimization tool for operations research practitioners. This paper addresses computational aspects of R&S procedures delivering finite-time Bayesian statistical guarantees, primarily the decision of when to terminate sampling. Checking stopping rules entails computing or approximating posterior quantities of interest perceived as being computationally intensive to evaluate. The main results of this paper show that these quantities can be efficiently computed via numerical integration and can yield substantial savings in sampling relative to the prevailing approach of using conservative bounds. In addition to enhancing the performance of Bayesian R&S procedures, the results have the potential to advance other research in this space, including the development of more efficient allocation rules. 
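As an illustration of the gap between a conservative bound and the exact posterior quantity discussed above, the sketch below compares a Bonferroni-style lower bound on the posterior probability of good selection with a plain Monte Carlo estimate, assuming independent normal posteriors with known variances (an assumption made purely for illustration).

```python
# Compare a Bonferroni-style lower bound on the posterior probability of
# good selection (PGS) with a Monte Carlo estimate of the exact quantity.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Posterior means/std devs of 4 systems' performances (system 0 leads).
post_mean = np.array([1.00, 0.95, 0.93, 0.92])
post_std = np.array([0.05, 0.06, 0.05, 0.07])

# Bonferroni bound: 1 minus the sum of pairwise probabilities that each
# competitor beats the current leader.
pairwise = stats.norm.cdf(
    (post_mean[1:] - post_mean[0])
    / np.sqrt(post_std[1:] ** 2 + post_std[0] ** 2)
)
bonferroni_pgs = 1 - pairwise.sum()

# Monte Carlo estimate of the exact posterior probability that system 0
# is truly the best.
draws = rng.normal(post_mean, post_std, size=(100_000, 4))
mc_pgs = (draws.argmax(axis=1) == 0).mean()

print(f"Bonferroni bound: {bonferroni_pgs:.4f}, MC estimate: {mc_pgs:.4f}")
```

Because the bound ignores the joint behavior of the competitors, it sits below the Monte Carlo estimate; a procedure that stops only when the bound crosses the threshold therefore samples longer than one that checks the exact posterior quantity, which is the abstract's central point.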