Title: Reducing Simulation Input-Model Risk via Input Model Averaging
Input uncertainty is an aspect of simulation model risk that arises when the driving input distributions are derived from, or “fit” to, real-world, historical data. Although there has been significant progress on quantifying and hedging against input uncertainty, there has been no direct attempt to reduce it via better input modeling. The meaning of “better” depends on the context and the objective: Our context is when (a) there are one or more families of parametric distributions that are plausible choices; (b) the real-world historical data are not expected to perfectly conform to any of them; and (c) our primary goal is to obtain higher-fidelity simulation output rather than to discover the “true” distribution. In this paper, we show that frequentist model averaging can be an effective way to create input models that better represent the true, unknown input distribution, thereby reducing model risk. Input model averaging builds from standard input modeling practice, is not computationally burdensome, requires no change in how the simulation is executed nor any follow-up experiments, and is available on the Comprehensive R Archive Network (CRAN). We provide theoretical and empirical support for our approach.
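To make the idea of an averaged input model concrete, the following is a minimal sketch rather than the authors' implementation or their CRAN package. It assumes two candidate families (exponential and lognormal, chosen only because their maximum-likelihood fits are closed-form), data drawn from a gamma distribution that conforms to neither, and a mixture weight picked on a grid by K-fold cross-validated log-likelihood. The function names (fit_exponential, fit_lognormal, cv_weight, sample_mixture) and the weight criterion are illustrative assumptions.

```python
import math
import random


def fit_exponential(data):
    """Closed-form MLE for an exponential; returns (density, sampler)."""
    rate = len(data) / sum(data)
    pdf = lambda x: rate * math.exp(-rate * x)
    draw = lambda rng: rng.expovariate(rate)
    return pdf, draw


def fit_lognormal(data):
    """Closed-form MLE for a lognormal; returns (density, sampler)."""
    logs = [math.log(x) for x in data]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))

    def pdf(x):
        z = (math.log(x) - mu) / sigma
        return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

    draw = lambda rng: rng.lognormvariate(mu, sigma)
    return pdf, draw


def cv_weight(data, k=5, grid=21):
    """Weight on the exponential candidate, chosen by K-fold CV log-likelihood."""
    folds = [data[i::k] for i in range(k)]
    held_out = []  # candidate densities evaluated at points they were not fit to
    for i, test_fold in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        exp_pdf, _ = fit_exponential(train)
        logn_pdf, _ = fit_lognormal(train)
        held_out.extend((exp_pdf(x), logn_pdf(x)) for x in test_fold)

    def score(w):
        # Guard against log(0) if both densities underflow at some point.
        return sum(math.log(max(w * a + (1.0 - w) * b, 1e-300)) for a, b in held_out)

    return max((g / (grid - 1) for g in range(grid)), key=score)


# Hypothetical "historical data": gamma draws, so neither candidate is exactly right.
rng = random.Random(1)
data = [rng.gammavariate(2.0, 1.0) for _ in range(200)]

w = cv_weight(data)
(exp_pdf, exp_draw), (logn_pdf, logn_draw) = fit_exponential(data), fit_lognormal(data)


def mixture_pdf(x):
    """Density of the averaged input model."""
    return w * exp_pdf(x) + (1.0 - w) * logn_pdf(x)


def sample_mixture(rng):
    """Draw one simulation input: pick a candidate by weight, then sample it."""
    return exp_draw(rng) if rng.random() < w else logn_draw(rng)


inputs = [sample_mixture(rng) for _ in range(5)]
```

Because the averaged model is just a mixture, the simulation itself is unchanged: it draws inputs from sample_mixture instead of from a single fitted family.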
Award ID(s):
1634982
PAR ID:
10201257
Journal Name:
INFORMS Journal on Computing
ISSN:
1526-5528
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This is the first paper to approach the problem of bias in the output of a stochastic simulation due to using input distributions whose parameters were estimated from real-world data. We consider, in particular, the bias in simulation-based estimators of the expected value (long-run average) of the real-world system performance; this bias will be present even if one employs unbiased estimators of the input distribution parameters due to the (typically) nonlinear relationship between these parameters and the output response. To date this bias has been assumed to be negligible because it decreases rapidly as the quantity of real-world input data increases. While true asymptotically, this property does not imply that the bias is actually small when, as is always the case, data are finite. We present a delta-method approach to bias estimation that evaluates the nonlinearity of the expected-value performance surface as a function of the input-model parameters. Since this response surface is unknown, we propose an innovative experimental design to fit a response-surface model that facilitates a test for detecting a bias of a relevant size with specified power. We evaluate the method using controlled experiments, and demonstrate it through a realistic case study concerning a healthcare call centre.
  2. N. Mustafee, N; Bae, K.-H.G.; Lazarova-Molnar, Rabe; Szabo, C; Haas, P; Son, Y-J (Ed.)
    Simple question: How sensitive is your simulation output to the variance of your simulation input models? Unfortunately, the answer is not simple because the variance of many standard parametric input distributions can achieve the same change in multiple ways as a function of the parameters. In this paper we propose a family of output-mean-with-respect-to-input-variance sensitivity measures and identify two particularly useful members of it. A further benefit of this family is that there is a straightforward estimator of any member with no additional simulation effort beyond the nominal experiment. A numerical example is provided to illustrate the method and interpretation of results. 
  3. Rather than the standard practice of selecting a single “best-fit” distribution from a candidate set, frequentist model averaging (FMA) forms a mixture distribution that is a weighted average of the candidate distributions with the weights tuned by cross-validation. In previous work we showed theoretically and empirically that FMA in the probability space leads to higher fidelity input distributions. In this paper we show that FMA can also be implemented in the quantile space, leading to fits that emphasize tail behavior. We also describe an R package for FMA that is easy to use and available for download. 
  4. Discrete-event simulation models generate random variates from input distributions and compute outputs according to the simulation logic. The input distributions are typically fitted to finite real-world data and thus are subject to estimation errors that can propagate to the simulation outputs: an issue commonly known as input uncertainty (IU). This paper investigates quantifying IU using the output confidence intervals (CIs) computed from bootstrap quantile estimators. The standard direct bootstrap method has overcoverage due to convolution of the simulation error and IU; however, the brute-force way of washing away the former is computationally demanding. We present two new bootstrap methods to enhance direct resampling in both statistical and computational efficiencies using shrinkage strategies to down-scale the variabilities encapsulated in the CIs. Our asymptotic analysis shows how both approaches produce tight CIs accounting for IU under limited input data and simulation effort along with the simulation sample-size requirements relative to the input data size. We demonstrate performances of the shrinkage strategies with several numerical experiments and investigate the conditions under which each method performs well. We also show advantages of nonparametric approaches over parametric bootstrap when the distribution family is misspecified and over metamodel approaches when the dimension of the distribution parameters is high. History: Accepted by Bruno Tuffin, Area Editor for Simulation. Funding: This work was supported by the National Science Foundation [CAREER CMMI-1834710, CAREER CMMI-2045400, DMS-1854659, and IIS-1849280]. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2022.0044 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2022.0044 ). 
The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ . 
  5. We consider a simulation-based ranking and selection (R&S) problem with input uncertainty, in which unknown input distributions can be estimated using input data arriving in batches of varying sizes over time. Each time a batch arrives, additional simulations can be run using updated input distribution estimates. The goal is to confidently identify the best design after collecting as few batches as possible. We first introduce a moving average estimator for aggregating simulation outputs generated under heterogeneous input distributions. Then, based on a sequential elimination framework, we devise two major R&S procedures by establishing exact and asymptotic confidence bands for the estimator. We also extend our procedures to the indifference zone setting, which helps save simulation effort for practical usage. Numerical results show the effectiveness and necessity of our procedures in controlling error from input uncertainty. Moreover, the efficiency can be further boosted through optimizing the “drop rate” parameter, which is the proportion of past simulation outputs to discard, of the moving average estimator.
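The delta-method argument in the first related abstract can be sketched with a second-order Taylor expansion. The notation here is assumed for illustration and is not taken from that paper: \eta(\theta) is the expected simulation response at input parameters \theta, \hat\theta is an unbiased estimator of \theta with covariance matrix \Sigma, and H(\theta) is the Hessian of \eta.

```latex
\eta(\hat\theta) \approx \eta(\theta)
  + \nabla\eta(\theta)^{\top}(\hat\theta - \theta)
  + \tfrac{1}{2}\,(\hat\theta - \theta)^{\top} H(\theta)\,(\hat\theta - \theta)
\quad\Longrightarrow\quad
\mathbb{E}\bigl[\eta(\hat\theta)\bigr] - \eta(\theta)
  \approx \tfrac{1}{2}\,\operatorname{tr}\bigl(H(\theta)\,\Sigma\bigr)
```

The linear term vanishes in expectation because \hat\theta is unbiased, so the leading bias comes from the curvature H(\theta). That is why the bias survives even with unbiased parameter estimators, and why it shrinks as more input data reduce \Sigma.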