Title: Estimating Sensitivity to Input Model Variance
Simple question: How sensitive is your simulation output to the variance of your simulation input models? Unfortunately, the answer is not simple because the variance of many standard parametric input distributions can achieve the same change in multiple ways as a function of the parameters. In this paper we propose a family of output-mean-with-respect-to-input-variance sensitivity measures and identify two particularly useful members of it. A further benefit of this family is that there is a straightforward estimator of any member with no additional simulation effort beyond the nominal experiment. A numerical example is provided to illustrate the method and interpretation of results.
Award ID(s):
1634982
PAR ID:
10201253
Author(s) / Creator(s):
; ;
Editor(s):
Mustafee, N; Bae, K.-H.G.; Lazarova-Molnar, S; Rabe, M; Szabo, C; Haas, P; Son, Y-J
Date Published:
Journal Name:
Proceedings of the 2019 Winter Simulation Conference
Page Range / eLocation ID:
3705-3716
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In stochastic simulation, input uncertainty refers to the output variability arising from the statistical noise in specifying the input models. This uncertainty can be measured by a variance contribution in the output, which, in the nonparametric setting, is commonly estimated via the bootstrap. However, due to the convolution of the simulation noise and the input noise, the bootstrap consists of a two-layer sampling and typically requires substantial simulation effort. This paper investigates a subsampling framework to reduce the required effort, by leveraging the form of the variance and its estimation error in terms of the data size and the sampling requirement in each layer. We show how the total required effort can be reduced from an order bigger than the data size in the conventional approach to an order independent of the data size in subsampling. We explicitly identify the procedural specifications in our framework that guarantee relative consistency in the estimation and the corresponding optimal simulation budget allocations. We substantiate our theoretical results with numerical examples. 
  2. This dataset holds 1036 ternary phase diagrams and records how points on each diagram phase separate, if they do. The data is provided as a serialized object using the `pickle` Python module and was compiled using Python version 3.8.
     References: The specific applications and analyses of the data are described in: 1. Dhamankar, S.; Jiang, S.; Webb, M.A. "Accelerating Multicomponent Phase-Coexistence Calculations with Physics-informed Neural Networks"
     Usage: To access the data in the .pickle file, users can execute the following:
         import os
         import pickle
         # LOAD SIMULATION DATA
         DATA_DIR = "your/custom/dir/"
         filename = os.path.join(DATA_DIR, "data_clean.pickle")
         with open(filename, "rb") as handle:
             (x, y_c, y_r, phase_idx, num_phase, max_phase) = pickle.load(handle)
     x: Input x = (χ_AB, χ_BC, χ_AC, v_A, v_B, v_C, φ_A, φ_B) ∈ ℝ^8.
     y_c: Output one-hot encoded classification vector y_c ∈ ℝ^3.
     y_r: Output equilibrium composition and abundance vector y_r = (φ_A^α, φ_B^α, φ_A^β, φ_B^β, φ_A^γ, φ_B^γ, w^α, w^β, w^γ) ∈ ℝ^9.
     phase_idx: A single integer indicating which unique phase system it belongs to.
     num_phase: A single integer indicating the number of equilibrium phases the input splits into.
     max_phase: A single integer indicating the maximum number of equilibrium phases the system splits into.
     Help, Suggestions, Corrections? If you need help, have suggestions, identify issues, or have corrections, please send your comments to Shengli Jiang at sj0161@princeton.edu
     GitHub: Additional data and code relevant to this study are accessible at https://github.com/webbtheosim/ml-ternary-phase
  3. We study the problem of classifier derandomization in machine learning: given a stochastic binary classifier f:X→[0,1], sample a deterministic classifier f̂ :X→{0,1} that approximates the output of f in aggregate over any data distribution. Recent work revealed how to efficiently derandomize a stochastic classifier with strong output approximation guarantees, but at the cost of individual fairness -- that is, if f treated similar inputs similarly, f̂ did not. In this paper, we initiate a systematic study of classifier derandomization with metric fairness guarantees. We show that the prior derandomization approach is almost maximally metric-unfair, and that a simple "random threshold" derandomization achieves optimal fairness preservation but with weaker output approximation. We then devise a derandomization procedure that provides an appealing tradeoff between these two: if f is α-metric fair according to a metric d with a locality-sensitive hash (LSH) family, then our derandomized f̂ is, with high probability, O(α)-metric fair and a close approximation of f. We also prove generic results applicable to all (fair and unfair) classifier derandomization procedures, including a bias-variance decomposition and reductions between various notions of metric fairness.
  4. Numerical simulations have revolutionized material design. However, although simulations excel at mapping an input material to its output property, their direct application to inverse design has traditionally been limited by their high computing cost and lack of differentiability. Here, taking the example of the inverse design of a porous matrix featuring a targeted sorption isotherm, we introduce a computational inverse design framework that addresses these challenges by programming a differentiable simulation on the TensorFlow platform, which leverages automated end-to-end differentiation. Thanks to its differentiability, the simulation is used to directly train a deep generative model, which outputs an optimal porous matrix based on an arbitrary input sorption isotherm curve. Importantly, this inverse design pipeline leverages the power of tensor processing units (TPUs), an emerging family of dedicated chips that, although specialized for deep learning, are flexible enough for intensive scientific simulations. This approach holds promise to accelerate inverse materials design.
  5. Discrete-event simulation models generate random variates from input distributions and compute outputs according to the simulation logic. The input distributions are typically fitted to finite real-world data and thus are subject to estimation errors that can propagate to the simulation outputs: an issue commonly known as input uncertainty (IU). This paper investigates quantifying IU using the output confidence intervals (CIs) computed from bootstrap quantile estimators. The standard direct bootstrap method has overcoverage due to convolution of the simulation error and IU; however, the brute-force way of washing away the former is computationally demanding. We present two new bootstrap methods to enhance direct resampling in both statistical and computational efficiencies using shrinkage strategies to down-scale the variabilities encapsulated in the CIs. Our asymptotic analysis shows how both approaches produce tight CIs accounting for IU under limited input data and simulation effort along with the simulation sample-size requirements relative to the input data size. We demonstrate performances of the shrinkage strategies with several numerical experiments and investigate the conditions under which each method performs well. We also show advantages of nonparametric approaches over parametric bootstrap when the distribution family is misspecified and over metamodel approaches when the dimension of the distribution parameters is high. History: Accepted by Bruno Tuffin, Area Editor for Simulation. Funding: This work was supported by the National Science Foundation [CAREER CMMI-1834710, CAREER CMMI-2045400, DMS-1854659, and IIS-1849280]. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2022.0044 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2022.0044 ). 
The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ . 