Title: Bias-corrected Estimation of the Density of a Conditional Expectation in Nested Simulation Problems
Many two-level nested simulation applications involve the conditional expectation of some response variable, where the expected response is the quantity of interest, and the expectation is with respect to the inner-level random variables, conditioned on the outer-level random variables. The latter typically represent random risk factors, and risk can be quantified by estimating the probability density function (pdf) or cumulative distribution function (cdf) of the conditional expectation. Much prior work has considered a naïve estimator that uses the empirical distribution of the sample averages across the inner-level replicates. This results in a biased estimator, because the distribution of the sample averages is over-dispersed relative to the distribution of the conditional expectation when the number of inner-level replicates is finite. Whereas most prior work has focused on allocating the numbers of outer- and inner-level replicates to balance the bias/variance tradeoff, we develop a bias-corrected pdf estimator. Our approach is based on the concept of density deconvolution, which is widely used to estimate densities with noisy observations but has not previously been considered for nested simulation problems. For a fixed computational budget, the bias-corrected deconvolution estimator allows more outer-level and fewer inner-level replicates to be used, which substantially improves the efficiency of the nested simulation.
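The over-dispersion described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's estimator and does not perform deconvolution. It assumes a hypothetical toy model in which the outer-level risk factor X is standard normal, the conditional expectation is E[Y | X] = X, and the inner-level noise is normal with standard deviation 2. The function name `nested_sample_means` and all parameter values are illustrative assumptions. Under this model the variance of the conditional expectation is 1, while the variance of the inner-level sample averages is 1 + 4/n for n inner-level replicates.

```python
import random
import statistics


def nested_sample_means(n_outer, n_inner, seed=0):
    """Run a toy two-level nested simulation (hypothetical model).

    Returns the true conditional expectations and the naive
    inner-level sample averages, one pair per outer-level scenario.
    """
    rng = random.Random(seed)
    cond_means = []
    sample_means = []
    for _ in range(n_outer):
        # Outer level: draw the risk factor. In this toy model
        # the conditional expectation E[Y | X = x] equals x.
        x = rng.gauss(0.0, 1.0)
        # Inner level: noisy replicates of the response given x.
        inner = [x + rng.gauss(0.0, 2.0) for _ in range(n_inner)]
        cond_means.append(x)
        sample_means.append(statistics.fmean(inner))
    return cond_means, sample_means


cond_means, sample_means = nested_sample_means(n_outer=5000, n_inner=4)

# Variance of the quantity of interest (theoretical value: 1).
var_true = statistics.pvariance(cond_means)
# Variance of the naive sample averages (theoretical value: 1 + 4/4 = 2).
var_naive = statistics.pvariance(sample_means)

print(f"variance of conditional expectation: {var_true:.3f}")
print(f"variance of inner-level averages:    {var_naive:.3f}")
```

With only 4 inner-level replicates, the sample averages are spread out roughly twice as much (in variance) as the conditional expectation itself, which is why an empirical distribution built from them is biased. Increasing `n_inner` shrinks the excess variance 4/n, at the cost of fewer outer-level scenarios for a fixed budget.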
Award ID(s):
1814840
NSF-PAR ID:
10336403
Journal Name:
ACM Transactions on Modeling and Computer Simulation
Volume:
31
Issue:
4
Page Range or eLocation-ID:
1 to 36
ISSN:
1049-3301
Sponsoring Org:
National Science Foundation
More Like this
  1. In a chance constrained program (CCP), decision makers seek the best decision whose probability of violating the uncertainty constraints is within the prespecified risk level. As a CCP is often nonconvex and is difficult to solve to optimality, much effort has been devoted to developing convex inner approximations for a CCP, among which the conditional value-at-risk (CVaR) has been known to be the best for more than a decade. This paper studies and generalizes the ALSO-X, originally proposed by Ahmed, Luedtke, SOng, and Xie in 2017, for solving a CCP. We first show that the ALSO-X resembles a bilevel optimization, where the upper-level problem is to find the best objective function value and enforce the feasibility of a CCP for a given decision from the lower-level problem, and the lower-level problem is to minimize the expectation of constraint violations subject to the upper bound of the objective function value provided by the upper-level problem. This interpretation motivates us to prove that when uncertain constraints are convex in the decision variables, ALSO-X always outperforms the CVaR approximation. We further show (i) sufficient conditions under which ALSO-X can recover an optimal solution to a CCP; (ii) an equivalent bilinear programming formulation of a CCP, inspiring us to enhance ALSO-X with a convergent alternating minimization method (ALSO-X+); and (iii) an extension of ALSO-X and ALSO-X+ to distributionally robust chance constrained programs (DRCCPs) under the ∞-Wasserstein ambiguity set. Our numerical study demonstrates the effectiveness of the proposed methods.
  2. This paper provides a general derivative identity for the conditional mean estimator of an arbitrary vector signal in Gaussian noise with an arbitrary covariance matrix. This new identity is used to recover and generalize many known identities in the literature and derive some new identities. For example, a new identity is discovered, which shows that an arbitrary higher-order conditional moment is completely determined by the first conditional moment. Several applications of the identities are shown. For instance, by using one of the identities, a simple proof of the uniqueness of the conditional mean estimator as a function of the distribution of the signal is shown. Moreover, one of the identities is used to extend the notion of empirical Bayes to higher-order conditional moments. Specifically, based on a random sample of noisy observations, a consistent estimator for a conditional expectation of any order is derived.
  3. We consider estimation of the density of a multivariate response that is not observed directly but only through measurements contaminated by additive error. Our focus is on the realistic sampling case of bivariate panel data (repeated contaminated bivariate measurements on each sample unit) with an unknown error distribution. Several factors can affect the performance of kernel deconvolution density estimators, including the choice of the kernel and the estimation approach of the unknown error distribution. As the choice of the kernel function is critically important, the class of flat-top kernels can have advantages over more commonly implemented alternatives. We describe different approaches for density estimation with multivariate panel responses, and investigate their performance through simulation. We examine competing kernel functions and describe a flat-top kernel that has not been used in deconvolution problems. Moreover, we study several nonparametric options for estimating the unknown error distribution. Finally, we also provide guidelines to the numerical implementation of kernel deconvolution in higher sampling dimensions.
  4. We develop a simple Quantile Spacing (QS) method for accurate probabilistic estimation of one-dimensional entropy from equiprobable random samples, and compare it with the popular Bin-Counting (BC) and Kernel Density (KD) methods. In contrast to BC, which uses equal-width bins with varying probability mass, the QS method uses estimates of the quantiles that divide the support of the data generating probability density function (pdf) into equal-probability-mass intervals. And, whereas BC and KD each require optimal tuning of a hyper-parameter whose value varies with sample size and shape of the pdf, QS only requires specification of the number of quantiles to be used. Results indicate, for the class of distributions tested, that the optimal number of quantiles is a fixed fraction of the sample size (empirically determined to be ~0.25–0.35), and that this value is relatively insensitive to distributional form or sample size. This provides a clear advantage over BC and KD since hyper-parameter tuning is not required. Further, unlike KD, there is no need to select an appropriate kernel-type, and so QS is applicable to pdfs of arbitrary shape, including those with discontinuous slope and/or magnitude. Bootstrapping is used to approximate the sampling variability distribution of the resulting entropy estimate, and is shown to accurately reflect the true uncertainty. For the four distributional forms studied (Gaussian, Log-Normal, Exponential and Bimodal Gaussian Mixture), expected estimation bias is less than 1% and uncertainty is low even for samples of as few as 100 data points; in contrast, for KD the small sample bias can be as large as −10% and for BC as large as −50%. We speculate that estimating quantile locations, rather than bin-probabilities, results in more efficient use of the information in the data to approximate the underlying shape of an unknown data generating pdf.
  5. Multilevel structural equation models (MSEMs) are well suited for educational research because they accommodate complex systems involving latent variables in multilevel settings. Estimation using Croon’s bias-corrected factor score (BCFS) path estimation has recently been extended to MSEMs and demonstrated promise with limited sample sizes. This makes it well suited for planned educational research which often involves sample sizes constrained by logistical and financial factors. However, the performance of BCFS estimation with MSEMs has yet to be thoroughly explored under common but difficult conditions including in the presence of non-normal indicators and model misspecifications. We conducted two simulation studies to evaluate the accuracy and efficiency of the estimator under these conditions. Results suggest that BCFS estimation of MSEMs is often more dependable, more efficient, and less biased than other estimation approaches when sample sizes are limited or model misspecifications are present but is more susceptible to indicator non-normality. These results support, supplement, and elucidate previous literature describing the effective performance of BCFS estimation encouraging its utilization as an alternative or supplemental estimator for MSEMs.