Many two-level nested simulation applications involve the conditional expectation of some response variable, where the expected response is the quantity of interest, and the expectation is with respect to the inner-level random variables, conditioned on the outer-level random variables. The latter typically represent random risk factors, and risk can be quantified by estimating the probability density function (pdf) or cumulative distribution function (cdf) of the conditional expectation. Much prior work has considered a naïve estimator that uses the empirical distribution of the sample averages across the inner-level replicates. This results in a biased estimator, because the distribution of the sample averages is overdispersed relative to the distribution of the conditional expectation when the number of inner-level replicates is finite. Whereas most prior work has focused on allocating the numbers of outer- and inner-level replicates to balance the bias/variance tradeoff, we develop a bias-corrected pdf estimator. Our approach is based on the concept of density deconvolution, which is widely used to estimate densities from noisy observations but has not previously been considered for nested simulation problems. For a fixed computational budget, the bias-corrected deconvolution estimator allows more outer-level and fewer inner-level replicates to be used, which substantially improves the efficiency of …
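The overdispersion that biases the naïve estimator is easy to see in a minimal Monte Carlo sketch. The toy model below (E[Y | X] = X with unit-variance inner noise, an assumption for illustration only, not the paper's setting) shows that the variance of the inner-level sample averages exceeds the variance of the true conditional expectation by roughly sigma^2 / n_inner:

```python
import numpy as np

rng = np.random.default_rng(0)

# Outer level: risk factor X; inner level: noisy response Y given X.
# Toy model (an assumption for this sketch): E[Y | X] = X, unit inner noise.
n_outer, n_inner = 5000, 10
X = rng.normal(size=n_outer)                          # outer-level replicates
Y = X[:, None] + rng.normal(size=(n_outer, n_inner))  # inner-level replicates

# Naive estimator: empirical distribution of the inner sample averages.
Ybar = Y.mean(axis=1)

# The sample averages are overdispersed:
#   Var(Ybar) ~ Var(E[Y | X]) + sigma^2 / n_inner.
print(np.var(X))     # variance of the true conditional expectation, close to 1
print(np.var(Ybar))  # larger, by roughly 1 / n_inner
```

With a finite inner budget the extra 1 / n_inner term never vanishes, which is exactly the noise that a deconvolution-style correction tries to remove.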
On Estimation of Modal Decompositions
A modal decomposition is a useful tool that deconstructs the statistical dependence between two random variables by decomposing their joint distribution into orthogonal modes. Historically, modal decompositions have played important roles in statistics and information theory, e.g., in the study of maximal correlation. They are defined using the singular value decompositions of divergence transition matrices (DTMs) and conditional expectation operators corresponding to joint distributions. In this paper, we first characterize the set of all DTMs, and illustrate how the associated conditional expectation operators are the only weak contractions among a class of natural candidates. While modal decompositions have several modern machine learning applications, such as feature extraction from categorical data, the sample complexity of estimating them in such scenarios has not been analyzed. Hence, we also establish some non-asymptotic sample complexity results for the problem of estimating dominant modes of an unknown joint distribution from training data.
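For a discrete joint distribution, the DTM and its modal decomposition can be computed directly. The sketch below uses a small hypothetical 3x3 joint PMF (the matrix is illustrative, not from the paper): the DTM has entries P(x, y) / sqrt(P(x) P(y)), its top singular value is always 1 (with singular vectors sqrt(P(x)) and sqrt(P(y))), and the second singular value equals the maximal correlation of the two variables:

```python
import numpy as np

# Hypothetical 3x3 joint PMF of (X, Y); any valid joint distribution works here.
P = np.array([[0.20, 0.05, 0.05],
              [0.05, 0.20, 0.05],
              [0.05, 0.05, 0.30]])
px = P.sum(axis=1)  # marginal of X
py = P.sum(axis=0)  # marginal of Y

# Divergence transition matrix: B[x, y] = P(x, y) / sqrt(px[x] * py[y]).
B = P / np.sqrt(np.outer(px, py))

# Modal decomposition = SVD of the DTM. The largest singular value is
# always 1 (singular vectors sqrt(px), sqrt(py)); the second singular
# value is the maximal correlation between X and Y.
s = np.linalg.svd(B, compute_uv=False)
print(np.round(s, 4))  # s[0] == 1; s[1] is the maximal correlation
```

Estimating these dominant modes from samples amounts to running the same SVD on an empirical DTM, which is the estimation problem whose sample complexity the paper analyzes.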
Publication Date:
NSF-PAR ID: 10172744
Journal Name: IEEE International Symposium on Information Theory
Volume: 1
Issue: 1
Page Range or eLocation-ID: 1-5
Sponsoring Org: National Science Foundation
More Like this


A conditional sampling oracle for a probability distribution D returns samples from the conditional distribution of D restricted to a specified subset of the domain. A recent line of work (Chakraborty et al. 2013 and Canonne et al. 2014) has shown that access to such a conditional sampling oracle requires only a polylogarithmic, or even constant, number of samples to solve distribution testing problems such as identity and uniformity testing. This significantly improves over the standard sampling model, where polynomially many samples are necessary. Inspired by these results, we introduce a computational model based on conditional sampling to develop sublinear algorithms with exponentially faster runtimes than standard sublinear algorithms. We focus on geometric optimization problems over points in high-dimensional Euclidean space. Access to these points is provided via a conditional sampling oracle that takes as input a succinct representation of a subset of the domain and outputs a uniformly random point in that subset. We study two well-studied problems: k-means clustering and estimating the weight of the minimum spanning tree. In contrast to prior algorithms for the classic model, our algorithms have time, space, and sample complexity that is polynomial in the dimension and polylogarithmic in the number of …
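The oracle itself is simple to state in code. The sketch below is an illustrative implementation of the access model only (the function name and interface are assumptions, not from the papers); the algorithmic results concern how few such queries are needed, not how the oracle is realized:

```python
import random

def conditional_sample(D, subset, rng=random):
    """Sample from distribution D (a dict mapping value -> probability)
    conditioned on the outcome lying in `subset`.

    Illustrative oracle for the conditional-sampling access model; the
    name and interface here are assumptions, not from the cited papers.
    """
    restricted = {v: p for v, p in D.items() if v in subset}
    total = sum(restricted.values())
    if total == 0:
        raise ValueError("subset has zero probability under D")
    values = list(restricted)
    weights = [restricted[v] / total for v in values]
    return rng.choices(values, weights=weights, k=1)[0]

D = {0: 0.5, 1: 0.25, 2: 0.25}
print(conditional_sample(D, {1, 2}))  # returns 1 or 2, each with prob. 1/2
```

In the geometric setting described above, D is uniform over the point set and `subset` is given by a succinct representation (e.g., a ball or a cell of a grid), so one query returns a uniformly random point from that region.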

To overcome the curse of dimensionality in joint probability learning, recent work has proposed to recover the joint probability mass function (PMF) of an arbitrary number of random variables (RVs) from three-dimensional marginals, exploiting the uniqueness of tensor decomposition and the (unknown) dependence among the RVs. Nonetheless, accurately estimating three-dimensional marginals is still costly in terms of sample complexity, and tensor decomposition poses a computationally intensive optimization problem. This work puts forth a new framework that learns the joint PMF using pairwise marginals, which are relatively easy to acquire. The method is built upon nonnegative matrix factorization (NMF) theory, and features a Gram–Schmidt-like economical algorithm that works provably well under realistic conditions. A theoretical analysis of a recently proposed expectation-maximization (EM) algorithm for joint PMF recovery is also presented. In particular, the EM algorithm is shown to provably improve upon the proposed pairwise-marginal-based approach. Synthetic- and real-data experiments showcase the effectiveness of the proposed approach.
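The objects involved are easy to illustrate. For a toy three-variable joint PMF (randomly generated below, purely for illustration), each pairwise marginal is obtained by summing out the remaining variable; a pairwise marginal over two K-state variables has only K^2 entries to estimate, versus K^3 for the three-dimensional marginals used by prior tensor methods:

```python
import numpy as np

# Toy joint PMF of three binary RVs (X1, X2, X3) -- illustration only.
rng = np.random.default_rng(1)
P = rng.random((2, 2, 2))
P /= P.sum()  # normalize into a valid joint PMF

# Pairwise marginals: sum out the remaining variable. These K^2-entry
# matrices are the inputs to the NMF-based recovery described above.
P12 = P.sum(axis=2)  # marginal of (X1, X2)
P13 = P.sum(axis=1)  # marginal of (X1, X3)
P23 = P.sum(axis=0)  # marginal of (X2, X3)

# Consistency check: both P12 and P13 must imply the same marginal of X1.
print(np.allclose(P12.sum(axis=1), P13.sum(axis=1)))
```

The recovery algorithm then factors these pairwise matrices (nonnegatively) to reconstruct the full joint; that step is beyond this sketch.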

We consider the problem of characterizing shape populations using highly frequent representative shapes. Framing such shapes as statistical modes – shapes that correspond to (significant) local maxima of the underlying pdfs – we develop a frequency-based, nonparametric approach for estimating sample modes. Using an elastic shape metric, we define ϵ-neighborhoods in the shape space and shortlist shapes that are central and have the most neighbors. A critical issue – how to automatically select the threshold ϵ – is resolved using a combination of ANOVA and the empirical mode distribution. The resulting modal set, in turn, helps characterize the shape population and performs better than traditional cluster means. We demonstrate this framework using amoeba shapes from bright-field microscopy images and highlight its advantages over existing ideas.

We introduce a framework for statistical estimation that leverages knowledge of how samples are collected but makes no distributional assumptions on the data values. Specifically, we consider a population of elements [n] = {1, ..., n} with corresponding data values x1, ..., xn. We observe the values for a "sample" set A \subset [n] and wish to estimate some statistic of the values for a "target" set B \subset [n], where B could be the entire set. Crucially, we assume that the sets A and B are drawn according to some known distribution P over pairs of subsets of [n]. A given estimation algorithm is evaluated based on its "worst-case, expected error," where the expectation is with respect to the distribution P from which the sample set A and target set B are drawn, and the worst case is with respect to the data values x1, ..., xn. Within this framework, we give an efficient algorithm for estimating the target mean that returns a weighted combination of the sample values, where the weights are functions of the distribution P and the sample and target sets A and B, and show that the worst-case expected error achieved by this algorithm is at most a multiplicative π/2 factor worse than the optimal such algorithm. …
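The form of the estimator (though not the paper's actual weight derivation, which depends on P) can be sketched in a few lines; the data values, sample set, and uniform weights below are all hypothetical stand-ins:

```python
import numpy as np

def weighted_sample_mean(x, sample_idx, weights):
    """Estimate the target mean as a weighted combination of the observed
    sample values. In the framework above the weights would be functions
    of the known distribution P and the realized sets A and B; here they
    are passed in directly, and a uniform choice recovers the plain
    sample mean (illustration only)."""
    return float(np.dot(weights, x[sample_idx]))

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0])  # fixed, arbitrary data values x1..x5
A = np.array([0, 2, 4])                  # observed "sample" set
w = np.full(len(A), 1.0 / len(A))        # uniform weights as a baseline
print(weighted_sample_mean(x, A, w))
```

The content of the result is in how the weights are chosen from P: the guarantee says the best such weighting is within a multiplicative π/2 factor of the optimal algorithm's worst-case expected error.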