Abstract The flexibility and wide applicability of the Fisher randomization test (FRT) make it an attractive tool for assessment of causal effects of interventions from modern-day randomized experiments that are increasing in size and complexity. This paper provides a theoretical inferential framework for FRT by establishing its connection with confidence distributions. Such a connection leads to the development of (i) an unambiguous procedure for inversion of FRTs to generate confidence intervals with guaranteed coverage, (ii) new insights on the effect of size of the Monte Carlo sample on the estimation of a p-value curve and (iii) generic and specific methods to combine FRTs from multiple independent experiments with theoretical guarantees. Our developments pertain to finite sample settings but have direct extensions to large samples. Simulations and a case example demonstrate the benefit of these new developments.
How to produce confidence intervals instead of confidence tricks: Representative sampling for molecular simulations of fluid self-diffusion under nanoscale confinement
Ergodicity (or at least the tantalizing promise of it) is a core animating principle of molecular-dynamics (MD) simulations: Put simply, sample for long enough (in time), and you will make representative visits to states of a system all throughout phase space, consistent with the desired statistical ensemble. However, one is not guaranteed a priori that the chosen window of sampling in a production run is sufficiently long to avoid problematically non-ergodic observations; one is also not guaranteed that successive measurements of an observable are statistically independent of each other. In this paper, we investigate several particularly striking and troublesome examples of statistical correlations in MD simulations of nanoconfined fluids, which have profound implications for the quantification of uncertainty for transport phenomena in these systems. In particular, we show that these correlations can lead to confidence intervals on the fluid self-diffusion coefficient that are dramatically overconfident and estimates of this transport quantity that are simply inaccurate. We propose a simple approach, based on the thermally accelerated decorrelation of fluid positions and momenta, that ameliorates these issues and improves our confidence in MD measurements of nanoconfined fluid transport properties. We demonstrate that the formation of faithful confidence intervals for measurements of self-diffusion under nanoscale confinement typically requires at least 20 statistically independent samples, and potentially more depending on the sampling technique used.
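The abstract's point that correlated measurements yield overconfident intervals can be illustrated with a small synthetic sketch. This is not the paper's method (which relies on thermally accelerated decorrelation); it is a generic block-averaging comparison on an assumed AR(1) series standing in for a correlated observable. The function names, the correlation parameter `phi`, and the choice of 20 blocks are all illustrative assumptions.

```python
import math
import random
import statistics


def ar1_series(n, phi, seed=0):
    """Generate a correlated AR(1) series x[t] = phi * x[t-1] + noise (synthetic stand-in)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def naive_sem(samples):
    """Standard error of the mean, wrongly assuming every sample is independent."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


def block_sem(samples, n_blocks):
    """Standard error of the mean from the means of n_blocks contiguous blocks."""
    size = len(samples) // n_blocks
    means = [
        statistics.fmean(samples[i * size:(i + 1) * size])
        for i in range(n_blocks)
    ]
    return statistics.stdev(means) / math.sqrt(n_blocks)


# Hypothetical strongly correlated "trajectory" of an observable.
series = ar1_series(n=20000, phi=0.95)
sem_naive = naive_sem(series)
sem_block = block_sem(series, n_blocks=20)
print(f"naive SEM: {sem_naive:.4f}, block SEM: {sem_block:.4f}")
```

In this sketch the naive standard error comes out noticeably smaller than the block-based one, so an interval built from it would be too narrow. Blocks much longer than the correlation time behave approximately like independent samples, which is the property a faithful confidence interval needs.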
- PAR ID: 10364117
- Publisher / Repository: American Institute of Physics
- Date Published:
- Journal Name: The Journal of Chemical Physics
- Volume: 156
- Issue: 11
- ISSN: 0021-9606
- Page Range / eLocation ID: Article No. 114113
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- We propose a bootstrap‐based calibrated projection procedure to build confidence intervals for single components and for smooth functions of a partially identified parameter vector in moment (in)equality models. The method controls asymptotic coverage uniformly over a large class of data generating processes. The extreme points of the calibrated projection confidence interval are obtained by extremizing the value of the function of interest subject to a proper relaxation of studentized sample analogs of the moment (in)equality conditions. The degree of relaxation, or critical level, is calibrated so that the function of θ, not θ itself, is uniformly asymptotically covered with prespecified probability. This calibration is based on repeatedly checking feasibility of linear programming problems, rendering it computationally attractive. Nonetheless, the program defining an extreme point of the confidence interval is generally nonlinear and potentially intricate. We provide an algorithm, based on the response surface method for global optimization, that approximates the solution rapidly and accurately, and we establish its rate of convergence. The algorithm is of independent interest for optimization problems with simple objectives and complicated constraints. An empirical application estimating an entry game illustrates the usefulness of the method. Monte Carlo simulations confirm the accuracy of the solution algorithm, the good statistical as well as computational performance of calibrated projection (including in comparison to other methods), and the algorithm's potential to greatly accelerate computation of other confidence intervals.
- Construction of tight confidence sets and intervals is central to statistical inference and decision making. This paper develops new theory showing minimum average volume confidence sets for categorical data. More precisely, consider an empirical distribution p̂ generated from n iid realizations of a random variable that takes one of k possible values according to an unknown distribution p. This is analogous to a single draw from a multinomial distribution. A confidence set is a subset of the probability simplex that depends on p̂ and contains the unknown p with a specified confidence. This paper shows how one can construct minimum average volume confidence sets. The optimality of the sets translates to improved sample complexity for adaptive machine learning algorithms that rely on confidence sets, regions and intervals.
- The process of data mining with differential privacy produces results that are affected by two types of noise: sampling noise due to data collection and privacy noise that is designed to prevent the reconstruction of sensitive information. In this paper, we consider the problem of designing confidence intervals for the parameters of a variety of differentially private machine learning models. The algorithms can provide confidence intervals that satisfy differential privacy (as well as the more recently proposed concentrated differential privacy) and can be used with existing differentially private mechanisms that train models using objective perturbation and output perturbation.
- We show that in a variance component model, confidence intervals with asymptotically correct uniform coverage probability can be obtained by inverting certain test statistics based on the score for the restricted likelihood. The results hold in settings where the variance component is near or at the boundary of the parameter set. Simulations indicate that the proposed test statistics are approximately pivotal and lead to confidence intervals with near-nominal coverage even in small samples. We illustrate the application of the proposed methods in spatially resolved transcriptomics, where we compute approximately 15 000 confidence intervals, used for gene ranking, in less than 4 minutes. In the settings we consider, the proposed method is between two and 28 000 times faster than popular alternatives, depending on how many confidence intervals are computed.
