Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Abstract We derive confidence intervals (CIs) and confidence sequences (CSs) for the classical problem of estimating a bounded mean. Our approach generalizes and improves on the celebrated Chernoff method, yielding the best closed-form empirical-Bernstein CSs and CIs (converging exactly to the oracle Bernstein width) as well as non-closed-form betting CSs and CIs. Our method combines new composite nonnegative (super)martingales with Ville's maximal inequality, with strong connections to testing by betting and the method of mixtures. We also show how these ideas can be extended to sampling without replacement. In all cases, our bounds are adaptive to the unknown variance, and empirically vastly outperform prior approaches, establishing a new state-of-the-art for four fundamental problems: CSs and CIs for bounded means, when sampling with and without replacement.more » « less
-
We address uncertainty quantification for Gaussian processes (GPs) under misspecified priors, with an eye towards Bayesian Optimization (BO). GPs are widely used in BO because they easily enable exploration based on posterior uncertainty bands. However, this convenience comes at the cost of robustness: a typical function encountered in practice is unlikely to have been drawn from the data scientist’s prior, in which case uncertainty estimates can be misleading, and the resulting exploration can be suboptimal. We present a frequentist approach to GP/BO uncertainty quantification. We utilize the GP framework as a working model, but do not assume correctness of the prior. We instead construct a \emph{confidence sequence} (CS) for the unknown function using martingale techniques. There is a necessary cost to achieving robustness: if the prior was correct, posterior GP bands are narrower than our CS. Nevertheless, when the prior is wrong, our CS is statistically valid and empirically outperforms standard GP methods, in terms of both coverage and utility for BO. Additionally, we demonstrate that powered likelihoods provide robustness against model misspecification.more » « less
-
Many practical tasks involve sampling sequentially without replacement (WoR) from a finite population of size $$N$$, in an attempt to estimate some parameter $$\theta^\star$$. Accurately quantifying uncertainty throughout this process is a nontrivial task, but is necessary because it often determines when we stop collecting samples and confidently report a result. We present a suite of tools for designing \textit{confidence sequences} (CS) for $$\theta^\star$$. A CS is a sequence of confidence sets $$(C_n)_{n=1}^N$$, that shrink in size, and all contain $$\theta^\star$$ simultaneously with high probability. We present a generic approach to constructing a frequentist CS using Bayesian tools, based on the fact that the ratio of a prior to the posterior at the ground truth is a martingale. We then present Hoeffding- and empirical-Bernstein-type time-uniform CSs and fixed-time confidence intervals for sampling WoR, which improve on previous bounds in the literature and explicitly quantify the benefit of WoR sampling.more » « less
-
We propose a general method for constructing confidence sets and hypothesis tests that have finite-sample guarantees without regularity conditions. We refer to such procedures as “universal.” The method is very simple and is based on a modified version of the usual likelihood-ratio statistic that we call “the split likelihood-ratio test” (split LRT) statistic. The (limiting) null distribution of the classical likelihood-ratio statistic is often intractable when used to test composite null hypotheses in irregular statistical models. Our method is especially appealing for statistical inference in these complex setups. The method we suggest works for any parametric model and also for some nonparametric models, as long as computing a maximum-likelihood estimator (MLE) is feasible under the null. Canonical examples arise in mixture modeling and shape-constrained inference, for which constructing tests and confidence sets has been notoriously difficult. We also develop various extensions of our basic methods. We show that in settings when computing the MLE is hard, for the purpose of constructing valid tests and intervals, it is sufficient to upper bound the maximum likelihood. We investigate some conditions under which our methods yield valid inferences under model misspecification. Further, the split LRT can be used with profile likelihoods to deal with nuisance parameters, and it can also be run sequentially to yield anytime-valid P values and confidence sequences. Finally, when combined with the method of sieves, it can be used to perform model selection with nested model classes.more » « less