Title: The HDI + ROPE decision rule is logically incoherent but we can fix it.
The Bayesian HDI+ROPE decision rule is an increasingly common approach to testing null parameter values. The procedure compares a posterior highest density interval (HDI) with a pre-specified region of practical equivalence (ROPE); one then accepts or rejects the null parameter value depending on the overlap (or lack thereof) between these intervals. Here we demonstrate, both theoretically and through examples, that this procedure is logically incoherent. Because the HDI is not transformation invariant, the ultimate inferential decision depends on statistically arbitrary and scientifically irrelevant properties of the statistical model. The incoherence arises from a common confusion between probability density and probability proper: the HDI+ROPE procedure relies on characterizations of posterior densities rather than on probability itself. We conclude with recommendations for alternative Bayesian testing procedures that do not exhibit this pathology and provide a "quick fix" in the form of quantile intervals.
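As a minimal illustration of the non-invariance at issue (this is not the authors' code; the Beta posterior and the log-odds transform are arbitrary choices for demonstration), the sketch below compares a sample-based 95% HDI with a 95% quantile (equal-tailed) interval before and after a monotone reparameterization. The quantile interval's endpoints map coherently through the transform, while the HDI's generally do not.

```python
# Sketch: HDI vs. equal-tailed quantile interval under a monotone transform.
import numpy as np

rng = np.random.default_rng(0)
theta = rng.beta(2, 8, size=200_000)          # posterior draws for a proportion
phi = np.log(theta / (1 - theta))             # monotone transform: log-odds

def hdi(samples, mass=0.95):
    """Shortest interval containing `mass` of the sorted samples."""
    s = np.sort(samples)
    n_in = int(np.floor(mass * len(s)))
    widths = s[n_in:] - s[:len(s) - n_in]
    lo = np.argmin(widths)
    return s[lo], s[lo + n_in]

def quantile_interval(samples, mass=0.95):
    """Equal-tailed interval; endpoints map through monotone transforms."""
    alpha = (1 - mass) / 2
    return tuple(np.quantile(samples, [alpha, 1 - alpha]))

# Up to Monte Carlo/interpolation error, the quantile interval of phi equals the
# transform of the quantile interval of theta; the HDI of phi is generally NOT
# the transform of the HDI of theta, which is the source of the incoherence.
print("HDI(theta):      ", hdi(theta))
print("HDI(phi):        ", hdi(phi))
print("quantile(theta): ", quantile_interval(theta))
print("quantile(phi):   ", quantile_interval(phi))
print("transformed quantile(theta):",
      tuple(np.log(q / (1 - q)) for q in quantile_interval(theta)))
```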
Award ID(s):
2051186
PAR ID:
10562001
Author(s) / Creator(s):
Publisher / Repository:
American Psychological Association
Date Published:
Journal Name:
Psychological Methods
ISSN:
1082-989X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Sequential ranking-and-selection procedures deliver Bayesian guarantees by repeatedly computing a posterior quantity of interest—for example, the posterior probability of good selection or the posterior expected opportunity cost—and terminating when this quantity crosses some threshold. Computing these posterior quantities entails nontrivial numerical computation. Thus, rather than exactly check such posterior-based stopping rules, it is common practice to use cheaply computable bounds on the posterior quantity of interest, for example, those based on Bonferroni’s or Slepian’s inequalities. The result is a conservative procedure that samples more simulation replications than are necessary. We explore how the time spent simulating these additional replications might be better spent computing the posterior quantity of interest via numerical integration, with the potential for terminating the procedure sooner. To this end, we develop several methods for improving the computational efficiency of exactly checking the stopping rules. Simulation experiments demonstrate that the proposed methods can, in some instances, significantly reduce a procedure’s total sample size. We further show these savings can be attained with little added computational effort by making effective use of a Monte Carlo estimate of the posterior quantity of interest. Summary of Contribution: The widespread use of commercial simulation software in industry has made ranking-and-selection (R&S) algorithms an accessible simulation-optimization tool for operations research practitioners. This paper addresses computational aspects of R&S procedures delivering finite-time Bayesian statistical guarantees, primarily the decision of when to terminate sampling. Checking stopping rules entails computing or approximating posterior quantities of interest perceived as being computationally intensive to evaluate. The main results of this paper show that these quantities can be efficiently computed via numerical integration and can yield substantial savings in sampling relative to the prevailing approach of using conservative bounds. In addition to enhancing the performance of Bayesian R&S procedures, the results have the potential to advance other research in this space, including the development of more efficient allocation rules. 
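As a rough illustration of the gap this work targets (not the paper's procedures; the posterior means and variances below are placeholders, and independent normal posteriors are an assumption), the sketch compares a Bonferroni-style lower bound on the posterior probability of correct selection with a Monte Carlo estimate of the quantity itself.

```python
# Sketch: conservative Bonferroni bound vs. Monte Carlo estimate of posterior PCS.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
post_mean = np.array([0.00, 0.15, 0.30, 0.55])     # posterior means (placeholders)
post_var = np.array([0.04, 0.05, 0.04, 0.06])      # posterior variances (placeholders)
b = int(np.argmax(post_mean))                      # current sample-best system

# Bonferroni lower bound: 1 - sum_{i != b} P(theta_i > theta_b)
diff_sd = np.sqrt(post_var + post_var[b])
pairwise_loss = norm.cdf((post_mean - post_mean[b]) / diff_sd)
bonferroni_bound = 1.0 - np.delete(pairwise_loss, b).sum()

# Monte Carlo estimate of the posterior probability that system b is truly best
draws = rng.normal(post_mean, np.sqrt(post_var), size=(100_000, len(post_mean)))
pcs_mc = np.mean(np.argmax(draws, axis=1) == b)

print(f"Bonferroni lower bound: {bonferroni_bound:.4f}")
print(f"Monte Carlo PCS:        {pcs_mc:.4f}")   # typically larger, so a stopping
                                                 # rule checked exactly can fire sooner
```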
  2. Su, Bing (Ed.)
    Abstract: Confidence intervals (CIs) depict the statistical uncertainty surrounding evolutionary divergence time estimates. They capture variance contributed by the finite number of sequences and sites used in the alignment, deviations of evolutionary rates from a strict molecular clock in a phylogeny, and uncertainty associated with clock calibrations. Reliable tests of biological hypotheses demand reliable CIs. However, current non-Bayesian methods may produce unreliable CIs because they do not incorporate rate variation among lineages and interactions among clock calibrations properly. Here, we present a new analytical method to calculate CIs of divergence times estimated using the RelTime method, along with an approach to utilize multiple calibration uncertainty densities in dating analyses. Empirical data analyses showed that the new methods produce CIs that overlap with Bayesian highest posterior density intervals. In the analysis of computer-simulated data, we found that RelTime CIs show excellent average coverage probabilities, that is, the actual time is contained within the CIs with a 94% probability. These developments will encourage broader use of computationally efficient RelTime approaches in molecular dating analyses and biological hypothesis testing. 
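For readers unfamiliar with how coverage probabilities are scored, a generic simulation check (not the RelTime analytical method; the normal-theory CI and toy data below are placeholders) looks like this:

```python
# Sketch: empirical coverage of a CI procedure against a known true value.
import numpy as np

rng = np.random.default_rng(2)
true_time, sd, n_reps, n_obs = 50.0, 5.0, 2_000, 30

hits = 0
for _ in range(n_reps):
    est = rng.normal(true_time, sd, size=n_obs)          # toy estimates per replicate
    mean, sem = est.mean(), est.std(ddof=1) / np.sqrt(n_obs)
    lo, hi = mean - 1.96 * sem, mean + 1.96 * sem        # normal-theory 95% CI
    hits += (lo <= true_time <= hi)

print(f"empirical coverage: {hits / n_reps:.3f}")        # should be near 0.95
```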
  3. Driven by steady progress in deep generative modeling, simulation-based inference (SBI) has emerged as the workhorse for inferring the parameters of stochastic simulators. However, recent work has demonstrated that model misspecification can compromise the reliability of SBI, preventing its adoption in important applications where only misspecified simulators are available. This work introduces robust posterior estimation (RoPE), a framework that overcomes model misspecification with a small real-world calibration set of ground-truth parameter measurements. We formalize the misspecification gap as the solution of an optimal transport (OT) problem between learned representations of real-world and simulated observations, allowing RoPE to learn a model of the misspecification without placing additional assumptions on its nature. RoPE demonstrates how OT and a calibration set provide a controllable balance between calibrated uncertainty and informative inference, even under severely misspecified simulators. Results on four synthetic tasks and two real-world problems with ground-truth labels demonstrate that RoPE outperforms baselines and consistently returns informative and calibrated credible intervals. 
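A toy sketch of the discrete optimal-transport ingredient (not the RoPE implementation; the random "embeddings", the squared-Euclidean cost, and the Sinkhorn solver with an ad hoc regularization strength are all assumptions made for illustration):

```python
# Sketch: entropically regularized OT coupling between two embedding clouds.
import numpy as np

rng = np.random.default_rng(3)
sim_emb = rng.normal(0.0, 1.0, size=(64, 8))     # toy embeddings of simulated observations
real_emb = rng.normal(0.5, 1.0, size=(32, 8))    # toy embeddings of a small real calibration set

# Squared-Euclidean cost between the two point clouds
cost = ((sim_emb[:, None, :] - real_emb[None, :, :]) ** 2).sum(-1)

a = np.full(len(sim_emb), 1.0 / len(sim_emb))    # uniform weights on simulated points
b = np.full(len(real_emb), 1.0 / len(real_emb))  # uniform weights on real points

# Sinkhorn iterations for the regularized OT plan
eps = 0.05 * cost.mean()                         # ad hoc regularization strength
K = np.exp(-cost / eps)
u = np.ones_like(a)
for _ in range(500):
    v = b / (K.T @ u)
    u = a / (K @ v)
plan = u[:, None] * K * v[None, :]               # rows index simulated, columns real points

print("max marginal error:", float(np.abs(plan.sum(0) - b).max()))
print("regularized transport cost:", float((plan * cost).sum()))
```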
  4. Data-driven Deep Learning (DL) models have revolutionized autonomous systems, but ensuring their safety and reliability necessitates the assessment of predictive confidence or uncertainty. Bayesian DL provides a principled approach to quantify uncertainty via probability density functions defined over model parameters. However, the exact solution is intractable for most DL models, and the approximation methods, often based on heuristics, suffer from scalability issues and stringent distribution assumptions and may lack theoretical guarantees. This work develops a Sequential Importance Sampling framework that approximates the posterior probability density function through weighted samples (or particles), which can be used to find the mean, variance, or higher-order moments of the posterior distribution. We demonstrate that propagating particles, which capture information about the higher-order moments, through the layers of the DL model results in increased robustness to natural and malicious noise (adversarial attacks). The variance computed from these particles effectively quantifies the model’s decision uncertainty, demonstrating well-calibrated and accurate predictive confidence. 
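A minimal sketch of the particle idea (not the paper's sequential framework; a single importance-sampling step with a Gaussian toy model stands in for the DL posterior):

```python
# Sketch: weighted particles approximating a posterior; read off mean and variance.
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(1.0, 0.5, size=20)               # toy observations

# Particles drawn from the prior over a single parameter (the proposal here)
particles = rng.normal(0.0, 2.0, size=5_000)

# With the prior as proposal, unnormalized weights are the data likelihoods
log_w = -0.5 * (((data[None, :] - particles[:, None]) / 0.5) ** 2).sum(axis=1)
w = np.exp(log_w - log_w.max())                    # stabilize before normalizing
w /= w.sum()

post_mean = np.sum(w * particles)
post_var = np.sum(w * (particles - post_mean) ** 2)
ess = 1.0 / np.sum(w ** 2)                         # effective sample size diagnostic

print(f"posterior mean ~ {post_mean:.3f}, variance ~ {post_var:.4f}, ESS ~ {ess:.0f}")
```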
  5. Density estimation is a building block for many other statistical methods, such as classification, nonparametric testing, and data compression. In this paper, we focus on a nonparametric approach to multivariate density estimation, and study its asymptotic properties under both frequentist and Bayesian settings. The estimated density function is obtained by considering a sequence of approximating spaces to the space of densities. These spaces consist of piecewise constant density functions supported by binary partitions with increasing complexity. To obtain an estimate, the partition is learned by maximizing either the likelihood of the corresponding histogram on that partition, or the marginal posterior probability of the partition under a suitable prior. We analyze the convergence rate of the maximum likelihood estimator and the posterior concentration rate of the Bayesian estimator, and conclude that for a relatively rich class of density functions the rate does not directly depend on the dimension. We also show that the Bayesian method can adapt to the unknown smoothness of the density function. The method is applied to several specific function classes and explicit rates are obtained. These include spatially sparse functions, functions of bounded variation, and Hölder continuous functions. We also introduce an ensemble approach, obtained by aggregating multiple density estimates fit under carefully designed perturbations, and show that for density functions lying in a Hölder space (H^(1,β), 0 < β ≤ 1), the ensemble method can achieve minimax convergence rate up to a logarithmic term, while the corresponding rate of the density estimator based on a single partition is suboptimal for this function class. 
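A compact sketch of the general construction (not the paper's estimator or prior; midpoint splits, a fixed depth cap, and a likelihood-gain stopping rule are simplifications) that builds a piecewise-constant density over a recursively bisected unit square:

```python
# Sketch: histogram density on a binary partition chosen by likelihood gain.
import numpy as np

def log_lik(n_pts, volume, n_total):
    # Contribution of one cell to the log-likelihood of the histogram density
    return 0.0 if n_pts == 0 else n_pts * np.log(n_pts / (n_total * volume))

def fit_partition(x, lo, hi, n_total, depth=0, max_depth=6, min_pts=20):
    """Return a list of (lo, hi, density) boxes covering the cell [lo, hi)."""
    volume = np.prod(hi - lo)
    base = log_lik(len(x), volume, n_total)
    best = None
    for dim in range(x.shape[1]):                    # try a midpoint split per dimension
        mid = 0.5 * (lo[dim] + hi[dim])
        left = x[x[:, dim] < mid]
        right = x[x[:, dim] >= mid]
        gain = (log_lik(len(left), volume / 2, n_total)
                + log_lik(len(right), volume / 2, n_total) - base)
        if best is None or gain > best[0]:
            best = (gain, dim, mid, left, right)
    gain, dim, mid, left, right = best
    if depth >= max_depth or len(x) < min_pts or gain <= 0:
        return [(lo.copy(), hi.copy(), len(x) / (n_total * volume))]
    lo_r, hi_l = lo.copy(), hi.copy()
    hi_l[dim], lo_r[dim] = mid, mid
    return (fit_partition(left, lo, hi_l, n_total, depth + 1, max_depth, min_pts)
            + fit_partition(right, lo_r, hi, n_total, depth + 1, max_depth, min_pts))

rng = np.random.default_rng(5)
data = rng.beta(2, 5, size=(2_000, 2))               # toy 2-D sample on the unit square
boxes = fit_partition(data, np.zeros(2), np.ones(2), len(data))
print(f"{len(boxes)} cells; total mass = "
      f"{sum(np.prod(h - l) * d for l, h, d in boxes):.3f}")    # should be ~1
```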