

Title: Cauchy and other shrinkage priors for logistic regression in the presence of separation
Abstract

In recent years, the choice of prior distributions for Bayesian logistic regression has received considerable interest. It is widely acknowledged that noninformative, improper priors have to be used with caution because posterior propriety may not always hold. As an alternative, heavy‐tailed priors such as Cauchy prior distributions have been proposed by Gelman et al. (2008). The motivation for using Cauchy prior distributions is that they are proper, and thus, unlike noninformative priors, they are guaranteed to yield proper posterior distributions. The heavy tails of the Cauchy distribution allow the posterior distribution to adapt to the data. Thus, Gelman et al. (2008) suggested the use of these prior distributions as a default weakly informative choice in the absence of prior knowledge about the regression coefficients. While these prior distributions are guaranteed to yield proper posterior distributions, Ghosh, Li, and Mitra (2018) showed that, for datasets with separation, the posterior means may not exist or, when they do exist, can be unreasonably large. In this paper, we provide a short review of the concept of separation, and we discuss how common prior distributions such as the Cauchy and normal priors perform on data with separation. Theoretical and empirical results suggest that lighter‐tailed prior distributions such as normal prior distributions can be a good default choice when there is separation.
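
As a rough illustration of the point about separation (a sketch that is not taken from the paper), the Python code below builds a small, completely separated dataset and compares the posterior mean of a single slope under a normal prior and under a Cauchy prior. Everything specific here is an assumption made for illustration: the toy data, the no-intercept model, the prior scale of 2.5, the truncation bounds, and all function names. The posterior mean is computed on a truncated grid, so the quantity of interest is how that truncated mean behaves as the bound widens.

```python
import numpy as np

def log_sigmoid(z):
    # Numerically stable log(1 / (1 + exp(-z))).
    return -np.logaddexp(0.0, -z)

# Hypothetical, completely separated data: y = 1 exactly when x > 0.
x = np.repeat([-0.5, 0.5], 20)
y = (x > 0).astype(float)

def log_lik(beta):
    # Log-likelihood of a no-intercept logistic model for an array of slopes.
    z = np.outer(np.atleast_1d(beta), x)
    return (y * log_sigmoid(z) + (1.0 - y) * log_sigmoid(-z)).sum(axis=1)

def log_normal_prior(beta, scale=2.5):
    # Unnormalized log density of a normal(0, scale) prior (assumed scale).
    return -0.5 * (beta / scale) ** 2

def log_cauchy_prior(beta, scale=2.5):
    # Unnormalized log density of a Cauchy(0, scale) prior (assumed scale).
    return -np.log1p((beta / scale) ** 2)

def truncated_posterior_mean(log_prior, bound, n=20_001):
    # Posterior mean of the slope restricted to [-bound, bound], via a grid sum.
    grid = np.linspace(-bound, bound, n)
    log_post = log_lik(grid) + log_prior(grid)
    weights = np.exp(log_post - log_post.max())
    return float((grid * weights).sum() / weights.sum())

if __name__ == "__main__":
    for bound in (10.0, 100.0, 1000.0):
        normal_mean = truncated_posterior_mean(log_normal_prior, bound)
        cauchy_mean = truncated_posterior_mean(log_cauchy_prior, bound)
        print(f"bound={bound:7.1f}  normal={normal_mean:.3f}  cauchy={cauchy_mean:.3f}")
```

With separated data the likelihood keeps increasing in the slope, so it places no upper limit on the coefficient. In this toy setting the truncated mean under the normal prior settles once the bound is wide enough, while under the Cauchy prior it keeps growing with the bound, which is consistent with the statement that the posterior mean may not exist.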

This article is categorized under:

Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods Statistical Models

 
NSF-PAR ID:
10448844
Author(s) / Creator(s):
 
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
WIREs Computational Statistics
Volume:
11
Issue:
6
ISSN:
1939-5108
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Prior mathematical work of Constantin & Iyer (Commun. Pure Appl. Maths, vol. 61, 2008, pp. 330–345; Ann. Appl. Probab., vol. 21, 2011, pp. 1466–1492) has shown that incompressible Navier–Stokes solutions possess infinitely many stochastic Lagrangian conservation laws for vorticity, backward in time, which generalize the invariants of Cauchy (Sciences mathématiques et physique, vol. I, 1815, pp. 33–73) for smooth Euler solutions. We reformulate this theory for the case of wall-bounded flows by appealing to the Kuz'min (Phys. Lett. A, vol. 96, 1983, pp. 88–90)–Oseledets (Russ. Math. Surv., vol. 44, 1989, p. 210) representation of Navier–Stokes dynamics, in terms of the vortex-momentum density associated to a continuous distribution of infinitesimal vortex rings. The Constantin–Iyer theory provides an exact representation for vorticity at any interior point as an average over stochastic vorticity contributions transported from the wall. We point out relations of this Lagrangian formulation with the Eulerian theory of Lighthill (Boundary layer theory. In Laminar Boundary Layers (ed. L. Rosenhead), 1963, pp. 46–113)–Morton (Geophys. Astrophys. Fluid Dyn., vol. 28, 1984, pp. 277–308) for vorticity generation at solid walls, and also with a statistical result of Taylor (Proc. R. Soc. Lond. A, vol. 135, 1932, pp. 685–702)–Huggins (J. Low Temp. Phys., vol. 96, 1994, pp. 317–346), which connects dissipative drag with organized cross-stream motion of vorticity and which is closely analogous to the ‘Josephson–Anderson relation’ for quantum superfluids. We elaborate a Monte Carlo numerical Lagrangian scheme to calculate the stochastic Cauchy invariants and their statistics, given the Eulerian space–time velocity field. The method is validated using an online database of a turbulent channel-flow simulation (Graham et al., J. Turbul., vol. 17, 2016, pp. 181–215), where conservation of the mean Cauchy invariant is verified for two selected buffer-layer events corresponding to an ‘ejection’ and a ‘sweep’. The variances of the stochastic Cauchy invariants grow exponentially backward in time, however, revealing Lagrangian chaos of the stochastic trajectories undergoing both fluid advection and viscous diffusion.
  2. Abstract

    Structured population models are among the most widely used tools in ecology and evolution. Integral projection models (IPMs) use continuous representations of how survival, reproduction and growth change as functions of state variables such as size, requiring fewer parameters to be estimated than projection matrix models (PPMs). Yet, almost all published IPMs make an important assumption that size‐dependent growth transitions are or can be transformed to be normally distributed. In fact, many organisms exhibit highly skewed size transitions. Small individuals can grow more than they can shrink, and large individuals may often shrink more dramatically than they can grow. Yet, the implications of such skew for inference from IPMs have not been explored, nor have general methods been developed to incorporate skewed size transitions into IPMs, or to deal with other aspects of real growth rates, including bounds on possible growth or shrinkage.

    Here, we develop a flexible approach to modelling skewed growth data using a modified beta regression model. We propose that sizes first be converted to a (0,1) interval by estimating size‐dependent minimum and maximum sizes through quantile regression. Transformed data can then be modelled using beta regression with widely available statistical tools. We demonstrate the utility of this approach using demographic data for a long‐lived plant, gorgonians and an epiphytic lichen. Specifically, we compare inferences of population parameters from discrete PPMs to those from IPMs that either assume normality or incorporate skew using beta regression or, alternatively, a skewed normal model.

    The beta and skewed normal distributions accurately capture the mean, variance and skew of real growth distributions. Incorporating skewed growth into IPMs decreases population growth and estimated life span relative to IPMs that assume normally distributed growth, and more closely approximates the parameters of PPMs that do not assume a particular growth distribution. A bounded distribution, such as the beta, also avoids the eviction problem caused by predicting some growth outside the modelled size range.

    Incorporating biologically relevant skew in growth data has important consequences for inference from IPMs. The approaches we outline here are flexible and easy to implement with existing statistical tools.

     
  3. Abstract

    Assessing the biological relevance of variance components estimated using Markov chain Monte Carlo (MCMC)‐based mixed‐effects models is not straightforward. Variance estimates are constrained to be greater than zero and their posterior distributions are often asymmetric. Different measures of central tendency for these distributions can therefore vary widely, and credible intervals cannot overlap zero, making it difficult to assess the size and statistical support for among‐group variance. Statistical support is often assessed through visual inspection of the whole posterior distribution and so relies on subjective decisions for interpretation.

    We use simulations to demonstrate the difficulties of summarizing the posterior distributions of variance estimates from MCMC‐based models. We then describe different methods for generating the expected null distribution (i.e. a distribution of effect sizes that would be obtained if there was no among‐group variance) that can be used to aid in the interpretation of variance estimates.

    Through comparing commonly used summary statistics of posterior distributions of variance components, we show that the posterior median is predominantly the least biased. We further show how null distributions can be used to derive a p‐value that provides complementary information to the commonly presented measures of central tendency and uncertainty. Finally, we show how these p‐values facilitate the implementation of power analyses within an MCMC framework.

    The use of null distributions for variance components can aid study design and the interpretation of results from MCMC‐based models. We hope that this manuscript will make empiricists using mixed models think more carefully about their results, what descriptive statistics they present and what inference they can make.

     
  4. Abstract

    In this paper, we propose a sparse Bayesian procedure with global and local (GL) shrinkage priors for the problems of variable selection and classification in high‐dimensional logistic regression models. In particular, we consider two types of GL shrinkage priors for the regression coefficients, the horseshoe (HS) prior and the normal‐gamma (NG) prior, and then specify a correlated prior for the binary vector to distinguish models with the same size. The GL priors are then combined with mixture representations of the logistic distribution to construct a hierarchical Bayes model that allows efficient implementation of a Markov chain Monte Carlo (MCMC) algorithm to generate samples from the posterior distribution. We carry out simulations to compare the finite sample performance of the proposed Bayesian method with that of existing Bayesian methods in terms of the accuracy of variable selection and prediction. Finally, two real‐data applications are provided for illustrative purposes.

     
  5. Abstract

    Ecologists use classifications of individuals in categories to understand composition of populations and communities. These categories might be defined by demographics, functional traits, or species. Assignment of categories is often imperfect, but frequently treated as observations without error. When individuals are observed but not classified, these “partial” observations must be modified to include the missing data mechanism to avoid spurious inference.

    We developed two hierarchical Bayesian models to overcome the assumption of perfect assignment to mutually exclusive categories in the multinomial distribution of categorical counts, when classifications are missing. These models incorporate auxiliary information to adjust the posterior distributions of the proportions of membership in categories. In one model, we use an empirical Bayes approach, where a subset of data from one year serves as a prior for the missing data the next. In the other approach, we use a small random sample of data within a year to inform the distribution of the missing data.

    We performed a simulation to show the bias that occurs when partial observations were ignored and demonstrated the altered inference for the estimation of demographic ratios. We applied our models to demographic classifications of elk (Cervus elaphus nelsoni) to demonstrate improved inference for the proportions of sex and stage classes.

    We developed multiple modeling approaches using a generalizable nested multinomial structure to account for partially observed data that were missing not at random for classification counts. Accounting for classification uncertainty is important to accurately understand the composition of populations and communities in ecological studies.

     