Title: Empirical Bayes Confidence Intervals Shrinking Both Means and Variances

We construct empirical Bayes intervals for a large number p of means. The existing intervals in the literature assume that the variances $\sigma_i^2$ are either equal or unequal but known. When the variances are unequal and unknown, the usual suggestion is to replace them by unbiased estimators $S_i^2$. However, when p is large, there is an advantage in letting the variance estimators ‘borrow strength’ from each other. We derive double-shrinkage intervals for means on the basis of our empirical Bayes estimators that shrink both the means and the variances. Analytical and simulation studies and an application to a real data set show that, compared with the t-intervals, our intervals have higher coverage probabilities while being shorter on average. The double-shrinkage intervals are on average shorter than the intervals from shrinking the means alone and are never longer than the intervals from shrinking the variances alone. Also, the intervals are explicitly defined and can be computed immediately.
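To make the baseline concrete, the following is a minimal sketch of the per-mean t-intervals that the abstract compares against, together with a toy illustration of ‘borrowing strength’ across variance estimates. It is not the authors' double-shrinkage construction: the function names, the fixed shrinkage `weight`, the toy data and the hard-coded critical value are all assumptions made for illustration.

```python
from statistics import mean, variance


def t_intervals(samples, t_crit):
    """Per-mean t-intervals: xbar_i +/- t_crit * sqrt(S_i^2 / n_i).

    t_crit is supplied by the caller (assumption: every group has the
    same size, so one critical value serves all groups).
    """
    intervals = []
    for x in samples:
        n = len(x)
        xbar = mean(x)
        s2 = variance(x)  # unbiased sample variance S_i^2
        half = t_crit * (s2 / n) ** 0.5
        intervals.append((xbar - half, xbar + half))
    return intervals


def shrink_variances(s2_list, weight):
    """Illustrative only: pull each S_i^2 toward the average of all S_i^2.

    The fixed weight is a hypothetical choice, not an empirical Bayes
    estimate and not the estimator derived in the paper.
    """
    pooled = mean(s2_list)
    return [weight * pooled + (1 - weight) * s2 for s2 in s2_list]


# Toy data: p = 2 groups with n = 5 observations each (hypothetical).
samples = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 2.0, 3.0, 3.0, 5.0]]
T_CRIT = 2.776  # 0.975 quantile of the t distribution with 4 degrees of freedom

intervals = t_intervals(samples, T_CRIT)
s2 = [variance(x) for x in samples]
shrunk = shrink_variances(s2, weight=0.5)
```

With only two groups the pooling does little, but with large p the average of the $S_i^2$ is far more stable than any single $S_i^2$, which is the intuition behind shrinking the variances.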

Publisher / Repository:
Oxford University Press
Journal Name:
Journal of the Royal Statistical Society Series B: Statistical Methodology
Page Range / eLocation ID:
p. 265-285
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    Benjamini and Yekutieli suggested that it is important to correct for multiplicity when only some selected confidence intervals are reported. They introduced the concept of the false coverage rate (FCR) for confidence intervals, which parallels the false discovery rate in the multiple-hypothesis testing problem, and they developed confidence intervals for selected parameters which control the FCR. Their approach requires the FCR to be controlled in the frequentist sense, i.e. controlled for all possible values of the unknown parameters. In modern applications, the number of parameters can be large, in the tens of thousands or even more, as in microarray experiments. We propose a less conservative criterion, the Bayes FCR, and study confidence intervals controlling it for a class of distributions. The Bayes FCR refers to the average FCR with respect to a distribution of parameters. Under such a criterion, we propose some confidence intervals, which, by some analytic and numerical calculations, are demonstrated to have the Bayes FCR controlled at level q for a class of prior distributions, including mixtures of normal distributions and zero, where the mixing probability is unknown. The confidence intervals are shrinkage-type procedures which are more efficient for the $\theta_i$ that have a sparsity structure, which is a common feature of microarray data. More importantly, the centre of the proposed shrinkage intervals reduces much of the bias due to selection. Consequently, the proposed empirical Bayes intervals are always shorter in average length than the intervals of Benjamini and Yekutieli and can be only 50% or 60% as long in some cases. We apply these procedures to the data of Choe and colleagues and obtain similar results.

  2. We construct robust empirical Bayes confidence intervals (EBCIs) in a normal means problem. The intervals are centered at the usual linear empirical Bayes estimator, but use a critical value accounting for shrinkage. Parametric EBCIs that assume a normal distribution for the means (Morris, 1983b) may substantially undercover when this assumption is violated. In contrast, our EBCIs control coverage regardless of the means distribution, while remaining close in length to the parametric EBCIs when the means are indeed Gaussian. If the means are treated as fixed, our EBCIs have an average coverage guarantee: the coverage probability is at least 1 − α on average across the n EBCIs for each of the means. Our empirical application considers the effects of U.S. neighborhoods on intergenerational mobility.
  3. Multiple papers have studied the use of gene‐environment (GE) independence to enhance power for testing gene‐environment interaction in case‐control studies. However, studies that evaluate the role of GE independence in a meta‐analysis framework are limited. In this paper, we extend the single‐study empirical Bayes type shrinkage estimators proposed by Mukherjee and Chatterjee (2008) to a meta‐analysis setting that adjusts for uncertainty regarding the assumption of GE independence across studies. We use the retrospective likelihood framework to derive an adaptive combination of estimators obtained under the constrained model (assuming GE independence) and the unconstrained model (without assumptions of GE independence), with weights determined by measures of GE association derived from multiple studies. Our simulation studies indicate that this newly proposed estimator has better average performance across different simulation scenarios than the standard alternative of using inverse variance (covariance) weighted estimators that combine study‐specific constrained, unconstrained, or empirical Bayes estimators. The results are illustrated by meta‐analyzing 6 different studies of type 2 diabetes investigating interactions between genetic markers on the obesity‐related FTO gene and the environmental factors body mass index and age.

  4. Summary

    We consider the problem of empirical Bayes estimation of multiple variances when provided with sample variances. Assuming an arbitrary prior on the variances, we derive different versions of the Bayes estimators using different loss functions. For one particular loss function, the resulting Bayes estimator relies on the marginal cumulative distribution function of the sample variances only. When replacing it with the empirical distribution function, we obtain an empirical Bayes version called the $F$-modelling-based empirical Bayes estimator of variances. We provide theoretical properties of this estimator, and further demonstrate its advantages through extensive simulations and real data analysis.

  5. Abstract

    Many large‐scale surveys collect both discrete and continuous variables. Small‐area estimates may be desired for means of continuous variables, proportions in each level of a categorical variable, or for domain means defined as the mean of the continuous variable for each level of the categorical variable. In this paper, we introduce a conditionally specified bivariate mixed‐effects model for small‐area estimation, and provide a necessary and sufficient condition under which the conditional distributions render a valid joint distribution. The conditional specification allows better model interpretation. We use the valid joint distribution to calculate empirical Bayes predictors and use the parametric bootstrap to estimate the mean squared error. Simulation studies demonstrate the superior performance of the bivariate mixed‐effects model relative to univariate model estimators. We apply the bivariate mixed‐effects model to construct estimates for small watersheds using data from the Conservation Effects Assessment Project, a survey developed to quantify the environmental impacts of conservation efforts. We construct predictors of mean sediment loss, the proportion of land where the soil loss tolerance is exceeded, and the average sediment loss on land where the soil loss tolerance is exceeded. In the data analysis, the bivariate mixed‐effects model leads to more scientifically interpretable estimates of domain means than those based on two independent univariate models.

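As a point of reference for item 1 in the list above, the FCR-adjusted intervals of Benjamini and Yekutieli can be sketched as follows: if R of m parameters are selected, each selected parameter gets a marginal interval at level 1 − Rq/m instead of 1 − q. The sketch assumes normally distributed estimators with known standard errors and uses a simple threshold rule for selection; the function name, the selection rule and the toy numbers are assumptions for illustration, and the Bayes FCR intervals proposed in that abstract are not reproduced here.

```python
from statistics import NormalDist


def fcr_adjusted_intervals(estimates, std_errors, selected, q):
    """Marginal normal intervals at level 1 - R*q/m for the selected indices.

    R is the number of selected parameters and m the total number.
    Assumes each estimate is normal with a known standard error.
    """
    m = len(estimates)
    r = len(selected)
    if r == 0:
        return {}
    alpha = r * q / m  # adjusted non-coverage level
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return {
        i: (estimates[i] - z * std_errors[i], estimates[i] + z * std_errors[i])
        for i in selected
    }


# Toy data (hypothetical): m = 4 estimates with unit standard errors.
estimates = [3.1, 0.4, -2.6, 0.9]
std_errors = [1.0, 1.0, 1.0, 1.0]

# Hypothetical selection rule: keep parameters with |estimate / se| > 2.
selected = [
    i for i, (e, s) in enumerate(zip(estimates, std_errors)) if abs(e / s) > 2.0
]

fcr_intervals = fcr_adjusted_intervals(estimates, std_errors, selected, q=0.1)
```

Because R/m is at most 1, the adjusted intervals are never narrower than the unadjusted 1 − q intervals, which is the conservativeness that a shrinkage-based alternative aims to reduce.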