Title: A nested error regression model with high-dimensional parameter for small area estimation
Abstract: In this paper, we propose a flexible nested error regression small area model with high-dimensional parameter that incorporates heterogeneity in regression coefficients and variance components. We develop a new robust small area-specific estimating equations method that allows appropriate pooling of a large number of areas in estimating small area-specific model parameters. We propose a parametric bootstrap and jackknife method to estimate not only the mean squared errors but also other commonly used uncertainty measures such as standard errors and coefficients of variation. We conduct both model-based and design-based simulation experiments and real-life data analysis to evaluate the proposed methodology.
Award ID(s):
1758808
PAR ID:
10396408
Author(s) / Creator(s):
;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Journal of the Royal Statistical Society Series B: Statistical Methodology
Volume:
85
Issue:
2
ISSN:
1369-7412
Page Range / eLocation ID:
p. 212-239
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract In this article, we introduce a functional structural equation model for estimating directional relations from multivariate functional data. We decouple the estimation into two major steps: directional order determination and selection through sparse functional regression. We first propose a score function at the linear operator level, and show that its minimization can recover the true directional order when the relation between each function and its parental functions is nonlinear. We then develop a sparse functional additive regression, where both the response and the multivariate predictors are functions and the regression relation is additive and nonlinear. We also propose strategies to speed up the computation and scale up our method. In theory, we establish the consistencies of order determination, sparse functional additive regression, and directed acyclic graph estimation, while allowing both the dimension of the Karhunen–Loève expansion coefficients and the number of random functions to diverge with the sample size. We illustrate the efficacy of our method through simulations, and an application to brain effective connectivity analysis.
  2. Abstract Linear regression is arguably the most widely used statistical method. With fixed regressors and correlated errors, the conventional wisdom is to modify the variance-covariance estimator to accommodate the known correlation structure of the errors. We depart from existing literature by showing that with random regressors, linear regression inference is robust to correlated errors with unknown correlation structure. The existing theoretical analyses for linear regression are no longer valid because even the asymptotic normality of the least squares coefficients breaks down in this regime. We first prove the asymptotic normality of the t statistics by establishing their Berry–Esseen bounds based on a novel probabilistic analysis of self-normalized statistics. We then study the local power of the corresponding t tests and show that, perhaps surprisingly, error correlation can even enhance power in the regime of weak signals. Overall, our results show that linear regression is applicable more broadly than the conventional theory suggests, and they further demonstrate the value of randomization for ensuring robustness of inference. 
  3. Summary Computerised Record Linkage methods help us combine multiple data sets from different sources when a single data set with all necessary information is unavailable or when data collection on additional variables is time consuming and extremely costly. Linkage errors are inevitable in the linked data set because of the unavailability of error‐free unique identifiers. A small amount of linkage errors can lead to substantial bias and increased variability in estimating parameters of a statistical model. In this paper, we propose a unified theory for statistical analysis with linked data. Our proposed method, unlike the ones available for secondary data analysis of linked data, exploits record linkage process data as an alternative to taking a costly sample to evaluate error rates from the record linkage procedure. A jackknife method is introduced to estimate bias, covariance matrix and mean squared error of our proposed estimators. Simulation results are presented to evaluate the performance of the proposed estimators that account for linkage errors. 
  4. Subsampling is a practical strategy for analyzing vast survival data, which are increasingly encountered across diverse research domains. While the optimal subsampling method has been applied to inferences for Cox models and parametric accelerated failure time (AFT) models, its application to semi‐parametric AFT models with rank‐based estimation has received limited attention. The challenges arise from the non‐smooth estimating function for regression coefficients and the seemingly zero contribution from censored observations in estimating functions in the commonly seen form. To address these challenges, we develop optimal subsampling probabilities for both event and censored observations by expressing the estimating functions through a well‐defined stochastic process. Meanwhile, we apply an induced smoothing procedure to the non‐smooth estimating functions. As the optimal subsampling probabilities depend on the unknown regression coefficients, we employ a two‐step procedure to obtain a feasible estimation method. An additional benefit of the method is its ability to resolve the issue of underestimation of the variance when the subsample size approaches the full sample size. We validate the performance of our estimators through a simulation study and apply the methods to analyze the survival time of lymphoma patients in the Surveillance, Epidemiology, and End Results program.
  5. Inbreeding depression can reduce the viability of wild populations. Detecting inbreeding depression in the wild is difficult; developing accurate estimates of inbreeding can be time- and labor-intensive. In this study, we used a two-step modeling procedure to incorporate uncertainty inherent in estimating individual inbreeding coefficients from multilocus genotypes into estimates of inbreeding depression in a population of Weddell seals (Leptonychotes weddellii). The two-step modeling procedure presented in this paper provides a method for estimating the magnitude of a known source of error, which is assumed absent in classic regression models, and incorporating this error into inferences about inbreeding depression. The method is essentially an errors-in-variables regression with non-normal errors in both the dependent and independent variables. These models, therefore, allow for a better evaluation of the uncertainty surrounding the biological importance of inbreeding depression in non-pedigreed wild populations. For this study we genotyped 154 adult female seals from the population in Erebus Bay, Antarctica, at 29 microsatellite loci, 12 of which are novel. We used a statistical evidence approach to inference rather than hypothesis testing because the discovery of both low and high levels of inbreeding is of scientific interest. We found evidence for an absence of inbreeding depression in lifetime reproductive success, adult survival, age at maturity, and the reproductive interval of female seals in this population.