skip to main content


Title: Using the Standardized Root Mean Squared Residual (SRMR) to Assess Exact Fit in Structural Equation Models
We examine the accuracy of p values obtained using the asymptotic mean and variance (MV) correction to the distribution of the sample standardized root mean squared residual (SRMR) proposed by Maydeu-Olivares to assess the exact fit of SEM models. In a simulation study, we found that under normality, the MV-corrected SRMR statistic provides reasonably accurate Type I errors even in small samples and for large models, clearly outperforming the current standard, that is, the likelihood ratio (LR) test. When data shows excess kurtosis, MV-corrected SRMR p values are only accurate in small models ( p = 10), or in medium-sized models ( p = 30) if no skewness is present and sample sizes are at least 500. Overall, when data are not normal, the MV-corrected LR test seems to outperform the MV-corrected SRMR. We elaborate on these findings by showing that the asymptotic approximation to the mean of the SRMR sampling distribution is quite accurate, while the asymptotic approximation to the standard deviation is not.  more » « less
Award ID(s):
1659936
NSF-PAR ID:
10300396
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Educational and Psychological Measurement
Volume:
81
Issue:
1
ISSN:
0013-1644
Page Range / eLocation ID:
110 to 130
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This study introduces the statistical theory of using the Standardized Root Mean Squared Error (SRMR) to test close fit in ordinal factor analysis. We also compare the accuracy of confidence intervals (CIs) and tests of close fit based on the Standardized Root Mean Squared Error (SRMR) with those obtained based on the Root Mean Squared Error of Approximation (RMSEA). We use Unweighted Least Squares (ULS) estimation with a mean and variance corrected test statistic. The current (biased) implementation for the RMSEA never rejects that a model fits closely when data are binary and almost invariably rejects the model in large samples if data consist of five categories. The unbiased RMSEA produces better rejection rates, but it is only accurate enough when the number of variables is small (e.g., p = 10) and the degree of misfit is small. In contrast, across all simulated conditions, the tests of close fit based on the SRMR yield acceptable type I error rates. SRMR tests of close fit are also more powerful than those using the unbiased RMSEA. 
    more » « less
  2. null (Ed.)
    Summary This paper is concerned with empirical likelihood inference on the population mean when the dimension $p$ and the sample size $n$ satisfy $p/n\rightarrow c\in [1,\infty)$. As shown in Tsao (2004), the empirical likelihood method fails with high probability when $p/n>1/2$ because the convex hull of the $n$ observations in $\mathbb{R}^p$ becomes too small to cover the true mean value. Moreover, when $p> n$, the sample covariance matrix becomes singular, and this results in the breakdown of the first sandwich approximation for the log empirical likelihood ratio. To deal with these two challenges, we propose a new strategy of adding two artificial data points to the observed data. We establish the asymptotic normality of the proposed empirical likelihood ratio test. The proposed test statistic does not involve the inverse of the sample covariance matrix. Furthermore, its form is explicit, so the test can easily be carried out with low computational cost. Our numerical comparison shows that the proposed test outperforms some existing tests for high-dimensional mean vectors in terms of power. We also illustrate the proposed procedure with an empirical analysis of stock data. 
    more » « less
  3. Mateu, Jorge (Ed.)
    When dealing with very high-dimensional and functional data, rank deficiency of sample covariance matrix often complicates the tests for population mean. To alleviate this rank deficiency problem, Munk et al. (J Multivar Anal 99:815–833, 2008) proposed neighborhood hypothesis testing procedure that tests whether the population mean is within a small, pre-specified neighborhood of a known quantity, M. How could we objectively specify a reasonable neighborhood, particularly when the sample space is unbounded? What should be the size of the neighborhood? In this article, we develop the modified neighborhood hypothesis testing framework to answer these two questions.We define the neighborhood as a proportion of the total amount of variation present in the population of functions under study and proceed to derive the asymptotic null distribution of the appropriate test statistic. Power analyses suggest that our approach is appropriate when sample space is unbounded and is robust against error structures with nonzero mean. We then apply this framework to assess whether the near-default sigmoidal specification of dose-response curves is adequate for widely used CCLE database. Results suggest that our methodology could be used as a pre-processing step before using conventional efficacy metrics, obtained from sigmoid models (for example: IC50 or AUC), as downstream predictive targets. 
    more » « less
  4. Abstract Obtaining lightweight and accurate approximations of discretized objective functional Hessians in inverse problems governed by partial differential equations (PDEs) is essential to make both deterministic and Bayesian statistical large-scale inverse problems computationally tractable. The cubic computational complexity of dense linear algebraic tasks, such as Cholesky factorization, that provide a means to sample Gaussian distributions and determine solutions of Newton linear systems is a computational bottleneck at large-scale. These tasks can be reduced to log-linear complexity by utilizing hierarchical off-diagonal low-rank (HODLR) matrix approximations. In this work, we show that a class of Hessians that arise from inverse problems governed by PDEs are well approximated by the HODLR matrix format. In particular, we study inverse problems governed by PDEs that model the instantaneous viscous flow of ice sheets. In these problems, we seek a spatially distributed basal sliding parameter field such that the flow predicted by the ice sheet model is consistent with ice sheet surface velocity observations. We demonstrate the use of HODLR Hessian approximation to efficiently sample the Laplace approximation of the posterior distribution with covariance further approximated by HODLR matrix compression. Computational studies are performed which illustrate ice sheet problem regimes for which the Gauss–Newton data-misfit Hessian is more efficiently approximated by the HODLR matrix format than the low-rank (LR) format. We then demonstrate that HODLR approximations can be favorable, when compared to global LR approximations, for large-scale problems by studying the data-misfit Hessian associated with inverse problems governed by the first-order Stokes flow model on the Humboldt glacier and Greenland ice sheet. 
    more » « less
  5. Summary

    We introduce an L2-type test for testing mutual independence and banded dependence structure for high dimensional data. The test is constructed on the basis of the pairwise distance covariance and it accounts for the non-linear and non-monotone dependences among the data, which cannot be fully captured by the existing tests based on either Pearson correlation or rank correlation. Our test can be conveniently implemented in practice as the limiting null distribution of the test statistic is shown to be standard normal. It exhibits excellent finite sample performance in our simulation studies even when the sample size is small albeit the dimension is high and is shown to identify non-linear dependence in empirical data analysis successfully. On the theory side, asymptotic normality of our test statistic is shown under quite mild moment assumptions and with little restriction on the growth rate of the dimension as a function of sample size. As a demonstration of good power properties for our distance-covariance-based test, we further show that an infeasible version of our test statistic has the rate optimality in the class of Gaussian distributions with equal correlation.

     
    more » « less