Title: Assessing Fit in Ordinal Factor Analysis Models: SRMR vs. RMSEA
This study introduces the statistical theory of using the Standardized Root Mean Squared Residual (SRMR) to test close fit in ordinal factor analysis. We also compare the accuracy of confidence intervals (CIs) and tests of close fit based on the SRMR with those based on the Root Mean Squared Error of Approximation (RMSEA). We use Unweighted Least Squares (ULS) estimation with a mean- and variance-corrected test statistic. The current (biased) implementation of the RMSEA never rejects the hypothesis that a model fits closely when data are binary, and almost invariably rejects the model in large samples when data consist of five categories. The unbiased RMSEA produces better rejection rates, but it is accurate enough only when the number of variables is small (e.g., p = 10) and the degree of misfit is small. In contrast, across all simulated conditions, the tests of close fit based on the SRMR yield acceptable Type I error rates. SRMR tests of close fit are also more powerful than those using the unbiased RMSEA.
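
As a rough illustration of the two indices compared above, the following sketch computes textbook point estimates. It assumes the SRMR is the root mean square of the residual correlations (off-diagonal elements only, as when polychoric correlations have a fixed unit diagonal) and uses the common sample formula for the RMSEA. The function names, the `include_diagonal` switch, and the example numbers are hypothetical; the sketch does not implement the mean- and variance-corrected statistic, the unbiased RMSEA, or the confidence intervals and close-fit tests evaluated in the study.

```python
import math

def srmr(sample_corr, implied_corr, include_diagonal=False):
    """Root mean square of residual correlations over the lower triangle."""
    p = len(sample_corr)
    total = 0.0
    count = 0
    for i in range(p):
        # Strict lower triangle by default; add the diagonal on request.
        last = i + 1 if include_diagonal else i
        for j in range(last):
            resid = sample_corr[i][j] - implied_corr[i][j]
            total += resid ** 2
            count += 1
    return math.sqrt(total / count)

def rmsea(chi_square, df, n):
    """Common sample RMSEA formula; some software divides by n instead of n - 1."""
    return math.sqrt(max(0.0, (chi_square - df) / (df * (n - 1))))

# Hypothetical 3-variable example: observed vs. model-implied correlations.
observed = [[1.0, 0.5, 0.4],
            [0.5, 1.0, 0.3],
            [0.4, 0.3, 1.0]]
implied = [[1.0, 0.45, 0.42],
           [0.45, 1.0, 0.36],
           [0.42, 0.36, 1.0]]

print("SRMR:", srmr(observed, implied))
print("RMSEA:", rmsea(chi_square=85.0, df=35, n=501))
```

Both functions return descriptive point estimates only; deciding whether a model fits closely requires the sampling theory that the abstract refers to.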
Journal Name:
Structural equation modeling
Sponsoring Org:
National Science Foundation
More Like this
  1. We examined the effect of estimation methods, maximum likelihood (ML), unweighted least squares (ULS), and diagonally weighted least squares (DWLS), on three population SEM (structural equation modeling) fit indices: the root mean square error of approximation (RMSEA), the comparative fit index (CFI), and the standardized root mean square residual (SRMR). We considered different types and levels of misspecification in factor analysis models: misspecified dimensionality, omitting cross-loadings, and ignoring residual correlations. Estimation methods had substantial impacts on the RMSEA and CFI so that different cutoff values need to be employed for different estimators. In contrast, SRMR is robust to the method used to estimate the model parameters. The same criterion can be applied at the population level when using the SRMR to evaluate model fit, regardless of the choice of estimation method.
  2. We examine the accuracy of p values obtained using the asymptotic mean and variance (MV) correction to the distribution of the sample standardized root mean squared residual (SRMR) proposed by Maydeu-Olivares to assess the exact fit of SEM models. In a simulation study, we found that under normality, the MV-corrected SRMR statistic provides reasonably accurate Type I errors even in small samples and for large models, clearly outperforming the current standard, that is, the likelihood ratio (LR) test. When data show excess kurtosis, MV-corrected SRMR p values are accurate only in small models (p = 10), or in medium-sized models (p = 30) if no skewness is present and sample sizes are at least 500. Overall, when data are not normal, the MV-corrected LR test seems to outperform the MV-corrected SRMR. We elaborate on these findings by showing that the asymptotic approximation to the mean of the SRMR sampling distribution is quite accurate, while the asymptotic approximation to the standard deviation is not.
  3. Background

    Deep learning (DL)‐based automatic segmentation models can expedite manual segmentation yet require resource‐intensive fine‐tuning before deployment on new datasets. The generalizability of DL methods to new datasets without fine‐tuning is not well characterized.


    Purpose

    To evaluate the generalizability of DL‐based models by deploying pretrained models on independent datasets varying by MR scanner, acquisition parameters, and subject population.

    Study Type

    Retrospective based on prospectively acquired data.


    Population

    Overall test dataset: 59 subjects (26 females); Study 1: 5 healthy subjects (zero females), Study 2: 8 healthy subjects (eight females), Study 3: 10 subjects with osteoarthritis (eight females), Study 4: 36 subjects with various knee pathology (10 females).

    Field Strength/Sequence

    A 3‐T, quantitative double‐echo steady state (qDESS).


    Assessment

    Four annotators manually segmented knee cartilage. Each reader segmented one of four qDESS datasets in the test dataset. Two DL models, one trained on qDESS data and another on Osteoarthritis Initiative (OAI)‐DESS data, were assessed. Manual and automatic segmentations were compared by quantifying variations in segmentation accuracy, volume, and T2 relaxation times for superficial and deep cartilage.

    Statistical Tests

    Dice similarity coefficient (DSC) for segmentation accuracy. Lin's concordance correlation coefficient (CCC), Wilcoxon rank‐sum tests, and root‐mean‐squared error‐coefficient‐of‐variation to quantify manual vs. automatic T2 and volume variations. Bland–Altman plots for manual vs. automatic T2 agreement. A P value < 0.05 was considered statistically significant.


    Results

    DSCs for the qDESS‐trained model, 0.79–0.93, were higher than those for the OAI‐DESS‐trained model, 0.59–0.79. T2 and volume CCCs for the qDESS‐trained model, 0.75–0.98 and 0.47–0.95, were higher than respective CCCs for the OAI‐DESS‐trained model, 0.35–0.90 and 0.13–0.84. Bland–Altman 95% limits of agreement for superficial and deep cartilage T2 were lower for the qDESS‐trained model, ±2.4 msec and ±4.0 msec, than the OAI‐DESS‐trained model, ±4.4 msec and ±5.2 msec.

    Data Conclusion

    The qDESS‐trained model may generalize well to independent qDESS datasets regardless of MR scanner, acquisition parameters, and subject population.

    Evidence Level


    Technical Efficacy

    Stage 1

  4. Purpose

    To improve the performance of neural networks for parameter estimation in quantitative MRI, in particular when the noise propagation varies throughout the space of biophysical parameters.

    Theory and Methods

    A theoretically well‐founded loss function is proposed that normalizes the squared error of each estimate with the respective Cramér–Rao bound (CRB), a theoretical lower bound for the variance of an unbiased estimator. This avoids a dominance of hard‐to‐estimate parameters and areas in parameter space, which are often of little interest. The normalization with the corresponding CRB balances the large errors of fundamentally more noisy estimates and the small errors of fundamentally less noisy estimates, allowing the network to better learn to estimate the latter. Further, the proposed loss function provides an absolute evaluation metric for performance: A network has an average loss of 1 if it is a maximally efficient unbiased estimator, which can be considered the ideal performance. The performance gain with the proposed loss function is demonstrated using the example of an eight‐parameter magnetization transfer model that is fitted to phantom and in vivo data.


    Results

    Networks trained with the proposed loss function perform close to optimal, that is, their loss converges to approximately 1, and their performance is superior to networks trained with the standard mean‐squared error (MSE). The proposed loss function reduces the bias of the estimates compared to the MSE loss, and improves the match of the noise variance to the CRB. This performance gain translates to in vivo maps that align better with the literature.


    Conclusion

    Normalizing the squared error with the CRB during the training of neural networks improves their performance in estimating biophysical parameters.

  5. Purpose

    To develop a scan‐specific model that estimates and corrects k‐space errors made when reconstructing accelerated MRI data.


    Methods

    Scan‐specific artifact reduction in k‐space (SPARK) trains a convolutional‐neural‐network to estimate and correct k‐space errors made by an input reconstruction technique by back‐propagating from the mean‐squared‐error loss between an auto‐calibration signal (ACS) and the input technique’s reconstructed ACS. First, SPARK is applied to generalized autocalibrating partially parallel acquisitions (GRAPPA) and demonstrates improved robustness over other scan‐specific models, such as robust artificial‐neural‐networks for k‐space interpolation (RAKI) and residual‐RAKI. Subsequent experiments demonstrate that SPARK synergizes with residual‐RAKI to improve reconstruction performance. SPARK also improves reconstruction quality when applied to advanced acquisition and reconstruction techniques like 2D virtual coil (VC‐) GRAPPA, 2D LORAKS, 3D GRAPPA without an integrated ACS region, and 2D/3D wave‐encoded imaging.


    Results

    SPARK yields SSIM improvement and 1.5 – 2× root mean squared error (RMSE) reduction when applied to GRAPPA and improves robustness to ACS size for various acceleration rates in comparison to other scan‐specific techniques. When applied to advanced reconstruction techniques such as residual‐RAKI, 2D VC‐GRAPPA and LORAKS, SPARK achieves up to 20% RMSE improvement. SPARK with 3D GRAPPA also improves RMSE performance by ~2×, SSIM performance, and perceived image quality without a fully sampled ACS region. Finally, SPARK synergizes with non‐Cartesian, 2D and 3D wave‐encoding imaging by reducing RMSE between 20% and 25% and providing qualitative improvements.


    Conclusion

    SPARK synergizes with physics‐based acquisition and reconstruction techniques to improve accelerated MRI by training scan‐specific models to estimate and correct reconstruction errors in k‐space.

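
The CRB-normalized loss described in item 4 can be illustrated with a minimal sketch. It assumes the loss is the squared error of each parameter estimate divided by that parameter's Cramér–Rao bound, averaged over parameters and samples. The function names, the three CRB values, and the simulated Gaussian estimator are hypothetical and are not taken from the paper, which trains neural networks on an eight-parameter model.

```python
import math
import random

def crb_normalized_loss(estimates, targets, crbs):
    """Mean of squared errors, each divided by the CRB of its parameter."""
    total = 0.0
    count = 0
    for est_row, tgt_row in zip(estimates, targets):
        for est, tgt, crb in zip(est_row, tgt_row, crbs):
            total += (est - tgt) ** 2 / crb
            count += 1
    return total / count

def mse_loss(estimates, targets):
    """Plain mean squared error over all parameters and samples."""
    total = 0.0
    count = 0
    for est_row, tgt_row in zip(estimates, targets):
        for est, tgt in zip(est_row, tgt_row):
            total += (est - tgt) ** 2
            count += 1
    return total / count

# Hypothetical setup: three parameters whose CRBs differ by orders of magnitude.
rng = random.Random(0)
crbs = [0.01, 1.0, 100.0]
true_values = [1.0, 5.0, 50.0]
n_samples = 20000

# Simulate an unbiased estimator whose variance equals the CRB (fully efficient).
targets = [true_values for _ in range(n_samples)]
estimates = [
    [t + rng.gauss(0.0, math.sqrt(c)) for t, c in zip(true_values, crbs)]
    for _ in range(n_samples)
]

crb_loss = crb_normalized_loss(estimates, targets, crbs)
plain_mse = mse_loss(estimates, targets)
print("CRB-normalized loss (expected to be close to 1):", crb_loss)
print("Plain MSE (dominated by the noisiest parameter):", plain_mse)
```

In this simulation the normalized loss lands near 1, matching the abstract's statement about a maximally efficient unbiased estimator, while the plain MSE is driven almost entirely by the parameter with the largest CRB.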