Accurate item parameters and standard errors (SEs) are crucial for many multidimensional item response theory (MIRT) applications. A recent study proposed the Gaussian Variational Expectation Maximization (GVEM) algorithm to improve computational efficiency and estimation accuracy ( Cho et al., 2021 ). However, the SE estimation procedure has yet to be fully addressed. To tackle this issue, the present study proposed an updated supplemented expectation maximization (USEM) method and a bootstrap method for SE estimation. These two methods were compared in terms of SE recovery accuracy. The simulation results demonstrated that the GVEM algorithm with bootstrap and item priors (GVEM-BSP) outperformed the other methods, exhibiting less bias and relative bias for SE estimates under most conditions. Although the GVEM with USEM (GVEM-USEM) was the most computationally efficient method, it yielded an upward bias for SE estimates.
more »
« less
MSAEM Estimation for Confirmatory Multidimensional Four‐Parameter Normal Ogive Models
Abstract In this paper, we develop a mixed stochastic approximation expectation‐maximization (MSAEM) algorithm coupled with a Gibbs sampler to compute the marginalized maximum a posteriori estimate (MMAPE) of a confirmatory multidimensional four‐parameter normal ogive (M4PNO) model. The proposed MSAEM algorithm not only has the computational advantages of the stochastic approximation expectation‐maximization (SAEM) algorithm for multidimensional data, but it also alleviates the potential instability caused by label‐switching, and then improved the estimation accuracy. Simulation studies are conducted to illustrate the good performance of the proposed MSAEM method, where MSAEM consistently performs better than SAEM and some other existing methods in multidimensional item response theory. Moreover, the proposed method is applied to a real data set from the 2018 Programme for International Student Assessment (PISA) to demonstrate the usefulness of the 4PNO model as well as MSAEM in practice.
more »
« less
- PAR ID:
- 10469054
- Publisher / Repository:
- Wiley-Blackwell
- Date Published:
- Journal Name:
- Journal of Educational Measurement
- ISSN:
- 0022-0655
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Data harmonization is an emerging approach to strategically combining data from multiple independent studies, enabling addressing new research questions that are not answerable by a single contributing study. A fundamental psychometric challenge for data harmonization is to create commensurate measures for the constructs of interest across studies. In this study, we focus on a regularized explanatory multidimensional item response theory model (re-MIRT) for establishing measurement equivalence across instruments and studies, where regularization enables the detection of items that violate measurement invariance, also known as differential item functioning (DIF). Because the MIRT model is computationally demanding, we leverage the recently developed Gaussian Variational Expectation–Maximization (GVEM) algorithm to speed up the computation. In particular, the GVEM algorithm is extended to a more complicated and improved multi-group version with categorical covariates and Lasso penalty for re-MIRT, namely, the importance weighted GVEM with one additional maximization step (IW-GVEMM). This study aims to provide empirical evidence to support feasible uses of IW-GVEMM for re-MIRT DIF detection, providing a useful tool for integrative data analysis. Our results show that IW-GVEMM accurately estimates the model, detects DIF items, and finds a more reasonable number of DIF items in a real world dataset. The proposed method has been integrated intoRpackageVEMIRT(https://map-lab-uw.github.io/VEMIRT).more » « less
-
Abstract In the form of multidimensional arrays, tensor data have become increasingly prevalent in modern scientific studies and biomedical applications such as computational biology, brain imaging analysis, and process monitoring system. These data are intrinsically heterogeneous with complex dependencies and structure. Therefore, ad‐hoc dimension reduction methods on tensor data may lack statistical efficiency and can obscure essential findings. Model‐based clustering is a cornerstone of multivariate statistics and unsupervised learning; however, existing methods and algorithms are not designed for tensor‐variate samples. In this article, we propose a tensor envelope mixture model (TEMM) for simultaneous clustering and multiway dimension reduction of tensor data. TEMM incorporates tensor‐structure‐preserving dimension reduction into mixture modeling and drastically reduces the number of free parameters and estimative variability. An expectation‐maximization‐type algorithm is developed to obtain likelihood‐based estimators of the cluster means and covariances, which are jointly parameterized and constrained onto a series of lower dimensional subspaces known as the tensor envelopes. We demonstrate the encouraging empirical performance of the proposed method in extensive simulation studies and a real data application in comparison with existing vector and tensor clustering methods.more » « less
-
Tackling High-Dimensional Tensor Clustering In the paper “Jointly Modeling and Clustering Tensors in High Dimensions,” Cai, Zhang, and Sun address the challenge of jointly modeling and clustering tensors by introducing a high-dimensional tensor mixture model with heterogeneous covariances. The proposed mixture model exploits the intrinsic structures of tensor data. The authors develop a computationally efficient high-dimensional expectation conditional maximization (HECM) algorithm and show that the HECM iterates, with an appropriate initialization, converge geometrically to a neighborhood that is within statistical precision of the true parameter. The theoretical analysis is nontrivial because of the dual nonconvexity arising from both the expectation maximization-type estimation and the nonconvex objective function in the M step. They also study the convergence rate of the algorithm when the number of clusters is overspecified and when the signal-to-noise ratio diminishes with sample size. The efficacy of the proposed method is demonstrated through numerical experiments and a real-world medical data application.more » « less
-
We develop algorithms to automate discovery of stochastic dynamical system models from noisy, vector-valued time series. By discovery, we mean learning both a nonlinear drift vector field and a diagonal diffusion matrix for an Itô stochastic differential equation in Rd . We parameterize the vector field using tensor products of Hermite polynomials, enabling the model to capture highly nonlinear and/or coupled dynamics. We solve the resulting estimation problem using expectation maximization (EM). This involves two steps. We augment the data via diffusion bridge sampling, with the goal of producing time series observed at a higher frequency than the original data. With this augmented data, the resulting expected log likelihood maximization problem reduces to a least squares problem. We provide an open-source implementation of this algorithm. Through experiments on systems with dimensions one through eight, we show that this EM approach enables accurate estimation for multiple time series with possibly irregular observation times. We study how the EM method performs as a function of the amount of data augmentation, as well as the volume and noisiness of the data.more » « less
An official website of the United States government
