This content will become publicly available on February 21, 2026

Title: Decomposition of WAIC for assessing the information gain with application to educational testing
Abstract: Multidimensional data are now often available from educational testing. A natural question is whether additional dimensions of data are useful in fitting the item response data. To address this question, we develop a new decomposition of the Widely Applicable Information Criterion (WAIC) via the posterior predictive ordinate (PPO) under the joint model for the response, response time, and two additional educational testing scores. Based on this decomposition, we propose a new model assessment criterion that determines which of the response time and the two additional scores is most useful in fitting the response data, and whether further dimensions are needed once one of them is already included in the joint model with the response data. In addition, an efficient Monte Carlo method is developed to compute the PPO. An extensive simulation study is conducted to examine the empirical performance of the proposed joint model and the model assessment criterion in the psychological setting. The proposed methodology is further applied to a real dataset from a computerized educational assessment program.
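For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of how a PPO and the standard WAIC are commonly estimated from posterior draws. It is not the decomposition or the Monte Carlo method proposed in the paper, whose details the abstract does not give. It assumes a hypothetical matrix `log_lik` of shape (S, n) holding the log-likelihood of each of n observations under each of S posterior draws, and it uses the widely cited deviance-scale form WAIC = -2 (lppd - p_waic), with p_waic taken as the posterior variance of the pointwise log-likelihood. The function names `log_ppo` and `waic` are illustrative.

```python
import numpy as np


def log_ppo(log_lik):
    """Monte Carlo estimate of log PPO_i = log mean_s p(y_i | theta_s).

    log_lik: array of shape (S, n), log p(y_i | theta_s) for S posterior
    draws and n observations (a hypothetical input for this sketch).
    """
    log_lik = np.asarray(log_lik, dtype=float)
    n_draws = log_lik.shape[0]
    # Log-sum-exp over draws, shifted by the column maximum for stability.
    col_max = log_lik.max(axis=0)
    summed = np.exp(log_lik - col_max).sum(axis=0)
    return col_max + np.log(summed) - np.log(n_draws)


def waic(log_lik):
    """Standard WAIC on the deviance scale (assumed form, not the paper's)."""
    log_lik = np.asarray(log_lik, dtype=float)
    lppd = log_ppo(log_lik).sum()
    # Effective number of parameters: posterior variance of the
    # pointwise log-likelihood, summed over observations.
    p_waic = log_lik.var(axis=0, ddof=1).sum()
    return -2.0 * (lppd - p_waic)


if __name__ == "__main__":
    # Toy illustration with made-up data: Bernoulli responses and
    # conjugate Beta posterior draws of the success probability.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=50)
    p_draws = rng.beta(1 + y.sum(), 1 + (1 - y).sum(), size=2000)
    ll = y * np.log(p_draws[:, None]) + (1 - y) * np.log1p(-p_draws[:, None])
    print("WAIC:", waic(ll))
```

Because the pointwise log PPO values sum to the log pointwise predictive density, a criterion built on the PPO can in principle be split by observation or by data component, which is the general idea behind decomposing such a criterion.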
Award ID(s):
1848451
PAR ID:
10636507
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Wiley
Date Published:
Journal Name:
British Journal of Mathematical and Statistical Psychology
ISSN:
0007-1102
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. von Davier, Matthias (Ed.)
    Computerized assessment provides rich multidimensional data including trial-by-trial accuracy and response time (RT) measures. A key question in modeling this type of data is how to incorporate RT data, for example, in aid of ability estimation in item response theory (IRT) models. To address this, we propose a joint model consisting of a two-parameter IRT model for the dichotomous item response data, a log-normal model for the continuous RT data, and a normal model for corresponding paper-and-pencil scores. Then, we reformulate and reparameterize the model to capture the relationship between the model parameters, to facilitate the prior specification, and to make the Bayesian computation more efficient. Further, we propose several new model assessment criteria based on the decomposition of the deviance information criterion (DIC) and the logarithm of the pseudo-marginal likelihood (LPML). The proposed criteria can quantify the improvement in the fit of one part of the multidimensional data given the other parts. Finally, we have conducted several simulation studies to examine the empirical performance of the proposed model assessment criteria and have illustrated the application of these criteria using a real dataset from a computerized educational assessment program.
  2. Multidimensional Item Response Theory (MIRT) is widely used in educational and psychological assessment and evaluation. With the increasing size of modern assessment data, many existing estimation methods become computationally demanding and hence they are not scalable to big data, especially for the multidimensional three-parameter and four-parameter logistic models (i.e., M3PL and M4PL). To address this issue, we propose an importance-weighted sampling enhanced Variational Autoencoder (VAE) approach for the estimation of M3PL and M4PL. The key idea is to adopt a variational inference procedure in machine learning literature to approximate the intractable marginal likelihood, and further use importance-weighted samples to boost the trained VAE with a better log-likelihood approximation. Simulation studies are conducted to demonstrate the computational efficiency and scalability of the new algorithm in comparison to the popular alternative algorithms, i.e., Monte Carlo EM and Metropolis-Hastings Robbins-Monro methods. The good performance of the proposed method is also illustrated by a NAEP multistage testing data set. 
  3. This paper proposes a classification scheme for categorization of PDC educational resources. We have also proposed an evaluation framework for assessing the PDC resources. Under the proposed framework, each resource type has a set of criteria and an associated score. A PDC resource will obtain a score if evaluated under our proposed framework that is the sum of the scores of the criteria that the resource satisfies. The evaluation of whether a resource met a criterion is subjective. We have also presented our evaluation of PDC educational resources appropriate for CS1, CS2 (Computer Science 1 and 2), and DS/A (Data Structures and Algorithms) available on the web using our proposed framework. 
  4. Exoskeletons and robots have been used as a common practice to assist and automate rehabilitation exercises. Exoskeleton fitting and alignment are important factors and challenges that need to be addressed for smooth and safe operation and better outcomes. Such challenges often dictate the exoskeleton design approaches. Some focus on simplifying and mimicking human joints (joint-based) while others focus on a specific task (task-based), which does not need to align with the corresponding limb joint/s to generate the desired anatomical motion. In this study, the two design approaches are assessed in an elbow flexion-extension task. The muscle responses have been collected and compared with and without the exoskeletons. Based on 6 participants with no disability, the normalized Electromyography (EMG) RMS values are plotted. The plot profiles and magnitudes are used as a basis to assess the exoskeleton alignment. For this specific task, the task-based exoskeleton has shown a profile closer to the one without an exoskeleton, with support roughly equivalent to that of the joint-based one; the latter is evidenced through most subjects’ muscle response magnitudes. These preliminary data show a good methodology and insight towards the assessment of exoskeletons, but more human subject data are needed with different task combinations to further strengthen the findings.
  5. To overcome the curse of dimensionality in joint probability learning, recent work has proposed to recover the joint probability mass function (PMF) of an arbitrary number of random variables (RVs) from three-dimensional marginals, exploiting the uniqueness of tensor decomposition and the (unknown) dependence among the RVs. Nonetheless, accurately estimating three-dimensional marginals is still costly in terms of sample complexity. Tensor decomposition also poses a computationally intensive optimization problem. This work puts forth a new framework that learns the joint PMF using pairwise marginals that are relatively easy to acquire. The method is built upon nonnegative matrix factorization (NMF) theory, and features a Gram–Schmidt-like economical algorithm that works provably well under realistic conditions. Theoretical analysis of a recently proposed expectation maximization (EM) algorithm for joint PMF recovery is also presented. In particular, the EM algorithm is shown to provably improve upon the proposed pairwise marginal-based approach. Synthetic and real-data experiments are employed to showcase the effectiveness of the proposed approach.