Title: Sample Size Requirements for Applying Diagnostic Classification Models
Results of a comprehensive simulation study are reported investigating the effects of sample size, test length, number of attributes, and base rate of mastery on item parameter recovery and classification accuracy of four DCMs (i.e., C-RUM, DINA, DINO, and LCDMREDUCED). Effects were evaluated using bias and RMSE computed between true (i.e., generating) parameters and estimated parameters. Effects of the simulated factors on attribute assignment were also evaluated using the percentage of correct classifications. More precise estimates of item parameters were obtained with larger sample sizes and longer tests. Recovery of item parameters decreased as the number of attributes increased from three to five, but base rate of mastery had a varying effect on item recovery. Item parameter recovery and classification accuracy were higher for the DINA and DINO models.
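The recovery measures named in the abstract can be illustrated with a minimal sketch. The function names, the use of NumPy, and the toy numbers are illustrative assumptions, not the study's code; the abstract does not state how values were aggregated across items or replications, so plain averages are used here.

```python
import numpy as np


def bias(true, est):
    """Mean signed error: average of (estimate - generating value)."""
    true = np.asarray(true, dtype=float)
    est = np.asarray(est, dtype=float)
    return float(np.mean(est - true))


def rmse(true, est):
    """Root mean square error between generating and estimated values."""
    true = np.asarray(true, dtype=float)
    est = np.asarray(est, dtype=float)
    return float(np.sqrt(np.mean((est - true) ** 2)))


def classification_accuracy(true_profiles, est_profiles):
    """Percentage of attribute-level mastery classifications that agree.

    Both inputs are 0/1 arrays of shape (examinees, attributes).
    Attribute-wise agreement is an assumption; a study could instead
    score agreement on the whole attribute profile.
    """
    true_profiles = np.asarray(true_profiles)
    est_profiles = np.asarray(est_profiles)
    return float(100.0 * np.mean(true_profiles == est_profiles))


if __name__ == "__main__":
    # Toy values only; these are not results from the study.
    true_items = [0.20, 0.40, 0.10]
    est_items = [0.25, 0.35, 0.15]
    print("bias:", bias(true_items, est_items))
    print("RMSE:", rmse(true_items, est_items))

    true_mastery = [[1, 0, 1], [0, 0, 1]]
    est_mastery = [[1, 0, 0], [0, 0, 1]]
    print("accuracy (%):", classification_accuracy(true_mastery, est_mastery))
```

Bias keeps the sign of the error, so over- and underestimates can cancel, while RMSE does not; reporting both, as the abstract describes, separates systematic error from overall imprecision.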
Award ID(s):
1813760
PAR ID:
10288223
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Frontiers in Psychology
Volume:
11
ISSN:
1664-1078
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The purpose of this study was to examine the effects of different data conditions on item parameter recovery and classification accuracy of three dichotomous mixture item response theory (IRT) models: the Mix1PL, Mix2PL, and Mix3PL. Manipulated factors in the simulation included the sample size (11 different sample sizes from 100 to 5,000), test length (10, 30, and 50), number of classes (2 and 3), the degree of latent class separation (normal/no separation, small, medium, and large), and class sizes (equal vs. nonequal). Effects were assessed using root mean square error (RMSE) and classification accuracy percentage computed between true parameters and estimated parameters. The results of this simulation study showed that more precise estimates of item parameters were obtained with larger sample sizes and longer test lengths. Recovery of item parameters worsened as the number of classes increased and the sample size decreased. Classification accuracy was also better for two-class solutions than for three-class solutions. Both item parameter estimates and classification accuracy differed by model type. More complex models and models with larger class separations produced less accurate results. Mixture proportions also affected RMSE and classification accuracy differently: groups of equal size produced more precise item parameter estimates, but the reverse was the case for classification accuracy. Results suggested that dichotomous mixture IRT models required more than 2,000 examinees to produce stable results, since even shorter tests needed samples that large for precise estimates. This number increased as the number of latent classes, the degree of separation, and model complexity increased.
  2. Online calibration estimates new item parameters alongside previously calibrated items, supporting efficient item replenishment. However, most existing online calibration procedures for Cognitive Diagnostic Computerized Adaptive Testing (CD‐CAT) lack mechanisms to ensure content balance during live testing. This limitation can lead to uneven content coverage, potentially undermining the alignment with instructional goals. This research extends the current calibration framework by integrating a two‐phase test design with a content‐balancing item selection method into the online calibration procedure. Simulation studies evaluated item parameter recovery and attribute profile estimation accuracy under the proposed procedure. Results indicated that the developed procedure yielded more accurate new item parameter estimates. The procedure also maintained content representativeness under both balanced and unbalanced constraints. Attribute profile estimation was sensitive to item parameter values. Accuracy declined when items had larger parameter values. Calibration improved with larger sample sizes and smaller parameter values. Longer test lengths contributed more to profile estimation than to new item calibration. These findings highlight design trade‐offs in adaptive item replenishment and suggest new directions for hybrid calibration methods.
  3. Selected response items and constructed response (CR) items are often found in the same test. Conventional psychometric models for these two types of items typically focus on using the scores for correctness of the responses. Recent research suggests, however, that more information may be available from the CR items than just scores for correctness. In this study, we describe an approach in which a statistical topic model along with a diagnostic classification model (DCM) was applied to a mixed item format formative test of English and Language Arts. The DCM was used to estimate students’ mastery status of reading skills. These mastery statuses were then included in a topic model as covariates to predict students’ use of each of the latent topics in their written answers to a CR item. This approach enabled investigation of the effects of mastery status of reading skills on writing patterns. Results indicated that one of the skills, Integration of Knowledge and Ideas, helped detect and explain students’ writing patterns with respect to students’ use of individual topics.
  4. Recent empirical studies have quantified correlation between survival and recovery by estimating these parameters as correlated random effects with hierarchical Bayesian multivariate models fit to tag‐recovery data. In these applications, increasingly negative correlation between survival and recovery has been interpreted as evidence for increasingly additive harvest mortality. The power of these hierarchical models to detect nonzero correlations has rarely been evaluated, and these few studies have not focused on tag‐recovery data, which is a common data type. We assessed the power of multivariate hierarchical models to detect negative correlation between annual survival and recovery. Using three priors for multivariate normal distributions, we fit hierarchical effects models to a mallard (Anas platyrhynchos) tag‐recovery data set and to simulated data with sample sizes corresponding to different levels of monitoring intensity. We also demonstrate more robust summary statistics for tag‐recovery data sets than total individuals tagged. Different priors led to substantially different estimates of correlation from the mallard data. Our power analysis of simulated data indicated most prior distribution and sample size combinations could not estimate strongly negative correlation with useful precision or accuracy. Many correlation estimates spanned the available parameter space (−1,1) and underestimated the magnitude of negative correlation. Only one prior combined with our most intensive monitoring scenario provided reliable results. Underestimating the magnitude of correlation coincided with overestimating the variability of annual survival, but not annual recovery. The inadequacy of prior distributions and sample size combinations previously assumed adequate for obtaining robust inference from tag‐recovery data represents a concern in the application of Bayesian hierarchical models to tag‐recovery data. Our analysis approach provides a means for examining prior influence and sample size on hierarchical models fit to capture–recapture data while emphasizing transferability of results between empirical and simulation studies.
  5. Resilience is broadly understood as the ability of an ecological system to resist and recover from perturbations acting on species abundances and on the system's structure. However, one of the main problems in assessing resilience is to understand the extent to which measures of recovery and resistance provide complementary information about a system. While recovery from abundance perturbations has a strong tradition under the analysis of dynamical stability, it is unclear whether this same formalism can be used to measure resistance to structural perturbations (e.g. perturbations to model parameters). Here, we provide a framework grounded on dynamical and structural stability in Lotka–Volterra systems to link recovery from small perturbations on species abundances (i.e. dynamical indicators) with resistance to parameter perturbations of any magnitude (i.e. structural indicators). We use theoretical and experimental multispecies systems to show that the faster the recovery from abundance perturbations, the higher the resistance to parameter perturbations. We first use theoretical systems to show that the return rate along the slowest direction after a small random abundance perturbation (what we call full recovery) is negatively correlated with the largest random parameter perturbation that a system can withstand before losing any species (what we call full resistance). We also show that the return rate along the second fastest direction after a small random abundance perturbation (what we call partial recovery) is negatively correlated with the largest random parameter perturbation that a system can withstand before at most one species survives (what we call partial resistance). Then, we use a dataset of experimental microbial systems to confirm our theoretical expectations and to demonstrate that full and partial components of resilience are complementary. Our findings reveal that we can obtain the same level of information about resilience by measuring either a dynamical (i.e. recovery) or a structural (i.e. resistance) indicator. Irrespective of the chosen indicator (dynamical or structural), our results show that we can obtain additional information by separating the indicator into its full and partial components. We believe these results can motivate new theoretical approaches and empirical analyses to increase our understanding about risk in ecological systems.