Title: Design and analysis of cluster randomized trials
Cluster randomized trials (CRTs) are commonly used to evaluate the causal effects of educational interventions; in a CRT, entire clusters (e.g., schools) are randomly assigned to treatment or control conditions. This study introduces statistical methods for designing and analyzing two-level (e.g., students nested within schools) and three-level (e.g., students nested within classrooms nested within schools) CRTs. Specifically, we use hierarchical linear models (HLMs) to account for the dependency among intervention participants within the same cluster when estimating the average treatment effects (ATEs) of educational interventions and other effects of interest (e.g., moderator and mediator effects). We demonstrate methods and tools for sample size planning and statistical power analysis. Additionally, we discuss common challenges and potential solutions in the design and analysis phases, including the effects of omitting one level of clustering, non-compliance, threats to external validity, and the cost-effectiveness of the intervention. We conclude with practical suggestions for CRT design and analysis, along with recommendations for further reading.
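To make the sample size planning mentioned in the abstract concrete, the sketch below approximates the power of a balanced two-level CRT with no covariates. It is an illustration under stated assumptions, not the tool or method demonstrated in the article: the function name `crt2_power` and all input values are hypothetical, and the calculation uses a normal approximation in place of the noncentral t distribution, so it will be slightly optimistic when the number of clusters is small.

```python
from math import sqrt
from statistics import NormalDist


def crt2_power(delta, n_clusters, cluster_size, icc,
               prop_treated=0.5, alpha=0.05):
    """Approximate two-sided power for a two-level CRT (hypothetical helper).

    delta        : standardized average treatment effect
    n_clusters   : total number of clusters (e.g., schools)
    cluster_size : individuals per cluster (assumed equal across clusters)
    icc          : intraclass correlation, the share of outcome variance
                   that lies between clusters
    prop_treated : proportion of clusters assigned to treatment
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    p = prop_treated
    # Variance of the standardized effect estimate: a between-cluster part
    # (icc) plus a within-cluster part that shrinks with cluster size.
    var = (icc + (1.0 - icc) / cluster_size) / (p * (1.0 - p) * n_clusters)
    ncp = delta / sqrt(var)  # expected value of the test statistic
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    # Probability of rejecting in either tail (normal approximation).
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


if __name__ == "__main__":
    # Hypothetical planning values, not estimates from the article.
    for j in (20, 40, 80, 160):
        print(j, round(crt2_power(0.25, j, 20, 0.20), 3))
```

Because the intraclass correlation term is not divided by the cluster size, adding clusters raises power much faster than adding individuals within clusters, which is why the number of clusters usually drives CRT planning.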
Award ID(s):
2000705
PAR ID:
10629096
Publisher / Repository:
Springer Nature Link
Journal Name:
Asia Pacific Education Review
Volume:
25
Issue:
3
ISSN:
1598-1037
Page Range / eLocation ID:
685 to 701
Sponsoring Org:
National Science Foundation
More Like this
  1. This study introduces recent advances in statistical power analysis methods and tools for designing and analyzing randomized cost-effectiveness trials (RCETs) to evaluate the causal effects and costs of social work interventions. The article focuses on two-level designs, where, for example, students are nested within schools, with interventions applied either at the school level (cluster design) or student level (multisite design). We explore three statistical modeling strategies—random-effects, constant-effects, and fixed-effects models—to assess the cost-effectiveness of interventions, and we develop corresponding power analysis methods and tools. Power is influenced by effect size, sample sizes, and design parameters. We developed a user-friendly tool, PowerUp!-CEA, to aid researchers in planning RCETs. When designing RCETs, it is crucial to consider cost variance, its nested effects, and the covariance between effectiveness and cost data, as neglecting these factors may lead to underestimated power. 
  2. Researchers often apply moderation analyses to examine whether the effects of an intervention differ conditional on individual or cluster moderator variables such as gender, pretest, or school size. This study develops formulas for power analyses to detect moderator effects in two-level cluster randomized trials (CRTs) using linear models. We derive formulas for estimating statistical power, the minimum detectable effect size difference, and 95% confidence intervals for cluster- and individual-level moderators. Our framework accommodates binary or continuous moderators, designs with or without covariates, and effects of individual-level moderators that vary randomly or nonrandomly across clusters. A small Monte Carlo simulation confirms the accuracy of our formulas. We also compare power between main effect analysis and moderation analysis, discuss the effects of misspecifying the moderator slope (randomly vs. nonrandomly varying), and conclude with directions for future research. We provide software for conducting a power analysis of moderator effects in CRTs.
  3. Optimal treatment regimes (OTRs) have been widely employed in computer science and personalized medicine to provide data-driven, optimal recommendations to individuals. However, previous research on OTRs has primarily focused on settings that are independent and identically distributed, with little attention given to the unique characteristics of educational settings, where students are nested within schools and there are hierarchical dependencies. The goal of this study is to propose a framework for designing OTRs from multisite randomized trials, a commonly used experimental design in education and psychology to evaluate educational programs. We investigate modifications to popular OTR methods, specifically Q-learning and weighting methods, in order to improve their performance in multisite randomized trials. A total of 12 modifications, 6 for Q-learning and 6 for weighting, are proposed by utilizing different multilevel models, moderators, and augmentations. Simulation studies reveal that all Q-learning modifications improve performance in multisite randomized trials and the modifications that incorporate random treatment effects show the most promise in handling cluster-level moderators. Among weighting methods, the modification that incorporates cluster dummies into moderator variables and augmentation terms performs best across simulation conditions. The proposed modifications are demonstrated through an application to estimate an OTR of conditional cash transfer programs using a multisite randomized trial in Colombia to maximize educational attainment. 
  4. Given the growing presence of additive manufacturing (AM) processes in engineering design and manufacturing, there has emerged an increased interest in introducing AM and design for AM (DfAM) educational interventions in engineering education. Several researchers have proposed AM and DfAM educational interventions; however, some argue that these efforts might not be sufficient to develop higher-level skills among engineers (e.g., identifying design opportunities that leverage AM capabilities). Prior work has shown that longer, distributed educational interventions are more effective in encouraging learning and information retention; however, these interventions could also be time-consuming and expensive to implement. Therefore, there is a need to test the effectiveness of longer, distributed DfAM educational interventions compared to shorter, lecture-style interventions. Our aim in this research is to explore this research gap through an experimental study. Specifically, we compared two variations of a DfAM educational intervention: (1) a module-style intervention spread over two sessions with the introduction of DfAM evaluation metrics, and (2) a lecture-style intervention completed in a single session with no evaluation metrics introduced. From our results, we see that students who received the module-style intervention reported a greater increase in their DfAM self-efficacy. Additionally, students who received the module-style intervention reported having given a greater emphasis on part consolidation and feature size. Finally, we observe that the structure of the educational intervention did not influence the creativity of ideas generated by the participants. These findings highlight the utility of module-style DfAM educational interventions towards increasing DfAM self-efficacy, but not necessarily design creativity. Moreover, these findings highlight the need to formulate educational interventions that are effective and efficient.
  5. The learning environment is a complex setting for conducting evaluations. Students, teachers, and schools all play a role in shaping the environment, which can lead to differences in student achievement. Hence, designing an evaluation with the capacity to detect the effectiveness of an educational intervention on student achievement requires careful planning. A key part of the planning is a statistical power analysis that considers the multilevel nature of the design. This study provides empirical estimates of design parameters necessary for planning adequately powered cluster randomized trials that include the student, teacher, and school levels and are focused on reading, mathematics, or science achievement. The sample in our study includes administrative state datasets from Michigan, North Carolina, Kentucky, and Maryland for grades 3 through 8. The results showed that, with few exceptions, the variance in student test scores is larger between teachers within schools than between schools.
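As an illustration of why level-specific design parameters such as those in item 5 matter, the sketch below computes an approximate minimum detectable effect size (MDES) for a balanced three-level CRT with treatment assigned at the school level and no covariates. The function name `crt3_mdes` and every input value are hypothetical (they are not the empirical estimates reported in that study), and the multiplier uses a normal approximation rather than the t distribution.

```python
from math import sqrt
from statistics import NormalDist


def crt3_mdes(n_schools, classes_per_school, students_per_class,
              icc_school, icc_class, prop_treated=0.5,
              alpha=0.05, power=0.80):
    """Approximate MDES for a three-level CRT (hypothetical helper).

    icc_school : share of outcome variance between schools
    icc_class  : share of outcome variance between classrooms (teachers)
                 within schools
    """
    if icc_school < 0 or icc_class < 0 or icc_school + icc_class > 1:
        raise ValueError("ICCs must be non-negative and sum to at most 1")
    p = prop_treated
    k, n = classes_per_school, students_per_class
    # School-level variance is not reduced by sampling more classrooms or
    # students; classroom-level variance is divided by k; student-level
    # variance is divided by k * n.
    var = (icc_school
           + icc_class / k
           + (1.0 - icc_school - icc_class) / (k * n)) / (p * (1.0 - p) * n_schools)
    z = NormalDist()
    multiplier = z.inv_cdf(1.0 - alpha / 2.0) + z.inv_cdf(power)
    return multiplier * sqrt(var)


if __name__ == "__main__":
    # Same total clustering (0.20), split differently across levels.
    print(round(crt3_mdes(40, 3, 20, icc_school=0.15, icc_class=0.05), 3))
    print(round(crt3_mdes(40, 3, 20, icc_school=0.05, icc_class=0.15), 3))
```

Under these assumptions, shifting variance from the school level to the classroom level lowers the MDES for a fixed number of schools, so a plan that uses only a pooled clustering estimate can misjudge the sample it needs.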