

Title: Consequences of Ignoring a Level of Nesting on Design and Analysis of Blocked Three-level Regression Discontinuity Designs: Power and Type I Error Rates [Turkish original: Bloklanmış Üç Düzeyli Regresyon Süreksizlik Tasarımlarının Tasarım ve Analizinde Bir Düzeyinin Yok Sayılmasının Sonuçları: Güç ve Tip I Hata Oranları]
Multilevel regression discontinuity designs have been increasingly used in education research to evaluate the effectiveness of policies and programs. It is common to ignore a level of nesting in a three-level data structure (students nested in classrooms/teachers nested in schools), whether unwittingly during data analysis or due to resource constraints during the planning phase. This study investigates the consequences of ignoring the intermediate or top level in blocked three-level regression discontinuity designs (BIRD3; treatment is at level 1) during data analysis and planning. Monte Carlo simulation results indicated that ignoring a level during analysis did not affect the accuracy of treatment effect estimates; however, it affected their precision (standard errors, power, and Type I error rates). Ignoring the intermediate level did not cause a significant problem: power rates were slightly underestimated, whereas Type I error rates remained stable. In contrast, ignoring the top level led to overestimated power rates, and the severe inflation in Type I error rates rendered this strategy ineffective. As for the design phase, when the intermediate level is ignored, it is viable to use parameters from a two-level blocked regression discontinuity model (BIRD2) to plan a BIRD3 design, provided the level-2 parameters from the BIRD2 model are substituted for the level-3 parameters in the BIRD3 design. When the top level is ignored, using parameters from the BIRD2 model to plan a BIRD3 design should be avoided.
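The mechanism behind the Type I error inflation can be illustrated with a deliberately simplified Monte Carlo sketch. This is not the BIRD3 model itself: it drops the regression-discontinuity assignment and one level of nesting, and all parameter values (`n_schools`, `icc`, and so on) are arbitrary illustrations. The point it shows is generic — when the analysis ignores a level at which outcomes cluster, naive standard errors are too small and false rejections exceed the nominal rate.

```python
import numpy as np
from scipy import stats

def naive_type1(n_schools=20, n_per=50, icc=0.2, alpha=0.05,
                reps=1000, seed=7):
    """Monte Carlo share of false rejections when a school-assigned
    'treatment' (true effect = 0) is tested with a student-level
    t-test that ignores the school random effect."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        # half the schools labeled 'treated', half 'control'
        treat = rng.permutation(np.repeat([0, 1], n_schools // 2))
        u = rng.normal(0.0, np.sqrt(icc), n_schools)   # school random effects
        y = np.repeat(u, n_per) + rng.normal(0.0, np.sqrt(1 - icc),
                                             n_schools * n_per)
        t = np.repeat(treat, n_per)
        # naive test: treats students as independent, ignores clustering
        _, p = stats.ttest_ind(y[t == 1], y[t == 0])
        rejections += p < alpha
    return rejections / reps
```

With `icc=0.2` the empirical rejection rate lands far above the nominal 0.05, while setting `icc=0.0` (no ignored clustering) recovers roughly the nominal rate.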
Award ID(s):
1913563
NSF-PAR ID:
10391160
Author(s) / Creator(s):
Date Published:
Journal Name:
Adıyaman Üniversitesi Eğitim Bilimleri Dergisi
ISSN:
2149-2727
Page Range / eLocation ID:
42 to 55
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Analysis of differential treatment effects across targeted subgroups and contexts is a critical objective in many evaluations because it delineates for whom and under what conditions particular programs, therapies, or treatments are effective. Unfortunately, it is unclear how to plan efficient and effective evaluations that include these moderated effects when the design involves partial nesting (i.e., disparate grouping structures across treatment conditions). In this study, we develop statistical power formulas to identify requisite sample sizes and guide the planning of evaluations probing moderation under two-level partially nested designs. The results suggest that the power to detect moderation effects in partially nested designs is substantially influenced by sample size, moderation effect size, and moderator variance structure (i.e., whether the moderator varies within groups only or both within and between groups). We implement the power formulas in the R-Shiny application PowerUpRShiny and demonstrate their use to plan evaluations.
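The design-specific standard-error formulas derived in that work are not reproduced here, but the outer power computation they plug into can be sketched. Assuming a two-sided Wald test and a normal approximation for the moderation-effect estimator, power is a simple function of the effect size, its standard error, and the significance level:

```python
from scipy.stats import norm

def wald_power(effect, se, alpha=0.05):
    """Normal-approximation power of a two-sided Wald test for an
    effect with standard error `se`. The design-specific `se` (e.g.,
    for moderation under partial nesting) must be supplied."""
    z = norm.ppf(1 - alpha / 2)
    lam = effect / se          # noncentrality of the test statistic
    return norm.cdf(lam - z) + norm.cdf(-lam - z)
```

For example, an effect whose estimate is 2.8 standard errors from zero yields power of about 0.80 at alpha = 0.05, the conventional planning target.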

     
  2. Summary

In many applications of regression discontinuity designs, the running variable used to assign treatment is only observed with error. We show that, provided the observed running variable (i) correctly classifies treatment assignment and (ii) affects the conditional means of potential outcomes smoothly, ignoring the measurement error nonetheless yields an estimate with a causal interpretation: the average treatment effect for units whose observed running variable equals the cutoff. Possibly after doughnut trimming, these assumptions accommodate a variety of settings where the support of the measurement error is not too wide. An empirical application illustrates the results for both sharp and fuzzy designs.
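The estimand described here — the jump in the conditional mean at the cutoff of the observed running variable — is what a standard local linear sharp-RDD fit recovers. A minimal sketch with a uniform kernel (the paper's estimators and bandwidth choices are not reproduced; this is the textbook version):

```python
import numpy as np

def sharp_rdd(y, x, cutoff=0.0, bandwidth=1.0):
    """Local linear sharp-RDD estimate: fit a separate line on each
    side of the cutoff within the bandwidth (uniform kernel) and
    take the jump between the fitted values at the cutoff."""
    x, y = np.asarray(x, float), np.asarray(y, float)

    def fit_at_cutoff(mask):
        X = np.column_stack([np.ones(mask.sum()), x[mask] - cutoff])
        beta, *_ = np.linalg.lstsq(X, y[mask], rcond=None)
        return beta[0]                  # fitted outcome at the cutoff

    near = np.abs(x - cutoff) <= bandwidth
    return (fit_at_cutoff(near & (x >= cutoff))
            - fit_at_cutoff(near & (x < cutoff)))
```

On noiseless data with a known discontinuity the estimator recovers the jump exactly, since the outcome is linear on each side.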

     
  3. Summary

Power analyses are an important aspect of experimental design, because they help determine how experiments are implemented in practice. It is common to specify a desired level of power and compute the sample size necessary to obtain that power. Such calculations are well known for completely randomized experiments, but there can be many benefits to using other experimental designs. For example, it has recently been established that rerandomization, where subjects are randomized until covariate balance is obtained, increases the precision of causal effect estimators. This work establishes the power of rerandomized treatment-control experiments, thereby enabling sample size calculations. We find the surprising result that, while power is often greater under rerandomization than complete randomization, the opposite can occur for very small treatment effects. The reason is that inference under rerandomization can be relatively more conservative, in the sense that it can have a lower Type I error at the same nominal significance level, and this additional conservativeness adversely affects power. This surprising result is due to treatment effect heterogeneity, a quantity often ignored in power analyses. We find that heterogeneity increases power for large effect sizes, but decreases power for small effect sizes.
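The rerandomization procedure itself — redraw assignments until covariate balance meets an acceptance criterion — is simple to sketch. The acceptance rule below (standardized mean difference of a single covariate under a hypothetical `threshold`) is one common univariate choice; the literature typically uses a Mahalanobis-distance criterion across multiple covariates:

```python
import numpy as np

def rerandomize(x, n_treat, threshold=0.1, seed=0, max_draws=100_000):
    """Draw complete randomizations until the standardized mean
    difference of covariate x between arms falls below `threshold`."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    n = len(x)
    for _ in range(max_draws):
        treated = np.zeros(n, dtype=bool)
        treated[rng.choice(n, n_treat, replace=False)] = True
        smd = abs(x[treated].mean() - x[~treated].mean()) / x.std()
        if smd < threshold:            # accept only balanced assignments
            return treated
    raise RuntimeError("no acceptable assignment within max_draws")
```

Because only balanced assignments are retained, the randomization distribution of the difference-in-means shrinks — the precision gain the abstract refers to — while inference must condition on the acceptance rule.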

     
  4. Differential item functioning (DIF) is often used to examine validity evidence for alternate-form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A simulation study was performed to compare the new framework with traditional logistic regression, with respect to the Type I error and power rates of the uniform DIF test statistics, and the bias and root mean square error of the corresponding effect size estimators. The new framework better controlled the Type I error rate and demonstrated minimal bias but suffered from low power and lack of precision. Implications for practice are discussed.
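The comparison method named here — the traditional logistic-regression test for uniform DIF — can be sketched as a 1-df likelihood-ratio test of whether group membership predicts item response after conditioning on the matching variable (total score). The RDD-based framework proposed by the article is not reproduced; the Newton-Raphson fitter below is a self-contained stand-in for any logistic MLE routine:

```python
import numpy as np
from scipy.stats import chi2

def logit_loglik(X, y, iters=25):
    """Newton-Raphson MLE for logistic regression; returns the
    maximized log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = (X * (p * (1 - p))[:, None]).T @ X     # observed information
        beta += np.linalg.solve(H, X.T @ (y - p))  # Newton step
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def uniform_dif_test(item, total, group):
    """Uniform DIF LR test: item ~ total  vs  item ~ total + group."""
    ones = np.ones_like(total)
    ll0 = logit_loglik(np.column_stack([ones, total]), item)
    ll1 = logit_loglik(np.column_stack([ones, total, group]), item)
    lr = 2 * (ll1 - ll0)
    return lr, chi2.sf(lr, df=1)
```

A large LR statistic (small p-value) flags uniform DIF: the item is systematically easier or harder for one group at the same total-score level.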
  5. Abstract

Despite significant investments in watershed‐scale restoration projects, evaluation of their impacts on salmonids is often limited by inadequate experimental design. This project aimed to strengthen study designs by identifying and quantifying sources of temporal and spatial uncertainty while assessing population‐level salmonid responses in Before‐After‐Control‐Impact (BACI) restoration experiments. To evaluate sources of temporal uncertainty, a meta‐analysis of 32 annual BACI experiments from the Pacific Northwest, USA, was conducted. Experimental error was determined to be a function of the total temporal variation of both restoration and control salmonid population metrics and the degree of covariation, or synchrony, between these metrics (r² = 1). However, synchrony was both weak ( = 0.18) and unrelated to experimental error (r = 0.01), while temporal variability accounted for 91% of this error. Because synchrony did not reduce experimental error, we conclude that BACI designs will not normally exhibit greater power than uncontrolled Before‐After (BA) designs. To evaluate spatial uncertainty, hierarchical BACI designs were simulated. Spatial variability of hypothetical steelhead (Oncorhynchus mykiss) growth values within watersheds was found to cause mis‐estimation of the restoration effect and reduce power. While hierarchical BACI designs can examine reach‐ and watershed‐scale restoration effects simultaneously, these scales should be examined separately because of the probable mis‐estimation of the restoration effect size. Paired‐reach designs such as Extensive Post‐Treatment (EPT) provide powerful replicated local‐scale restoration experiments, which can build understanding of restoration‐ecological mechanisms. Knowledge gained from reach‐scale experiments should then be implemented at watershed scales and monitored within a non‐hierarchical framework.
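The claim that experimental error depends on both temporal variation and synchrony rests on a standard variance identity: the variance of the impact-minus-control difference series equals the sum of the two variances minus twice their covariance, so high synchrony (covariance) shrinks error while high temporal variability inflates it. A minimal numpy sketch of that decomposition (illustrative only; the paper's actual metrics and models are not reproduced):

```python
import numpy as np

def baci_error_components(impact, control):
    """Decompose the variance of the impact-minus-control series:
    var(I - C) = var(I) + var(C) - 2 * cov(I, C)."""
    I = np.asarray(impact, float)
    C = np.asarray(control, float)
    cov = np.cov(I, C, ddof=1)          # 2x2 sample covariance matrix
    return {
        "var_diff": (I - C).var(ddof=1),
        "var_impact": cov[0, 0],
        "var_control": cov[1, 1],
        "covariance": cov[0, 1],
    }
```

When the covariance term is near zero — the weak synchrony reported above — the control series contributes its own variance without cancelling any, which is why the BACI design showed no power advantage over an uncontrolled Before-After design.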

     