Summary: In many applications of regression discontinuity designs, the running variable used to assign treatment is only observed with error. We show that, provided the observed running variable (i) correctly classifies treatment assignment and (ii) affects the conditional means of the potential outcomes smoothly, ignoring the measurement error nonetheless yields an estimate with a causal interpretation: the average treatment effect for units whose observed running variable equals the cutoff. Possibly after doughnut trimming, these assumptions accommodate a variety of settings in which the support of the measurement error is not too wide. An empirical application illustrates the results for both sharp and fuzzy designs.
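The following is a minimal sketch, not the paper's estimator, of the strategy the abstract describes: estimate a sharp RD effect using the observed (possibly mismeasured) running variable, with optional doughnut trimming around the cutoff. The cutoff, bandwidth, doughnut radius, uniform-kernel local-linear specification, and the simulated data are all illustrative assumptions.

```python
# Sketch: sharp RD that conditions on the observed running variable, with doughnut trimming.
import numpy as np
import statsmodels.api as sm

def sharp_rd_estimate(x_obs, y, cutoff=0.0, bandwidth=1.0, doughnut=0.0):
    """Local-linear sharp RD using the observed (possibly mismeasured) running variable."""
    xc = x_obs - cutoff                                    # center at the cutoff
    keep = (np.abs(xc) <= bandwidth) & (np.abs(xc) >= doughnut)
    xc, y = xc[keep], y[keep]
    d = (xc >= 0).astype(float)                            # treatment classified by the observed variable
    X = sm.add_constant(np.column_stack([d, xc, d * xc]))  # separate slopes on each side of the cutoff
    res = sm.OLS(y, X).fit(cov_type="HC1")
    return res.params[1], res.bse[1]                       # jump at the cutoff and its robust SE

# Example with simulated data: true effect 2.0, bounded (narrow-support) measurement error.
rng = np.random.default_rng(0)
x_true = rng.uniform(-2, 2, 5000)
x_obs = x_true + rng.uniform(-0.1, 0.1, 5000)              # observed running variable
y = 1.0 + 0.5 * x_true + 2.0 * (x_obs >= 0) + rng.normal(0, 1, 5000)
print(sharp_rd_estimate(x_obs, y, bandwidth=0.75, doughnut=0.15))
```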
Consequences of Ignoring a Level of Nesting on Design and Analysis of Blocked Three-level Regression Discontinuity Designs: Power and Type I Error Rates [Bloklanmış Üç Düzeyli Regresyon Süreksizlik Tasarımlarının Tasarım ve Analizinde Bir Düzeyinin Yok Sayılmasının Sonuçları: Güç ve Tip I Hata Oranları]
Multilevel regression discontinuity designs have been increasingly used in education research to evaluate the effectiveness of policies and programs. It is common to ignore a level of nesting in a three-level data structure (students nested in classrooms/teachers nested in schools), whether unwittingly during data analysis or because of resource constraints during the planning phase. This study investigates the consequences of ignoring the intermediate or top level in blocked three-level regression discontinuity designs (BIRD3; treatment is at level 1) during data analysis and planning. Monte Carlo simulation results indicated that ignoring a level during analysis did not affect the accuracy of treatment effect estimates; however, it affected their precision (standard errors, power, and Type I error rates). Ignoring the intermediate level did not cause a significant problem: power rates were slightly underestimated, whereas Type I error rates were stable. In contrast, ignoring the top level resulted in overestimated power rates, but severe inflation in the Type I error rate rendered this strategy ineffective. As for the design phase, when the intermediate level was ignored, it was viable to use parameters from a two-level blocked regression discontinuity model (BIRD2) to plan a BIRD3 design; however, level 2 parameters from the BIRD2 model should be substituted for level 3 parameters in the BIRD3 design. When the top level was ignored, using parameters from the BIRD2 model to plan a BIRD3 design should be avoided.
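A minimal sketch of the contrast the abstract describes, not the study's simulation code: fit a BIRD3-style model with both school and classroom random intercepts, then a model that ignores the top (school) level. The variable names, the random-intercept-only structure, and the use of statsmodels MixedLM variance components are illustrative assumptions.

```python
# Sketch: three-level blocked RD analysis vs. an analysis that ignores the top level.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for school in range(30):
    u3 = rng.normal(0, 0.4)                       # school random effect (level 3)
    for classroom in range(4):
        u2 = rng.normal(0, 0.3)                   # classroom random effect (level 2)
        for _ in range(20):
            x = rng.uniform(-1, 1)                # running variable, cutoff at 0
            treat = float(x >= 0)                 # treatment assigned at level 1
            y = 0.3 * treat + 0.5 * x + u3 + u2 + rng.normal(0, 1)
            rows.append(dict(school=school, classroom=f"{school}_{classroom}",
                             x=x, treat=treat, y=y))
df = pd.DataFrame(rows)

# Full three-level model: school random intercepts plus classroom variance components.
bird3 = smf.mixedlm("y ~ treat + x", data=df, groups="school",
                    re_formula="1", vc_formula={"classroom": "0 + C(classroom)"}).fit()

# Ignoring the top level: classrooms treated as the only clustering (BIRD2-style analysis).
bird2 = smf.mixedlm("y ~ treat + x", data=df, groups="classroom").fit()

print(bird3.params["treat"], bird3.bse["treat"])  # similar point estimates,
print(bird2.params["treat"], bird2.bse["treat"])  # but different standard errors
```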
- Award ID(s):
- 1913563
- PAR ID:
- 10391160
- Date Published:
- Journal Name:
- Adıyaman Üniversitesi Eğitim Bilimleri Dergisi
- ISSN:
- 2149-2727
- Page Range / eLocation ID:
- 42 to 55
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A simulation study was performed to compare the new framework with traditional logistic regression, with respect to the Type I error and power rates of the uniform DIF test statistics, as well as the bias and root mean square error of the corresponding effect size estimators. The new framework better controlled the Type I error rate and demonstrated minimal bias but suffered from low power and lack of precision. Implications for practice are discussed.
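For context, a minimal sketch of the traditional logistic-regression uniform DIF test that the proposed RD-based framework is compared against (the RD framework itself is not shown here). The column names and simulated data are illustrative assumptions.

```python
# Sketch: uniform DIF test via logistic regression, conditioning on a total-score matching variable.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def uniform_dif_test(df):
    """Test whether group membership shifts item performance, conditioning on total score."""
    fit = smf.logit("item ~ total + accommodated", data=df).fit(disp=0)
    return fit.params["accommodated"], fit.pvalues["accommodated"]

# Example with simulated responses: a small uniform DIF effect against the accommodated group.
rng = np.random.default_rng(2)
n = 2000
ability = rng.normal(0, 1, n)
accommodated = rng.integers(0, 2, n)            # 1 = took the alternate (accommodated) form
total = ability + rng.normal(0, 0.5, n)         # fallible matching score
p = 1 / (1 + np.exp(-(0.8 * ability - 0.3 * accommodated)))
item = rng.binomial(1, p)
print(uniform_dif_test(pd.DataFrame(dict(item=item, total=total, accommodated=accommodated))))
```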
-
This study introduces recent advances in statistical power analysis methods and tools for designing and analyzing randomized cost-effectiveness trials (RCETs) to evaluate the causal effects and costs of social work interventions. The article focuses on two-level designs, where, for example, students are nested within schools, with interventions applied either at the school level (cluster design) or student level (multisite design). We explore three statistical modeling strategies (random-effects, constant-effects, and fixed-effects models) to assess the cost-effectiveness of interventions, and we develop corresponding power analysis methods and tools. Power is influenced by effect size, sample sizes, and design parameters. We developed a user-friendly tool, PowerUp!-CEA, to aid researchers in planning RCETs. When designing RCETs, it is crucial to consider cost variance, its nested effects, and the covariance between effectiveness and cost data, as neglecting these factors may lead to underestimated power.
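A minimal sketch of the effectiveness side only of such a power calculation, for a two-level cluster-randomized design with treatment at the cluster level, using the standard noncentral-t approach. The cost-variance and cost-effectiveness covariance terms that PowerUp!-CEA adds are not shown, and the parameter values are illustrative assumptions rather than the tool's implementation.

```python
# Sketch: power for a two-level cluster-randomized design (effectiveness outcome only).
from scipy.stats import nct, t

def power_cluster_2level(delta, J, n, rho, P=0.5, alpha=0.05):
    """Power for a standardized effect `delta` with J clusters of n units and ICC `rho`."""
    se = ((rho + (1 - rho) / n) / (P * (1 - P) * J)) ** 0.5   # SE of the standardized effect
    df = J - 2                                                # cluster-level degrees of freedom
    tcrit = t.ppf(1 - alpha / 2, df)
    ncp = delta / se                                          # noncentrality parameter
    return 1 - nct.cdf(tcrit, df, ncp) + nct.cdf(-tcrit, df, ncp)

# Example: 40 schools of 25 students each, ICC = 0.15, effect size 0.25.
print(round(power_cluster_2level(0.25, J=40, n=25, rho=0.15), 3))
```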
-
Summary: Power analyses are an important aspect of experimental design, because they help determine how experiments are implemented in practice. It is common to specify a desired level of power and compute the sample size necessary to obtain that power. Such calculations are well known for completely randomized experiments, but there can be many benefits to using other experimental designs. For example, it has recently been established that rerandomization, where subjects are randomized until covariate balance is obtained, increases the precision of causal effect estimators. This work establishes the power of rerandomized treatment-control experiments, thereby allowing for sample size calculators. We find the surprising result that, while power is often greater under rerandomization than complete randomization, the opposite can occur for very small treatment effects. The reason is that inference under rerandomization can be relatively more conservative, in the sense that it can have a lower Type I error at the same nominal significance level, and this additional conservativeness adversely affects power. This surprising result is due to treatment effect heterogeneity, a quantity often ignored in power analyses. We find that heterogeneity increases power for large effect sizes, but decreases power for small effect sizes.
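A minimal sketch of the rerandomization step itself, using a Mahalanobis-distance acceptance rule on the covariate mean difference: assignments are redrawn until balance passes a threshold. The threshold value, covariate dimensions, and function names are illustrative assumptions, not the paper's design.

```python
# Sketch: rerandomization by redrawing treatment assignments until covariate balance is acceptable.
import numpy as np

def rerandomize(X, n_treat, threshold=2.0, rng=None, max_draws=10000):
    """Return a treatment indicator whose covariate balance passes the acceptance threshold."""
    rng = np.random.default_rng() if rng is None else rng
    n, _ = X.shape
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    for _ in range(max_draws):
        w = np.zeros(n, dtype=bool)
        w[rng.choice(n, size=n_treat, replace=False)] = True
        diff = X[w].mean(axis=0) - X[~w].mean(axis=0)
        # Mahalanobis distance of the mean difference, scaled by the allocation sizes
        M = (n_treat * (n - n_treat) / n) * diff @ S_inv @ diff
        if M <= threshold:
            return w
    raise RuntimeError("no acceptable randomization found; loosen the threshold")

# Example: 100 units, 3 covariates, half assigned to treatment.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
w = rerandomize(X, n_treat=50, rng=rng)
print(w.sum())
```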
-
Background: Evaluation studies frequently draw on fallible outcomes that contain significant measurement error. Ignoring outcome measurement error in the planning stages can undermine the sufficiency and efficiency of an otherwise well-designed study and can further constrain the evidence studies bring to bear on the effectiveness of programs. Objectives: We develop simple formulas to adjust statistical power, minimum detectable effect (MDE), and optimal sample allocation formulas for two-level cluster- and multisite-randomized designs when the outcome is subject to measurement error. Results: The resulting adjusted formulas suggest that outcome measurement error typically amplifies treatment effect uncertainty, reduces power, increases the MDE, and undermines the efficiency of conventional optimal sampling schemes. Therefore, achieving adequate power for a given effect size will typically demand increased sample sizes when considering fallible outcomes, while maintaining design efficiency will require that an increasing portion of the budget be applied toward sampling a larger number of individuals within clusters. We illustrate evaluation planning with the new formulas while comparing them to conventional formulas using hypothetical examples based on recent empirical studies. To encourage adoption of the new formulas, we implement them in the R package PowerUpR and in the PowerUp software.
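A simple illustration of the mechanism, not the paper's exact adjusted formulas (those are implemented in PowerUpR and PowerUp): assuming classical measurement error at the individual level, an outcome reliability of `rel` leaves a fraction `rel` of the observed total variance as true-score variance, so the standardized effect attenuates to delta*sqrt(rel) and the observed intraclass correlation to rho*rel, reducing power. All parameter values below are illustrative assumptions.

```python
# Sketch: power loss from a fallible outcome in a two-level cluster-randomized design.
from scipy.stats import nct, t

def power_cluster(delta, J, n, rho, P=0.5, alpha=0.05):
    """Power for a standardized effect `delta` with J clusters of n units and ICC `rho`."""
    se = ((rho + (1 - rho) / n) / (P * (1 - P) * J)) ** 0.5
    df = J - 2
    tcrit = t.ppf(1 - alpha / 2, df)
    return 1 - nct.cdf(tcrit, df, delta / se) + nct.cdf(-tcrit, df, delta / se)

def power_with_fallible_outcome(delta, J, n, rho, rel, **kw):
    """Power after attenuating the effect size and ICC for outcome reliability `rel`."""
    return power_cluster(delta * rel ** 0.5, J, n, rho * rel, **kw)

# Example: reliability 0.7 lowers power relative to an error-free outcome.
print(round(power_cluster(0.25, J=40, n=25, rho=0.15), 3))
print(round(power_with_fallible_outcome(0.25, J=40, n=25, rho=0.15, rel=0.7), 3))
```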