Title: Is It Who You Are or Where You Are? Accounting for Compositional Differences in Cross-Site Treatment Effect Variation
In multisite trials, learning about treatment effect variation across sites is critical for understanding where and for whom a program works. Unadjusted comparisons, however, capture “compositional” differences in the distributions of unit-level features as well as “contextual” differences in site-level features, including possible differences in program implementation. Our goal in this article is to adjust site-level estimates for differences in the distribution of observed unit-level features: If we can reweight (or “transport”) each site to have a common distribution of observed unit-level covariates, the remaining treatment effect variation captures contextual and unobserved compositional differences across sites. This allows us to make apples-to-apples comparisons across sites, parceling out the amount of cross-site effect variation explained by systematic differences in populations served. In this article, we develop a framework for transporting effects using approximate balancing weights, where the weights are chosen to directly optimize unit-level covariate balance between each site and the common target distribution. We first develop our approach for the general setting of transporting the effect of a single-site trial. We then extend our method to multisite trials, assess its performance via simulation, and use it to analyze a series of multisite trials of adult education and vocational training programs. In our application, we find that distributional differences are potentially masking cross-site variation. Our method is available in the balancer R package.
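To make the idea of approximate balancing weights concrete, here is a minimal sketch in Python. It is not the article's estimator and does not use the balancer R package. It assumes a simplified objective: pick weights for one site's units that sum to one and minimize the squared gap between the weighted covariate means and a target mean, plus a ridge penalty. The function name `balancing_weights`, the penalty `lam`, and the simulated data are all hypothetical, and the weights are allowed to be negative to keep the closed-form solution short.

```python
import numpy as np


def balancing_weights(X_site, target_mean, lam=1e-2):
    """Ridge-penalized balancing weights for one site (illustrative only).

    Minimizes ||X_site' w - target_mean||^2 + lam * ||w||^2
    subject to sum(w) = 1. Weights may be negative in this sketch.
    """
    n = X_site.shape[0]
    # Quadratic and linear terms of the objective.
    A = X_site @ X_site.T + lam * np.eye(n)
    b = X_site @ target_mean
    # KKT linear system for the equality constraint sum(w) = 1.
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = 2.0 * A
    kkt[:n, n] = 1.0
    kkt[n, :n] = 1.0
    rhs = np.append(2.0 * b, 1.0)
    # Drop the Lagrange multiplier and keep the n weights.
    return np.linalg.solve(kkt, rhs)[:n]


# Simulated site whose covariate means differ from the target (assumed data).
rng = np.random.default_rng(0)
X_site = rng.normal(loc=0.5, size=(50, 3))
target_mean = np.zeros(3)

w = balancing_weights(X_site, target_mean)
imbalance_before = np.linalg.norm(X_site.mean(axis=0) - target_mean)
imbalance_after = np.linalg.norm(X_site.T @ w - target_mean)
print(imbalance_before, imbalance_after)
```

In a multisite setting, one would compute such weights separately for each site against the same target and then compare the reweighted site-level effect estimates. A smaller `lam` trades more variable weights for tighter balance.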
Award ID(s):
1745640
PAR ID:
10531800
Publisher / Repository:
Sage
Journal Name:
Journal of Educational and Behavioral Statistics
Volume:
48
Issue:
4
ISSN:
1076-9986
Page Range / eLocation ID:
420 to 453
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Past research has demonstrated that treatment effects frequently vary across sites (e.g., schools) and that such variation can be explained by site-level or individual-level variables (e.g., school size or gender). The purpose of this study is to develop a statistical framework and tools for the effective and efficient design of multisite randomized trials (MRTs) probing moderated treatment effects. The framework considers three core facets of such designs: (a) Level 1 and Level 2 moderators, (b) random and nonrandomly varying slopes (coefficients) of the treatment variable and its interaction terms with the moderators, and (c) binary and continuous moderators. We validate the formulas for calculating statistical power and the minimum detectable effect size difference with simulations, probe its sensitivity to model assumptions, execute the formulas in accessible software, demonstrate an application, and provide suggestions in designing MRTs probing moderated treatment effects. 
  2. Optimal treatment regimes (OTRs) have been widely employed in computer science and personalized medicine to provide data-driven, optimal recommendations to individuals. However, previous research on OTRs has primarily focused on settings that are independent and identically distributed, with little attention given to the unique characteristics of educational settings, where students are nested within schools and there are hierarchical dependencies. The goal of this study is to propose a framework for designing OTRs from multisite randomized trials, a commonly used experimental design in education and psychology to evaluate educational programs. We investigate modifications to popular OTR methods, specifically Q-learning and weighting methods, in order to improve their performance in multisite randomized trials. A total of 12 modifications, 6 for Q-learning and 6 for weighting, are proposed by utilizing different multilevel models, moderators, and augmentations. Simulation studies reveal that all Q-learning modifications improve performance in multisite randomized trials and the modifications that incorporate random treatment effects show the most promise in handling cluster-level moderators. Among weighting methods, the modification that incorporates cluster dummies into moderator variables and augmentation terms performs best across simulation conditions. The proposed modifications are demonstrated through an application to estimate an OTR of conditional cash transfer programs using a multisite randomized trial in Colombia to maximize educational attainment. 
  3. Human activities are altering ecological communities around the globe. Understanding the implications of these changes requires that we consider the composition of those communities. However, composition can be summarized by many metrics which in turn are influenced by different ecological processes. For example, incidence-based metrics strongly reflect species gains or losses, while abundance-based metrics are minimally affected by changes in the abundance of small or uncommon species. Furthermore, metrics might be correlated with different predictors. We used a globally distributed experiment to examine variation in species composition within 60 grasslands on six continents. Each site had an identical experimental and sampling design: 24 plots × 4 years. We expressed compositional variation within each site (not across sites) using abundance- and incidence-based metrics of the magnitude of dissimilarity (Bray–Curtis and Sorensen, respectively), abundance- and incidence-based measures of the relative importance of replacement (balanced variation and species turnover, respectively), and species richness at two scales (per plot-year [alpha] and per site [gamma]). Average compositional variation among all plot-years at a site was high and similar to spatial variation among plots in the pretreatment year, but lower among years in untreated plots. For both types of metrics, most variation was due to replacement rather than nestedness. Differences among sites in overall within-site compositional variation were related to several predictors. Environmental heterogeneity (expressed as the CV of total aboveground plant biomass in unfertilized plots of the site) was an important predictor for most metrics. Biomass production was a predictor of species turnover and of alpha diversity but not of other metrics. Continentality (measured as annual temperature range) was a strong predictor of Sorensen dissimilarity. Metrics of compositional variation are moderately correlated: knowing the magnitude of dissimilarity at a site provides little insight into whether the variation is driven by replacement processes. Overall, our understanding of compositional variation at a site is enhanced by considering multiple metrics simultaneously. Monitoring programs that explicitly incorporate these implications, both when designing sampling strategies and analyzing data, will have a stronger ability to understand the compositional variation of systems and to quantify the impacts of human activities.
  4. Wild pollinator declines are increasingly linked to pesticide exposure, yet it is unclear how intraspecific differences contribute to observed variation in sensitivity, and the role gut microbes play in the sensitivity of wild bees is largely unexplored. Here, we investigate site-level differences in survival and microbiome structure of a wild bumble bee exposed to multiple pesticides, both individually and in combination. We collected wild Bombus vosnesenskii foragers (N = 175) from an alpine meadow, a valley lake shoreline and a suburban park and maintained them on a diet containing a herbicide (glyphosate), a fungicide (tebuconazole), an insecticide (imidacloprid) or a combination of these chemicals. Alpine bees had the highest overall survival, followed by shoreline bees then suburban bees. This was in part explained by body size differences across sites and the presence of conopid parasitoids at two of the sites. Notably, site of origin impacted bee survival on the herbicide, fungicide and combination treatment. We did not find evidence of gut microbiome differences across pesticide treatment, nor a site-by-treatment interaction. Regardless, the survival differences we observed emphasize the importance of considering population of origin when studying pesticide toxicity of wild bees.
  5. Randomization inference is a powerful tool in early phase vaccine trials when estimating the causal effect of a regimen against a placebo or another regimen. Randomization-based inference often focuses on testing either Fisher’s sharp null hypothesis of no treatment effect for any participant or Neyman’s weak null hypothesis of no sample average treatment effect. Many recent efforts have explored conducting exact randomization-based inference for other summaries of the treatment effect profile, for instance, quantiles of the treatment effect distribution function. In this article, we systematically review methods that conduct exact, randomization-based inference for quantiles of individual treatment effects (ITEs) and extend some results to a special case where naïve participants are expected not to exhibit responses to highly specific endpoints. These methods are suitable for completely randomized trials, stratified completely randomized trials, and a matched study comparing two non-randomized arms from possibly different trials. We evaluate the usefulness of these methods using synthetic data in simulation studies. Finally, we apply these methods to HIV Vaccine Trials Network Study 086 (HVTN 086) and HVTN 205 and showcase a wide range of application scenarios of the methods. R code that replicates all analyses in this article can be found on the first author’s GitHub page at https://github.com/Zhe-Chen-1999/ITE-Inference.