Title: Multisite Causal Mediation Analysis in the Presence of Complex Sample and Survey Designs and Non-Random Non-Response
Summary

This study provides a template for multisite causal mediation analysis using a comprehensive weighting-based analytic procedure that enhances external and internal validity. The template incorporates a sample weight to adjust for complex sample and survey designs, adopts an inverse probability of treatment weight to adjust for differential treatment assignment probabilities, employs an estimated non-response weight to account for non-random non-response and utilizes a propensity-score-based weighting strategy to decompose flexibly not only the population average but also the between-site heterogeneity of the total programme impact. Because the identification assumptions are not always warranted, a weighting-based balance checking procedure assesses the remaining overt bias, whereas a weighting-based sensitivity analysis further evaluates the potential bias related to omitted confounding or to propensity score model misspecification. We derive the asymptotic variance of the estimators for the causal effects that account for the sampling uncertainty in the estimated weights. The method is applied to a reanalysis of the data from the National Job Corps Study.
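The layering of weights described in this summary can be illustrated with a minimal sketch. It assumes the sample weight, the inverse probability of treatment weight and the non-response weight combine multiplicatively for each respondent; the summary does not spell out the combination rule, and the function names and inputs below are hypothetical rather than taken from the study.

```python
def composite_weight(sample_w, p_treat, treated, p_respond):
    """Combine three weights for one respondent (assumed multiplicative).

    sample_w  : survey sample weight (inverse of the selection probability)
    p_treat   : probability of assignment to the treatment group
    treated   : True if the unit was assigned to treatment
    p_respond : estimated probability that the unit responded
    """
    if not 0.0 < p_treat < 1.0:
        raise ValueError("p_treat must lie strictly between 0 and 1")
    if not 0.0 < p_respond <= 1.0:
        raise ValueError("p_respond must lie in (0, 1]")
    # Inverse probability of treatment weight for the arm actually received.
    iptw = 1.0 / p_treat if treated else 1.0 / (1.0 - p_treat)
    # Non-response weight: respondents stand in for similar non-respondents.
    nonresponse_w = 1.0 / p_respond
    return sample_w * iptw * nonresponse_w


def weighted_mean(values, weights):
    """Weighted average of outcomes among respondents."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


if __name__ == "__main__":
    # Toy respondents: (sample weight, P(treated), treated?, P(respond), outcome)
    rows = [(2.0, 0.5, True, 0.5, 10.0), (1.0, 0.5, True, 1.0, 6.0)]
    w = [composite_weight(s, p, t, r) for s, p, t, r, _ in rows]
    print(weighted_mean([y for *_, y in rows], w))
```

In practice the treatment and response probabilities would themselves be estimated, which is why the summary stresses variance estimators that account for the sampling uncertainty in the estimated weights.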

 
NSF-PAR ID:
10400113
Publisher / Repository:
Oxford University Press
Journal Name:
Journal of the Royal Statistical Society Series A: Statistics in Society
Volume:
182
Issue:
4
ISSN:
0964-1998
Page Range / eLocation ID:
p. 1343-1370
Sponsoring Org:
National Science Foundation
More Like this
  1. This study provides a template for multisite causal mediation analysis using a comprehensive weighting-based analytic procedure that enhances external and internal validity. The template incorporates a sample weight to adjust for complex sample and survey designs, adopts an inverse probability of treatment weight (IPTW) to adjust for differential treatment assignment probabilities, employs an estimated nonresponse weight to account for non-random nonresponse, and utilizes a propensity score-based weighting strategy to flexibly decompose not only the population average but also the between-site heterogeneity of the total program impact. Because the identification assumptions are not always warranted, a weighting-based balance checking procedure assesses the remaining overt bias, while a weighting-based sensitivity analysis further evaluates the potential bias related to omitted confounding or to propensity score model misspecification. We derive the asymptotic variance of the estimators for the causal effects that account for the sampling uncertainty in the estimated weights. The method is applied to a re-analysis of the data from the National Job Corps Study.
  2. For large observational studies lacking a control group (unlike randomized controlled trials, RCTs), propensity scores (PS) are often the method of choice to account for pre-treatment confounding in baseline characteristics, and thereby avoid substantial bias in treatment effect estimation. A vast majority of PS techniques focus on average treatment effect estimation, without any clear consensus on how to account for confounders, especially in a multiple treatment setting. Furthermore, for time-to-event outcomes, the analytical framework is further complicated in the presence of high censoring rates (sometimes due to non-susceptibility of study units to a disease), imbalance between treatment groups, and the clustered nature of the data (where survival outcomes appear in groups). Motivated by a right-censored kidney transplantation dataset derived from the United Network for Organ Sharing (UNOS), we investigate and compare two recent promising PS procedures, (a) the generalized boosted model (GBM), and (b) the covariate-balancing propensity score (CBPS), in an attempt to decouple the causal effects of treatments (here, study subgroups, such as hepatitis C virus (HCV) positive/negative donors, and positive/negative recipients) on time to death of kidney recipients due to kidney failure, post transplantation. For estimation, we employ a 2-step procedure which addresses various complexities observed in the UNOS database within a unified paradigm. First, to adjust for the large number of confounders on the multiple sub-groups, we fit multinomial PS models via procedures (a) and (b). In the next stage, the estimated PS is incorporated into the likelihood of a semi-parametric cure rate Cox proportional hazard frailty model via inverse probability of treatment weighting, adjusted for multi-center clustering and excess censoring. Our data analysis reveals a more informative and superior performance of the full model in terms of treatment effect estimation, over sub-models that relax the various features of the event time dataset.
  3. In non-experimental research, a sensitivity analysis helps determine whether a causal conclusion could be easily reversed in the presence of hidden bias. A new approach to sensitivity analysis on the basis of weighting extends and supplements propensity score weighting methods for identifying the average treatment effect for the treated (ATT). In its essence, the discrepancy between a new weight that adjusts for the omitted confounders and an initial weight that omits them captures the role of the confounders. This strategy is appealing for a number of reasons including that, regardless of how complex the data generation functions are, the number of sensitivity parameters remains small and their forms never change. A graphical display of the sensitivity parameter values facilitates a holistic assessment of the dominant potential bias. An application to the well-known LaLonde data lays out the implementation procedure and illustrates its broad utility. The data offer a prototypical example of non-experimental evaluations of the average impact of job training programmes for the participant population.

     
  4. This study investigates appropriate estimation of estimator variability in the context of causal mediation analysis that employs propensity score‐based weighting. Such an analysis decomposes the total effect of a treatment on the outcome into an indirect effect transmitted through a focal mediator and a direct effect bypassing the mediator. Ratio‐of‐mediator‐probability weighting estimates these causal effects by adjusting for the confounding impact of a large number of pretreatment covariates through propensity score‐based weighting. In step 1, a propensity score model is estimated. In step 2, the causal effects of interest are estimated using weights derived from the prior step's regression coefficient estimates. Statistical inferences obtained from this 2‐step estimation procedure are potentially problematic if the estimated standard errors of the causal effect estimates do not reflect the sampling uncertainty in the estimation of the weights. This study extends to ratio‐of‐mediator‐probability weighting analysis a solution to the 2‐step estimation problem by stacking the score functions from both steps. We derive the asymptotic variance‐covariance matrix for the indirect effect and direct effect 2‐step estimators, provide simulation results, and illustrate with an application study. Our simulation results indicate that the sampling uncertainty in the estimated weights should not be ignored. The standard error estimation using the stacking procedure offers a viable alternative to bootstrap standard error estimation. We discuss broad implications of this approach for causal analysis involving propensity score‐based weighting.

     
  5. Machine learning (ML) methods for causal inference have gained popularity due to their flexibility in modeling the outcome and the propensity score. In this article, we provide a within-group approach for ML-based causal inference methods in order to robustly estimate average treatment effects in multilevel studies when there is cluster-level unmeasured confounding. We focus on one particular ML-based causal inference method based on the targeted maximum likelihood estimation (TMLE) with an ensemble learner called SuperLearner. Through our simulation studies, we observe that training TMLE within groups of similar clusters helps remove bias from cluster-level unmeasured confounders. Also, using within-group propensity scores estimated from fixed effects logistic regression increases the robustness of the proposed within-group TMLE method. Even if the propensity scores are partially misspecified, the within-group TMLE still produces robust ATE estimates due to double robustness with flexible modeling, unlike parametric-based inverse propensity weighting methods. We demonstrate our proposed methods and conduct sensitivity analyses against the number of groups and individual-level unmeasured confounding to evaluate the effect of taking an eighth-grade algebra course on math achievement in the Early Childhood Longitudinal Study.
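
The two-step ratio-of-mediator-probability weighting (RMPW) procedure described in item 4 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a binary mediator and a randomized treatment, takes the step-1 fitted mediator probabilities as given inputs (the hypothetical fields `p1` and `p0`), and uses the commonly cited RMPW form in which a treated unit with mediator value m is weighted by P(M = m | T = 0, X) / P(M = m | T = 1, X). None of the names below come from the listed abstracts.

```python
def rmpw_weight(m, p_m1_treated, p_m1_control):
    """RMPW weight for a treated unit with binary mediator value m.

    p_m1_treated : fitted P(M = 1 | T = 1, X) from step 1 (assumed given)
    p_m1_control : fitted P(M = 1 | T = 0, X) from step 1 (assumed given)
    """
    numerator = p_m1_control if m == 1 else 1.0 - p_m1_control
    denominator = p_m1_treated if m == 1 else 1.0 - p_m1_treated
    if denominator <= 0.0:
        raise ValueError("mediator probability under treatment must be in (0, 1)")
    return numerator / denominator


def decompose_total_effect(records):
    """Step 2: estimate direct, indirect and total effects from weighted means.

    Each record is a dict with keys t (0/1), m (0/1), y (outcome),
    p1 = P(M=1 | T=1, X) and p0 = P(M=1 | T=0, X).
    """
    treated = [r for r in records if r["t"] == 1]
    control = [r for r in records if r["t"] == 0]
    if not treated or not control:
        raise ValueError("need at least one treated and one control unit")

    mean_y_treated = sum(r["y"] for r in treated) / len(treated)  # E[Y(1, M(1))]
    mean_y_control = sum(r["y"] for r in control) / len(control)  # E[Y(0, M(0))]

    # Reweight treated units so their mediator distribution mimics control.
    weights = [rmpw_weight(r["m"], r["p1"], r["p0"]) for r in treated]
    counterfactual = sum(w * r["y"] for w, r in zip(weights, treated)) / sum(weights)

    return {
        "direct": counterfactual - mean_y_control,    # E[Y(1,M(0))] - E[Y(0,M(0))]
        "indirect": mean_y_treated - counterfactual,  # E[Y(1,M(1))] - E[Y(1,M(0))]
        "total": mean_y_treated - mean_y_control,
    }
```

Point estimates from a sketch like this treat the weights as known. As item 4 notes, standard errors should reflect the sampling uncertainty in the estimated weights, for example by stacking the score functions from both steps or by bootstrapping.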

     