 Award ID(s):
 1704352
 NSFPAR ID:
 10098082
 Date Published:
 Journal Name:
 Proceedings of the 36th International Conference on Machine Learning (ICML)
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

Generalizing causal effects from a controlled experiment to settings beyond the particular study population is arguably one of the central tasks found in empirical circles. While a proper design and careful execution of the experiment would support, under mild conditions, the validity of inferences about the population in which the experiment was conducted, two challenges complicate the extrapolation to different populations: transportability and sampling selection bias. The former is concerned with disparities in the distributions and causal mechanisms between the domain (i.e., settings, population, environment) where the experiment is conducted and the domain where the inferences are intended; the latter with distortions in the sample’s proportions due to preferential selection of units into the study. In this paper, we investigate the assumptions and machinery necessary for using covariate adjustment to correct for the biases generated by both of these problems, and generalize experimental data to infer causal effects in a new domain. We derive complete graphical conditions to determine whether a set of covariates is admissible for adjustment in this new setting. Building on the graphical characterization, we develop an efficient algorithm that enumerates all admissible sets with a polynomial-time delay guarantee; this is useful when some variables are preferred over others because of differences in cost or amenability to measurement.
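
As a rough illustration of the adjustment idea in the abstract above (not the paper's algorithm), the sketch below reweights stratum-specific effects from a randomized trial by the covariate distribution of a target population. It assumes the covariate is discrete and already known to be admissible, and it ignores selection bias; the function name `transported_effect` and the data layout are hypothetical.

```python
from collections import defaultdict


def transported_effect(trial_rows, target_z_probs):
    """Reweight per-stratum trial effects by a target covariate distribution.

    trial_rows: iterable of (z, treatment, outcome) from a randomized trial,
                with treatment coded as 0 or 1 (hypothetical layout).
    target_z_probs: dict mapping each stratum z to its probability in the
                    target population (assumed known, e.g. from a survey).
    """
    # Per stratum: [sum of treated outcomes, n treated,
    #               sum of control outcomes, n control]
    stats = defaultdict(lambda: [0.0, 0, 0.0, 0])
    for z, treatment, outcome in trial_rows:
        s = stats[z]
        if treatment == 1:
            s[0] += outcome
            s[1] += 1
        else:
            s[2] += outcome
            s[3] += 1

    effect = 0.0
    for z, prob in target_z_probs.items():
        sum1, n1, sum0, n0 = stats[z]
        if n1 == 0 or n0 == 0:
            # The trial carries no information about this stratum.
            raise ValueError(f"stratum {z!r} lacks treated or control units")
        # Stratum effect from the trial, weighted by the target population.
        effect += prob * (sum1 / n1 - sum0 / n0)
    return effect


# Toy data: the effect is 1.0 among "young" units and 0.0 among "old" units.
rows = [("young", 1, 1.0), ("young", 0, 0.0), ("old", 1, 0.5), ("old", 0, 0.5)]
estimate = transported_effect(rows, {"young": 0.25, "old": 0.75})
```

The toy trial is balanced across strata, so its unadjusted effect would differ from the estimate for a target population that is mostly "old"; the reweighting step is what accounts for that difference.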

Selection and confounding biases are the two most common impediments to the applicability of causal inference methods in large-scale settings. We generalize the notion of backdoor adjustment to account for both biases and leverage external data that may be available without selection bias (e.g., census data). We introduce the notion of an adjustment pair and present complete graphical conditions for identifying causal effects by adjustment. We further design an algorithm for listing all admissible adjustment pairs with polynomial delay, which is useful for researchers interested in evaluating certain properties of some admissible pairs but not all (common properties include cost, variance, and feasibility of measurement). Finally, we describe a statistical estimation procedure that can be performed once a set is known to be admissible, which entails different challenges in terms of finite samples.
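
For orientation, the classical backdoor adjustment that this abstract says it generalizes can be written as follows, for a treatment X, an outcome Y, and a covariate set Z satisfying the backdoor criterion. This is the standard textbook formula, given here only as background; the adjustment-pair version with selection bias is not reproduced.

```latex
P(y \mid do(x)) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
```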

Identification theory for causal effects in causal models associated with hidden variable directed acyclic graphs (DAGs) is well studied. However, the corresponding algorithms are underused due to the complexity of estimating the identifying functionals they output. In this work, we bridge the gap between identification and estimation of population-level causal effects involving a single treatment and a single outcome. We derive influence-function-based estimators that exhibit double robustness for the identified effects in a large class of hidden variable DAGs where the treatment satisfies a simple graphical criterion; this class includes models yielding the adjustment and front-door functionals as special cases. We also provide necessary and sufficient conditions under which the statistical model of a hidden variable DAG is nonparametrically saturated and implies no equality constraints on the observed data distribution. Further, we derive an important class of hidden variable DAGs that imply observed data distributions observationally equivalent (up to equality constraints) to fully observed DAGs. In these classes of DAGs, we derive estimators that achieve the semiparametric efficiency bounds for the target of interest where the treatment satisfies our graphical criterion. Finally, we provide a sound and complete identification algorithm that directly yields a weight-based estimation strategy for any identifiable effect in hidden variable causal models.
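
To make "double robustness" concrete for the adjustment functional mentioned above, here is a minimal sketch of the standard augmented inverse-probability-weighted (AIPW) estimator. It is the textbook estimator for that special case, not the estimators derived in the paper; the function name `aipw_ate`, the simulated data, and the choice to pass the nuisance estimates in as arrays are all illustrative assumptions.

```python
import numpy as np


def aipw_ate(y, a, propensity, mu1, mu0):
    """AIPW estimate of the average treatment effect E[Y(1)] - E[Y(0)].

    y: outcomes; a: binary treatment (0/1);
    propensity: estimates of P(A=1 | covariates), strictly between 0 and 1;
    mu1, mu0: estimates of E[Y | A=1, covariates] and E[Y | A=0, covariates].
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    # Outcome-model prediction plus an inverse-probability-weighted residual.
    treated = mu1 + a * (y - mu1) / propensity
    control = mu0 + (1.0 - a) * (y - mu0) / (1.0 - propensity)
    return float(np.mean(treated - control))


# Simulated data (hypothetical): true effect is 2.0, x confounds a and y.
rng = np.random.default_rng(0)
n = 20_000
x = rng.integers(0, 2, size=n)
true_propensity = 0.2 + 0.6 * x
a = rng.binomial(1, true_propensity)
y = 2.0 * a + 3.0 * x + rng.normal(size=n)

# Correct outcome model, deliberately wrong (constant) propensity.
est_good_outcome = aipw_ate(y, a, np.full(n, 0.5), 2.0 + 3.0 * x, 3.0 * x)
# Deliberately wrong (all-zero) outcome model, correct propensity.
est_good_propensity = aipw_ate(y, a, true_propensity, np.zeros(n), np.zeros(n))
```

In this simulation each estimate uses one correct and one misspecified nuisance model, and both should land near the true effect of 2.0 up to sampling noise; that is the double robustness property in its simplest form.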

The findings of the ARRIVE trial (A Randomized Trial of Induction Versus Expectant Management) were recently published. This multisite randomized trial was designed to provide evidence regarding whether labor induction or expectant management is associated with increased adverse perinatal outcomes and risk of cesarean birth among healthy nulliparous women at term. The trial reported that the primary outcome, a composite of adverse neonatal outcomes, was not significantly different between the 2 groups; the principal secondary outcome, cesarean birth, was significantly more common among women whose pregnancy was expectantly managed than among women whose labor was induced at 39 weeks. These results have the potential to change existing practice. Several aspects of the study design may influence its internal and external validity and should be considered in order to make sound causal inferences from this trial, which will in turn affect how its findings are translated to practice. Although chance and confounding are of minimal concern, given the sample size and randomization used in the study, selection bias may be a concern. Studies, including randomized controlled trials, are vulnerable to selection bias when the sample population differs from eligible nonparticipants. External validity is defined as the extent to which the study population and setting are representative of the larger source population the study intends to represent. External validity may be limited given the characteristics of the women enrolled in the ARRIVE trial and the practice settings where the study was conducted. This brief report provides concrete suggestions for further analyses that could help solidify conclusions from the trial, and for further research questions that will continue to advance understanding of how best to manage labor and birth decisions at full term among low‐risk women.