skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Causal imputation via synthetic interventions”, Causal Learning and Reasoning
Consider the problem of determining the effect of a compound on a specific cell type. To answer this question, researchers traditionally need to run an experiment applying the drug of interest to that cell type. This approach is not scalable: given a large number of different actions (compounds) and a large number of different contexts (cell types), it is infeasible to run an experiment for every action-context pair. In such cases, one would ideally like to predict the outcome for every pair while only needing outcome data for a small _subset_ of pairs. This task, which we label "causal imputation", is a generalization of the causal transportability problem. To address this challenge, we extend the recently introduced _synthetic interventions_ (SI) estimator to handle more general data sparsity patterns. We prove that, under a latent factor model, our estimator provides valid estimates for the causal imputation task. We motivate this model by establishing a connection to the linear structural causal model literature. Finally, we consider the prominent CMAP dataset in predicting the effects of compounds on gene expression across cell types. We find that our estimator outperforms standard baselines, thus confirming its utility in biological applications.  more » « less
Award ID(s):
1651995
PAR ID:
10339070
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of Machine Learning Research
Volume:
177
ISSN:
2640-3498
Page Range / eLocation ID:
688-711
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary It is important to draw causal inference from observational studies, but this becomes challenging if the confounders have missing values. Generally, causal effects are not identifiable if the confounders are missing not at random. In this article we propose a novel framework for nonparametric identification of causal effects with confounders subject to an outcome-independent missingness, which means that the missing data mechanism is independent of the outcome, given the treatment and possibly missing confounders. We then propose a nonparametric two-stage least squares estimator and a parametric estimator for causal effects. 
    more » « less
  2. Abstract Investigating the causal relationship between exposure and time-to-event outcome is an important topic in biomedical research. Previous literature has discussed the potential issues of using hazard ratio (HR) as the marginal causal effect measure due to noncollapsibility. In this article, we advocate using restricted mean survival time (RMST) difference as a marginal causal effect measure, which is collapsible and has a simple interpretation as the difference of area under survival curves over a certain time horizon. To address both measured and unmeasured confounding, a matched design with sensitivity analysis is proposed. Matching is used to pair similar treated and untreated subjects together, which is generally more robust than outcome modeling due to potential misspecifications. Our propensity score matched RMST difference estimator is shown to be asymptotically unbiased, and the corresponding variance estimator is calculated by accounting for the correlation due to matching. Simulation studies also demonstrate that our method has adequate empirical performance and outperforms several competing methods used in practice. To assess the impact of unmeasured confounding, we develop a sensitivity analysis strategy by adapting the E -value approach to matched data. We apply the proposed method to the Atherosclerosis Risk in Communities Study (ARIC) to examine the causal effect of smoking on stroke-free survival. 
    more » « less
  3. This study investigates appropriate estimation of estimator variability in the context of causal mediation analysis that employs propensity score‐based weighting. Such an analysis decomposes the total effect of a treatment on the outcome into an indirect effect transmitted through a focal mediator and a direct effect bypassing the mediator. Ratio‐of‐mediator‐probability weighting estimates these causal effects by adjusting for the confounding impact of a large number of pretreatment covariates through propensity score‐based weighting. In step 1, a propensity score model is estimated. In step 2, the causal effects of interest are estimated using weights derived from the prior step's regression coefficient estimates. Statistical inferences obtained from this 2‐step estimation procedure are potentially problematic if the estimated standard errors of the causal effect estimates do not reflect the sampling uncertainty in the estimation of the weights. This study extends to ratio‐of‐mediator‐probability weighting analysis a solution to the 2‐step estimation problem by stacking the score functions from both steps. We derive the asymptotic variance‐covariance matrix for the indirect effect and direct effect 2‐step estimators, provide simulation results, and illustrate with an application study. Our simulation results indicate that the sampling uncertainty in the estimated weights should not be ignored. The standard error estimation using the stacking procedure offers a viable alternative to bootstrap standard error estimation. We discuss broad implications of this approach for causal analysis involving propensity score‐based weighting. 
    more » « less
  4. Predicting how different interventions will causally affect a specific individual is important in a variety of domains such as personalized medicine, public policy, and online marketing. There are a large number of methods to predict the effect of an existing intervention based on historical data from individuals who received it. However, in many settings it is important to predict the effects of novel interventions (e.g., a newly invented drug), which these methods do not address. Here, we consider zero-shot causal learning: predicting the personalized effects of a novel intervention. We propose CaML, a causal meta-learning framework which formulates the personalized prediction of each intervention’s effect as a task. CaML trains a single meta-model across thousands of tasks, each constructed by sampling an intervention, its recipients, and its nonrecipients. By leveraging both intervention information (e.g., a drug’s attributes) and individual features (e.g., a patient’s history), CaML is able to predict the personalized effects of novel interventions that do not exist at the time of training. Experimental results on real world datasets in large-scale medical claims and cell-line perturbations demonstrate the effectiveness of our approach. Most strikingly, CaML’s zero-shot predictions outperform even strong baselines trained directly on data from the test interventions. 
    more » « less
  5. Consider a policymaker who wants to decide which intervention to perform in order to change a currently undesirable situation. The policymaker has at her disposal a team of experts, each with their own understanding of the causal dependencies between different factors contributing to the outcome. The policymaker has varying degrees of confidence in the experts' opinions. She wants to combine their opinions in order to decide on the most effective intervention. We formally define the notion of an effective intervention, and then consider how experts' causal judgments can be combined in order to determine the most effective intervention. We define a notion of two causal models being \emph{compatible}, and show how compatible causal models can be combined. We then use it as the basis for combining experts causal judgments. We illustrate our approach on a number of real-life examples. 
    more » « less