Title: The Future Strikes Back: Using Future Treatments to Detect and Reduce Hidden Bias
Conventional advice discourages controlling for postoutcome variables in regression analysis. By contrast, we show that controlling for commonly available postoutcome (i.e., future) values of the treatment variable can help detect, reduce, and even remove omitted variable bias (unobserved confounding). The premise is that the same unobserved confounder that affects treatment also affects the future value of the treatment. Future treatments thus proxy for the unmeasured confounder, and researchers can exploit these proxy measures productively. We establish several new results: Regarding a commonly assumed data-generating process involving future treatments, we (1) introduce a simple new approach and show that it strictly reduces bias, (2) elaborate on existing approaches and show that they can increase bias, (3) assess the relative merits of alternative approaches, and (4) analyze true state dependence and selection as key challenges. (5) Importantly, we also introduce a new nonparametric test that uses future treatments to detect hidden bias even when future-treatment estimation fails to reduce bias. We illustrate these results empirically with an analysis of the effect of parental income on children’s educational attainment.
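The premise stated in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's estimator or its data-generating process. It assumes a simple linear Gaussian setup in which one unobserved confounder `u` drives the current treatment, the future treatment, and the outcome, with no state dependence and no selection. All names (`simulate`, `ols_slope`, `t_now`, `t_future`) and parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=200_000, beta=1.0, gamma=1.0, seed=0):
    """Draw data from an assumed toy model (not the paper's exact setup).

    u        : unobserved confounder
    t_now    : treatment, affected by u
    y        : outcome, affected by t_now (true effect beta) and by u
    t_future : postoutcome treatment, affected by the same u only
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)
    t_now = u + rng.normal(size=n)
    y = beta * t_now + gamma * u + rng.normal(size=n)
    t_future = u + rng.normal(size=n)
    return y, t_now, t_future


def ols_slope(y, *regressors):
    """Return the OLS coefficient on the first regressor (intercept included)."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


y, t_now, t_future = simulate()

# Regression that ignores the confounder entirely.
naive = ols_slope(y, t_now)

# Same regression with the future treatment added as a proxy control.
adjusted = ols_slope(y, t_now, t_future)

print(f"true effect: 1.000  naive: {naive:.3f}  with future treatment: {adjusted:.3f}")
```

Under these assumed parameters the naive slope converges to 1.5 and the slope with the future treatment as a control converges to 4/3, so the proxy reduces the bias without removing it. The example deliberately leaves out state dependence and selection, which the abstract names as the key challenges for this kind of adjustment.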
Award ID(s):
2042875
PAR ID:
10348443
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Sociological Methods & Research
Volume:
51
Issue:
3
ISSN:
0049-1241
Page Range / eLocation ID:
1014 to 1051
Sponsoring Org:
National Science Foundation
More Like this
  1. Unobserved confounding presents a major threat to causal inference in observational studies. Recently, several authors have suggested that this problem could be overcome in a shared confounding setting where multiple treatments are independent given a common latent confounder. It has been shown that under a linear Gaussian model for the treatments, the causal effect is not identifiable without parametric assumptions on the outcome model. In this note, we show that the causal effect is indeed identifiable if we assume a general binary choice model for the outcome with a non-probit link. Our identification approach is based on the incongruence between Gaussianity of the treatments and latent confounder and non-Gaussianity of a latent outcome variable. We further develop a two-step likelihood-based estimation procedure.
  2. Weighting methods are popular tools for estimating causal effects, and assessing their robustness under unobserved confounding is important in practice. Current approaches to sensitivity analyses rely on bounding a worst-case error from omitting a confounder. In this paper, we introduce a new sensitivity model called the variance-based sensitivity model, which instead bounds the distributional differences that arise in the weights from omitting a confounder. The variance-based sensitivity model can be parameterized by an R² parameter that is both standardized and bounded. We demonstrate, both empirically and theoretically, that the variance-based sensitivity model provides improvements on the stability of the sensitivity analysis procedure over existing methods. We show that by moving away from worst-case bounds, we are able to obtain more interpretable and informative bounds. We illustrate our proposed approach on a study examining blood mercury levels using the National Health and Nutrition Examination Survey.
  3. Causal inference from observational data has attracted considerable attention among researchers. One main obstacle is the handling of confounders. As direct measurement of confounders may not be feasible, recent methods seek to address the confounding bias via proxy variables, i.e., covariates postulated to be conducive to the inference of latent confounders. However, the selected proxies may scramble both confounders and post-treatment variables in practice, which risks biasing the estimation by controlling for variables affected by the treatment. In this paper, we systematically investigate the bias due to latent post-treatment variables, i.e., latent post-treatment bias, in causal effect estimation. Specifically, we first derive the bias when selected proxies scramble both latent confounders and post-treatment variables, which we demonstrate can be arbitrarily bad. We then propose a Confounder-identifiable VAE (CiVAE) to address the bias. Based on a mild assumption that the prior of latent variables that generate the proxy belongs to a general exponential family with at least one invertible sufficient statistic in the factorized part, CiVAE individually identifies latent confounders and latent post-treatment variables up to bijective transformations. We then prove that with individual identification, the intractable disentanglement problem of latent confounders and post-treatment variables can be transformed into a tractable independence test problem, even though arbitrary dependence may exist among them. Finally, we prove that the true causal effects can be estimated without bias using the transformed confounders inferred by CiVAE. Experiments on both simulated and real-world datasets demonstrate significantly improved robustness of CiVAE.
  4. We study regressions with multiple treatments and a set of controls that is flexible enough to purge omitted variable bias. We show these regressions generally fail to estimate convex averages of heterogeneous treatment effects—instead, estimates of each treatment’s effect are contaminated by nonconvex averages of the effects of other treatments. We discuss three estimation approaches that avoid such contamination bias, including the targeting of easiest-to-estimate weighted average effects. A reanalysis of nine empirical applications finds economically and statistically meaningful contamination bias in observational studies; contamination bias in experimental studies is more limited due to smaller variability in propensity scores. 
  5. Networked observational data presents new opportunities for learning individual causal effects, which play an indispensable role in decision making. Such data poses the challenge of confounding bias. Previous work presents two desiderata for handling confounding bias. On the treatment group level, we aim to balance the distributions of confounder representations. On the individual level, it is desirable to capture patterns of hidden confounders that predict treatment assignments. Existing methods show the potential of utilizing network information to handle confounding bias, but they only try to satisfy one of the two desiderata. This is because the two desiderata seem to contradict each other: when the two distributions of confounder representations overlap heavily, it becomes difficult to discriminate between the treated and the controlled. In this work, we formulate the two desiderata as a minimax game. We propose IGNITE, which learns representations of confounders from networked observational data and is trained by a minimax game to achieve the two desiderata. Experiments verify the efficacy of IGNITE on two datasets under various settings.