skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on April 21, 2026

Title: Using Multiple Outcomes to Improve the Synthetic Control Method
Abstract When there are multiple outcome series of interest, Synthetic Control analyses typically proceed by estimating separate weights for each outcome. In this paper, we instead propose estimating a common set of weights across outcomes, by balancing either a vector of all outcomes or an index or average of them. Under a low-rank factor model, we show that these approaches lead to lower bias bounds than separate weights, and that averaging leads to further gains when the number of outcomes grows. We illustrate this via a re-analysis of the impact of the Flint water crisis on educational outcomes.  more » « less
Award ID(s):
2243822
PAR ID:
10646734
Author(s) / Creator(s):
; ;
Publisher / Repository:
Harvard and MIT
Date Published:
Journal Name:
Review of Economics and Statistics
ISSN:
0034-6535
Page Range / eLocation ID:
1 to 29
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract We provide a novel characterization of augmented balancing weights, also known as automatic debiased machine learning. These popular doubly robust estimators combine outcome modelling with balancing weights—weights that achieve covariate balance directly instead of estimating and inverting the propensity score. When the outcome and weighting models are both linear in some (possibly infinite) basis, we show that the augmented estimator is equivalent to a single linear model with coefficients that combine those of the original outcome model with those from unpenalized ordinary least-squares (OLS). Under certain choices of regularization parameters, the augmented estimator in fact collapses to the OLS estimator alone. We then extend these results to specific outcome and weighting models. We first show that the augmented estimator that uses (kernel) ridge regression for both outcome and weighting models is equivalent to a single, undersmoothed (kernel) ridge regression—implying a novel analysis of undersmoothing. When the weighting model is instead lasso-penalized, we demonstrate a familiar ‘double selection’ property. Our framework opens the black box on this increasingly popular class of estimators, bridges the gap between existing results on the semiparametric efficiency of undersmoothed and doubly robust estimators, and provides new insights into the performance of augmented balancing weights. 
    more » « less
  2. Summary The problem of estimating the average treatment effects is important when evaluating the effectiveness of medical treatments or social intervention policies. Most of the existing methods for estimating the average treatment effect rely on some parametric assumptions about the propensity score model or the outcome regression model one way or the other. In reality, both models are prone to misspecification, which can have undue influence on the estimated average treatment effect. We propose an alternative robust approach to estimating the average treatment effect based on observational data in the challenging situation when neither a plausible parametric outcome model nor a reliable parametric propensity score model is available. Our estimator can be considered as a robust extension of the popular class of propensity score weighted estimators. This approach has the advantage of being robust, flexible, data adaptive, and it can handle many covariates simultaneously. Adopting a dimension reduction approach, we estimate the propensity score weights semiparametrically by using a non-parametric link function to relate the treatment assignment indicator to a low-dimensional structure of the covariates which are formed typically by several linear combinations of the covariates. We develop a class of consistent estimators for the average treatment effect and study their theoretical properties. We demonstrate the robust performance of the estimators on simulated data and a real data example of investigating the effect of maternal smoking on babies’ birth weight. 
    more » « less
  3. Abstract Although the existing causal inference literature focuses on the forward-looking perspective by estimating effects of causes, the backward-looking perspective can provide insights into causes of effects. In backward-looking causal inference, the probability of necessity measures the probability that a certain event is caused by the treatment given the observed treatment and outcome. Most existing results focus on binary outcomes. Motivated by applications with ordinal outcomes, we propose a general definition of the probability of necessity. However, identifying the probability of necessity is challenging because it involves the joint distribution of the potential outcomes. We propose a novel assumption of monotonic incremental treatment effect to identify the probability of necessity with ordinal outcomes. We also discuss the testable implications of this key identification assumption. When it fails, we derive explicit formulas of the sharp large-sample bounds on the probability of necessity. 
    more » « less
  4. Abstract Network interference, where the outcome of an individual is affected by the treatment assignment of those in their social network, is pervasive in real-world settings. However, it poses a challenge to estimating causal effects. We consider the task of estimating the total treatment effect (TTE), or the difference between the average outcomes of the population when everyone is treated versus when no one is, under network interference. Under a Bernoulli randomized design, we provide an unbiased estimator for the TTE when network interference effects are constrained to low-order interactions among neighbors of an individual. We make no assumptions on the graph other than bounded degree, allowing for well-connected networks that may not be easily clustered. We derive a bound on the variance of our estimator and show in simulated experiments that it performs well compared with standard estimators for the TTE. We also derive a minimax lower bound on the mean squared error of our estimator, which suggests that the difficulty of estimation can be characterized by the degree of interactions in the potential outcomes model. We also prove that our estimator is asymptotically normal under boundedness conditions on the network degree and potential outcomes model. Central to our contribution is a new framework for balancing model flexibility and statistical complexity as captured by this low-order interactions structure. 
    more » « less
  5. Summary A fundamental challenge in semi-supervised learning lies in the observed data’s disproportional size when compared with the size of the data collected with missing outcomes. An implicit understanding is that the dataset with missing outcomes, being significantly larger, ought to improve estimation and inference. However, it is unclear to what extent this is correct. We illustrate one clear benefit: root-$$n$$ inference of the outcome’s mean is possible while only requiring a consistent estimation of the outcome, possibly at a rate slower than root $$n$$. This is achieved by a novel $$k$$-fold, cross-fitted, double robust estimator. We discuss both linear and nonlinear outcomes. Such an estimator is particularly suited for models that naturally do not admit root-$$n$$ consistency, such as high-dimensional, nonparametric or semiparametric models. We apply our methods to estimating heterogeneous treatment effects. 
    more » « less