To infer the treatment effect for a single treated unit using panel data, synthetic control (SC) methods construct a linear combination of control units’ outcomes that mimics the treated unit’s pre-treatment outcome trajectory. This linear combination is subsequently used to impute the counterfactual outcomes of the treated unit had it not been treated in the post-treatment period, and used to estimate the treatment effect. Existing SC methods rely on correctly modeling certain aspects of the counterfactual outcome generating mechanism and may require near-perfect matching of the pre-treatment trajectory. Inspired by proximal causal inference, we obtain two novel nonparametric identifying formulas for the average treatment effect for the treated unit: one is based on weighting, and the other combines models for the counterfactual outcome and the weighting function. We introduce the concept of covariate shift to SCs to obtain these identification results conditional on the treatment assignment. We also develop two treatment effect estimators based on these two formulas and generalized method of moments. One new estimator is doubly robust: it is consistent and asymptotically normal if at least one of the outcome and weighting models is correctly specified. We demonstrate the performance of the methods via simulations and apply them to evaluate the effectiveness of a pneumococcal conjugate vaccine on the risk of all-cause pneumonia in Brazil.
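The basic SC idea described above (fit a linear combination of control outcomes on the pre-treatment period, then use it to impute the treated unit's untreated outcomes afterwards) can be illustrated with a minimal sketch. This is not the proximal or doubly robust estimator proposed in the abstract; it uses plain unconstrained least squares on noiseless simulated data, and the function names, dimensions, weights, and effect size are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_sc_weights(controls_pre, treated_pre):
    """Least-squares weights so that controls_pre @ w approximates treated_pre.

    Assumption: unconstrained weights; many SC variants instead restrict
    the weights to be non-negative and sum to one.
    """
    w, *_ = np.linalg.lstsq(controls_pre, treated_pre, rcond=None)
    return w

def impute_counterfactual(controls_post, w):
    """Apply the fitted weights to post-treatment control outcomes."""
    return controls_post @ w

# Hypothetical simulated panel: rows are time periods, columns are control units.
rng = np.random.default_rng(0)
T_pre, T_post, n_controls = 30, 10, 4
controls = rng.normal(size=(T_pre + T_post, n_controls))

true_w = np.array([0.5, 0.3, 0.2, 0.0])   # illustrative weights
effect = 2.0                               # illustrative constant effect

# Untreated trajectory of the treated unit, plus the effect after treatment.
treated = controls @ true_w
treated[T_pre:] += effect

w = fit_sc_weights(controls[:T_pre], treated[:T_pre])
y0_hat = impute_counterfactual(controls[T_pre:], w)

# Average post-treatment gap between observed and imputed outcomes.
att_hat = float(np.mean(treated[T_pre:] - y0_hat))
print(w, att_hat)
```

Because the simulated data are noiseless and the pre-treatment fit is exact, the sketch recovers the weights and the effect up to floating-point error; with real data the pre-treatment match is imperfect, which is the setting the abstract's methods are meant to address.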
Network Synthetic Interventions: A Causal Framework for Panel Data Under Network Interference
We propose a generalization of the synthetic controls and synthetic interventions methodology
to incorporate network interference. We consider the estimation of unit-specific potential
outcomes from panel data in the presence of spillover across units and unobserved confounding.
Key to our approach is a novel latent factor model that takes into account network interference
and generalizes the factor models typically used in panel data settings. We propose an estimator,
Network Synthetic Interventions (NSI), and show that it consistently estimates the mean
outcomes for a unit under an arbitrary set of counterfactual treatments for the network. We
further establish that the estimator is asymptotically normal. We furnish two validity tests for
whether the NSI estimator reliably generalizes to produce accurate counterfactual estimates. We
provide a novel graph-based experiment design that guarantees the NSI estimator produces accurate
counterfactual estimates, and also analyze the sample complexity of the proposed design.
We conclude with simulations that corroborate our theoretical findings.
- Award ID(s): 2022448
- PAR ID: 10525305
- Publisher / Repository: https://arxiv.org/pdf/2210.11355
- Date Published:
- Format(s): Medium: X
- Institution: Massachusetts Institute of Technology
- Sponsoring Org: National Science Foundation
More Like this
-
Multi-agent dynamical systems refer to scenarios where multiple units (aka agents) interact with each other and evolve collectively over time. For instance, people’s health conditions are mutually influenced. Receiving vaccinations not only strengthens the long-term health status of one unit but also provides protection for those in their immediate surroundings. To make informed decisions in multi-agent dynamical systems, such as determining the optimal vaccine distribution plan, it is essential for decision-makers to estimate the continuous-time counterfactual outcomes. However, existing studies of causal inference over time rely on the assumption that units are mutually independent, which is not valid for multi-agent dynamical systems. In this paper, we aim to bridge this gap and study how to estimate counterfactual outcomes in multi-agent dynamical systems. Causal inference in a multi-agent dynamical system has unique challenges: 1) Confounders are time-varying and are present in both individual unit covariates and those of other units; 2) Units are affected by not only their own but also others’ treatments; 3) The treatments are naturally dynamic, such as receiving vaccines and boosters in a seasonal manner. To this end, we model a multi-agent dynamical system as a graph and propose a novel model called CF-GODE (CounterFactual Graph Ordinary Differential Equations). CF-GODE is a causal model that estimates continuous-time counterfactual outcomes in the presence of inter-dependencies between units. To facilitate continuous-time estimation, we propose Treatment-Induced GraphODE, a novel ordinary differential equation based on graph neural networks (GNNs), which can incorporate dynamical treatments as additional inputs to predict potential outcomes over time. To remove confounding bias, we propose two domain adversarial learning based objectives that learn balanced continuous representation trajectories, which are not predictive of treatments and interference. We further provide theoretical justification to prove their effectiveness. Experiments on two semi-synthetic datasets confirm that CF-GODE outperforms baselines on counterfactual estimation. We also provide extensive analyses to understand how our model works.
-
Standard estimators of the global average treatment effect can be biased in the presence of interference. This paper proposes regression adjustment estimators for removing bias due to interference in Bernoulli randomized experiments. We use a fitted model to predict the counterfactual outcomes of global control and global treatment. Our work differs from standard regression adjustments in that the adjustment variables are constructed from functions of the treatment assignment vector, and that we allow the researcher to use a collection of any functions correlated with the response, turning the problem of detecting interference into a feature engineering problem. We characterize the distribution of the proposed estimator in a linear model setting and connect the results to the standard theory of regression adjustments under SUTVA. We then propose an estimator that allows for flexible machine learning estimators to be used for fitting a nonlinear interference functional form. We propose conducting statistical inference via bootstrap and resampling methods, which allow us to sidestep the complicated dependences implied by interference and instead rely on empirical covariance structures. Such variance estimation relies on an exogeneity assumption akin to the standard unconfoundedness assumption invoked in observational studies. In simulation experiments, our methods are better at debiasing estimates than existing inverse propensity weighted estimators based on neighborhood exposure modeling. We use our method to reanalyze an experiment concerning weather insurance adoption conducted on a collection of villages in rural China.
-
Randomized experiments are widely used to estimate causal effects across many domains. However, classical causal inference approaches rely on independence assumptions that are violated by network interference, when the treatment of one individual influences the outcomes of others. All existing approaches require at least approximate knowledge of the network, which may be unavailable or costly to collect. We consider the task of estimating the total treatment effect (TTE), the average difference between the outcomes when the whole population is treated versus when the whole population is untreated. By leveraging a staggered rollout design, in which treatment is incrementally given to random subsets of individuals, we derive unbiased estimators for TTE that do not rely on any prior structural knowledge of the network, as long as the network interference effects are constrained to low-degree interactions among neighbors of an individual. We derive bounds on the variance of the estimators, and we show in experiments that our estimator performs well against baselines on simulated data. Central to our theoretical contribution is a connection between staggered rollout observations and polynomial extrapolation.
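One way to read the link between staggered rollout observations and polynomial extrapolation is sketched below. The assumption, which is ours for illustration and not spelled out in the abstract, is that when interference involves only low-degree interactions, the expected average outcome is a polynomial of degree at most d in the treatment fraction p. Observing it at d + 1 rollout stages then determines the polynomial, which can be extrapolated to p = 1 (everyone treated). The function name, the rollout fractions, and the example polynomial are hypothetical, and the sketch uses exact expected outcomes, not sampled data.

```python
from fractions import Fraction

def lagrange_extrapolate(ps, values, target):
    """Evaluate at `target` the unique polynomial through the points (ps[i], values[i])."""
    total = Fraction(0)
    for i, (p_i, v_i) in enumerate(zip(ps, values)):
        # Lagrange basis coefficient for point i, evaluated at the target.
        coeff = Fraction(1)
        for j, p_j in enumerate(ps):
            if j != i:
                coeff *= Fraction(target - p_j) / Fraction(p_i - p_j)
        total += coeff * v_i
    return total

def mean_outcome(p):
    """Hypothetical expected average outcome, a degree-2 polynomial in p."""
    return 1 + 2 * p + 3 * p * p

# Three rollout stages (treatment fractions 0%, 10%, 20%) suffice for degree 2.
ps = [Fraction(0), Fraction(1, 10), Fraction(2, 10)]
values = [mean_outcome(p) for p in ps]

# TTE = (mean outcome with everyone treated) - (mean outcome with no one treated).
tte = lagrange_extrapolate(ps, values, Fraction(1)) - values[0]
print(tte)
```

Exact rational arithmetic is used so the interpolation carries no rounding error. With noisy observations the extrapolation weights far from the observed fractions can be large, which is one reason variance bounds matter for estimators of this kind.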
-
This paper proposes a method for estimating the effect of a policy intervention on an outcome over time. We train recurrent neural networks (RNNs) on the history of control unit outcomes to learn a useful representation for predicting future outcomes. The learned representation of control units is then applied to the treated units for predicting counterfactual outcomes. RNNs are specifically structured to exploit temporal dependencies in panel data and are able to learn negative and non-linear interactions between control unit outcomes. We apply the method to the problem of estimating the long-run impact of US homestead policy on public school spending.