Title: Network experiment designs for inferring causal effects under interference
Current approaches to A/B testing in networks focus on limiting interference, the concern that treatment effects can “spill over” from treatment nodes to control nodes and lead to biased causal effect estimation. In the presence of interference, two main types of causal effects are direct treatment effects and total treatment effects. In this paper, we propose two network experiment designs that increase the accuracy of direct and total effect estimation in network experiments by minimizing interference between treatment and control units. For direct treatment effect estimation, we present a framework that takes advantage of independent sets and assigns treatment and control only to a set of non-adjacent nodes in a graph, in order to disentangle peer effects from direct treatment effect estimation. For total treatment effect estimation, our framework combines weighted graph clustering and cluster matching approaches to jointly minimize interference and selection bias. Through a series of simulated experiments on synthetic and real-world network datasets, we show that our designs significantly increase the accuracy of direct and total treatment effect estimation in network experiments.
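The independent-set idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the independent set is built, so the random-order greedy scan below is an assumption, and the names `greedy_independent_set` and `assign_treatment`, the even treatment/control split, and the toy path graph are all hypothetical. The sketch only shows the property the abstract describes: no two nodes that receive an assignment are adjacent.

```python
import random


def greedy_independent_set(adjacency, rng):
    """Return a maximal independent set using a random-order greedy scan.

    `adjacency` maps each node to an iterable of its neighbors and is
    assumed to be symmetric (undirected graph).
    """
    nodes = list(adjacency)
    rng.shuffle(nodes)
    chosen = set()
    blocked = set()
    for node in nodes:
        if node in blocked:
            continue
        # Keep the node and block its neighbors so they are never chosen.
        chosen.add(node)
        blocked.add(node)
        blocked.update(adjacency[node])
    return chosen


def assign_treatment(adjacency, seed=0):
    """Randomly split an independent set into treatment and control.

    Nodes outside the independent set receive no assignment and stay
    out of the experiment (an assumption of this sketch).
    """
    rng = random.Random(seed)
    independent = sorted(greedy_independent_set(adjacency, rng))
    rng.shuffle(independent)
    half = len(independent) // 2
    assignment = {node: "treatment" for node in independent[:half]}
    assignment.update({node: "control" for node in independent[half:]})
    return assignment


# Toy example (hypothetical): a path graph 0-1-2-3-4-5.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
assignment = assign_treatment(adjacency, seed=42)
print(assignment)
```

Because every assigned node has only unassigned neighbors, no treated node is directly connected to a control node, which is the separation the design relies on.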
Award ID(s):
2047899
NSF-PAR ID:
10433402
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Frontiers in Big Data
Volume:
6
ISSN:
2624-909X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Current approaches to A/B testing in networks focus on limiting interference, the concern that treatment effects can “spill over” from treatment nodes to control nodes and lead to biased causal effect estimation. Prominent methods for network experiment design rely on two-stage randomization, in which sparsely-connected clusters are identified and cluster randomization dictates the node assignment to treatment and control. Here, we show that cluster randomization does not ensure sufficient node randomization and it can lead to selection bias in which treatment and control nodes represent different populations of users. To address this problem, we propose a principled framework for network experiment design which jointly minimizes interference and selection bias. We introduce the concepts of edge spillover probability and cluster matching and demonstrate their importance for designing network A/B tests. Our experiments on a number of real-world datasets show that our proposed framework leads to significantly lower error in causal effect estimation than existing solutions.
  2. Network interference, where the outcome of an individual is affected by the treatment assignment of those in their social network, is pervasive in real-world settings. However, it poses a challenge to estimating causal effects. We consider the task of estimating the total treatment effect (TTE), or the difference between the average outcomes of the population when everyone is treated versus when no one is, under network interference. Under a Bernoulli randomized design, we provide an unbiased estimator for the TTE when network interference effects are constrained to low-order interactions among neighbors of an individual. We make no assumptions on the graph other than bounded degree, allowing for well-connected networks that may not be easily clustered. We derive a bound on the variance of our estimator and show in simulated experiments that it performs well compared with standard estimators for the TTE. We also derive a minimax lower bound on the mean squared error of our estimator, which suggests that the difficulty of estimation can be characterized by the degree of interactions in the potential outcomes model. We also prove that our estimator is asymptotically normal under boundedness conditions on the network degree and potential outcomes model. Central to our contribution is a new framework for balancing model flexibility and statistical complexity as captured by this low-order interactions structure.
  3. Recent work on legislative politics has documented complex patterns of interaction and collaboration through the lens of network analysis. In a largely separate vein of research, the field experiment, with many applications in state legislatures, has emerged as an important approach in establishing causal identification in the study of legislative politics. The stable unit treatment value assumption (SUTVA), the assumption that a unit’s outcome is unaffected by other units’ treatment statuses, is required in conventional approaches to causal inference with experiments. When SUTVA is violated via networked social interaction, treatment effects spread to control units through the network structure. We review recently developed methods that can be used to account for interference in the analysis of data from field experiments on state legislatures. The methods we review require the researcher to specify a spillover model, according to which legislators influence each other, and specify the network through which spillover occurs. We discuss these and other specification steps in detail. We find mixed evidence for spillover effects in data from two previously published field experiments. Our replication analyses illustrate how researchers can use recently developed methods to test for interference effects, and support the case for considering interference effects in experiments on state legislatures.
  4. Randomized experiments are widely used to estimate causal effects across many domains. However, classical causal inference approaches rely on independence assumptions that are violated by network interference, when the treatment of one individual influences the outcomes of others. All existing approaches require at least approximate knowledge of the network, which may be unavailable or costly to collect. We consider the task of estimating the total treatment effect (TTE), the average difference between the outcomes when the whole population is treated versus when the whole population is untreated. By leveraging a staggered rollout design, in which treatment is incrementally given to random subsets of individuals, we derive unbiased estimators for TTE that do not rely on any prior structural knowledge of the network, as long as the network interference effects are constrained to low-degree interactions among neighbors of an individual. We derive bounds on the variance of the estimators, and we show in experiments that our estimator performs well against baselines on simulated data. Central to our theoretical contribution is a connection between staggered rollout observations and polynomial extrapolation. 
  5. We study fairness in supervised few-shot meta-learning models that are sensitive to discrimination (or bias) in historical data. A machine learning model trained on biased data tends to make unfair predictions for users from minority groups. Although this problem has been studied before, existing methods mainly aim to detect and control the dependency effect of the protected variables (e.g., race, gender) on target prediction based on a large amount of training data. These approaches carry two major drawbacks: (1) they lack a global cause-effect visualization for all variables; (2) they do not generalize both accuracy and fairness to unseen tasks. In this work, we first discover discrimination from data using a causal Bayesian knowledge graph, which not only demonstrates the dependency of the protected variable on the target but also indicates causal effects between all variables. Next, we develop a novel algorithm based on risk difference to quantify the discriminatory influence of each protected variable in the graph. Furthermore, to protect predictions from unfairness, a fast-adapted bias-control approach in meta-learning is proposed, which efficiently mitigates statistical disparity for each task and thus ensures independence between protected attributes and predictions based on biased and few-shot data samples. Distinct from existing meta-learning models, group unfairness of tasks is efficiently reduced by leveraging the mean difference between (un)protected groups for regression problems. Through extensive experiments on both synthetic and real-world data sets, we demonstrate that our proposed unfairness discovery and prevention approaches efficiently detect discrimination and mitigate biases on model output, and generalize both accuracy and fairness to unseen tasks with a limited amount of training samples.
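For reference, the total treatment effect (TTE) that items 2 and 4 above describe in words can be written as a formula. The notation is an assumption introduced here, not taken from those abstracts: there are n individuals, z is the vector of treatment assignments for the whole population, and Y_i(z) is the potential outcome of individual i under z.

```latex
\mathrm{TTE} \;=\; \frac{1}{n} \sum_{i=1}^{n} \Bigl( Y_i(\mathbf{1}) - Y_i(\mathbf{0}) \Bigr)
```

Here \mathbf{1} denotes the assignment in which everyone is treated and \mathbf{0} the assignment in which no one is. Under interference, Y_i depends on the full vector z rather than only on individual i's own treatment, which is why neither of these two assignments can be observed directly in a single experiment that treats only part of the population.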