NSF PAR Search | NSF Public Access Repository


Causal What-If and How-To Analysis Using HYPER

Fangzhu Shen; Kayvon Heravi; Oscar Gomez; Sainyam Galhotra; Amir Gilad; Sudeepa Roy; Babak Salimi (January 2023, IEEE International Conference on Data Engineering (ICDE 2023), Demonstration Track)

Full Text Available
HypeR: Hypothetical Reasoning With What-If and How-To Queries Using a Probabilistic Causal Approach

https://doi.org/10.1145/3514221.3526149

Galhotra, Sainyam; Gilad, Amir; Roy, Sudeepa; Salimi, Babak (January 2022, SIGMOD'22: International Conference on Management of Data)

Full Text Available
FLAME: A Fast Large-scale Almost Matching Exactly Approach to Causal Inference

Wang, Tianyu; Morucci, Marco; Awan, M. Usaid; Liu, Yameng; Roy, Sudeepa; Rudin, Cynthia; Volfovsky, Alexander (January 2021, Journal of Machine Learning Research)
A classical problem in causal inference is that of matching, where treatment units need to be matched to control units based on covariate information. In this work, we propose a method that computes high quality almost-exact matches for high-dimensional categorical datasets. This method, called FLAME (Fast Large-scale Almost Matching Exactly), learns a distance metric for matching using a hold-out training data set. In order to perform matching efficiently for large datasets, FLAME leverages techniques that are natural for query processing in the area of database management, and two implementations of FLAME are provided: the first uses SQL queries and the second uses bit-vector techniques. The algorithm starts by constructing matches of the highest quality (exact matches on all covariates), and successively eliminates variables in order to match exactly on as many variables as possible, while still maintaining interpretable high-quality matches and balance between treatment and control groups. We leverage these high quality matches to estimate conditional average treatment effects (CATEs). Our experiments show that FLAME scales to huge datasets with millions of observations where existing state-of-the-art methods fail, and that it achieves significantly better performance than other matching methods.
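The matching loop described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (the abstract mentions SQL and bit-vector versions). The function name `flame_sketch`, the record layout (dicts with keys `"t"` and `"y"`), and the caller-supplied `drop_order` are all assumptions made for illustration; in particular, FLAME chooses which covariate to eliminate using a hold-out training set, which this sketch replaces with a fixed order.

```python
from collections import defaultdict
from statistics import mean


def flame_sketch(units, covariates, drop_order):
    """Simplified almost-exact matching on categorical covariates.

    units: list of dicts holding each covariate plus "t" (0/1) and "y".
    covariates: covariate names to match on in the first round.
    drop_order: covariates to eliminate, one per round. This is a
        hypothetical stand-in for the learned, hold-out-based choice.
    Returns a list of (covariates_used, values, cate, member_indices).
    """
    remaining = set(range(len(units)))
    active = list(covariates)
    to_drop = list(drop_order)
    groups = []
    while remaining and active:
        # Group the still-unmatched units by their values on the active covariates.
        buckets = defaultdict(list)
        for i in sorted(remaining):
            key = tuple(units[i][c] for c in active)
            buckets[key].append(i)
        # A bucket is a matched group only if it has both treated and control units.
        for key, members in buckets.items():
            treated = [units[i]["y"] for i in members if units[i]["t"] == 1]
            control = [units[i]["y"] for i in members if units[i]["t"] == 0]
            if treated and control:
                cate = mean(treated) - mean(control)
                groups.append((tuple(active), key, cate, members))
                remaining -= set(members)
        if not to_drop:
            break
        # Eliminate one covariate and try again on the leftover units.
        dropped = to_drop.pop(0)
        if dropped in active:
            active.remove(dropped)
    return groups


units = [
    {"a": 1, "b": 0, "t": 1, "y": 5.0},
    {"a": 1, "b": 0, "t": 0, "y": 3.0},
    {"a": 0, "b": 1, "t": 1, "y": 4.0},
    {"a": 0, "b": 0, "t": 0, "y": 1.0},
]
result = flame_sketch(units, ["a", "b"], ["b"])
```

In this toy input, the first two units match exactly on both covariates; the last two only match once `b` is eliminated, so they form a lower-quality group in the second round.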
Full Text Available
Causal Relational Learning

https://doi.org/10.1145/3318464.3389759

Salimi, Babak; Parikh, Harsh; Kayali, Moe; Getoor, Lise; Roy, Sudeepa; Suciu, Dan (May 2020, SIGMOD)

Causal inference is at the heart of empirical research in natural and social sciences and is critical for scientific discovery and informed decision making. The gold standard in causal inference is performing randomized controlled trials; unfortunately these are not always feasible due to ethical, legal, or cost constraints. As an alternative, methodologies for causal inference from observational data have been developed in statistical studies and social sciences. However, existing methods critically rely on restrictive assumptions such as the study population consisting of homogeneous elements that can be represented in a single flat table, where each row is referred to as a unit. In contrast, in many real-world settings, the study domain naturally consists of heterogeneous elements with complex relational structure, where the data is naturally represented in multiple related tables. In this paper, we present a formal framework for causal inference from such relational data. We propose a declarative language called CaRL for capturing causal background knowledge and assumptions, and specifying causal queries using simple Datalog-like rules. CaRL provides a foundation for inferring causality and reasoning about the effect of complex interventions in relational domains. We present an extensive experimental evaluation on real relational data to illustrate the applicability of CaRL in social sciences and healthcare.
Full Text Available
Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference

Awan, Usaid; Morucci, Marco; Orlandi, Vittorio; Roy, Sudeepa; Rudin, Cynthia; Volfovsky, Alexander (January 2020, Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics)
We propose a matching method that recovers direct treatment effects from randomized experiments where units are connected in an observed network, and units that share edges can potentially influence each other's outcomes. Traditional treatment effect estimators for randomized experiments are biased and error prone in this setting. Our method matches units almost exactly on counts of unique subgraphs within their neighborhood graphs. The matches that we construct are interpretable and high-quality. Our method can be extended easily to accommodate additional unit-level covariate information. We show empirically that our method performs better than other existing methodologies for this problem, while producing meaningful, interpretable results.
Full Text Available
Adaptive Hyper-box Matching for Interpretable Individualized Treatment Effect Estimation

Morucci, Marco; Orlandi, Vittorio; Roy, Sudeepa; Rudin, Cynthia; Volfovsky, Alexander (January 2020, Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI))
We propose a matching method for observational data that matches units with others in unit-specific, hyper-box-shaped regions of the covariate space. These regions are large enough that many matches are created for each unit and small enough that the treatment effect is roughly constant throughout. The regions are found as either the solution to a mixed integer program, or using a (fast) approximation algorithm. The result is an interpretable and tailored estimate of the causal effect for each unit.
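As a rough illustration of the estimation step only, the Python sketch below computes a treatment effect from the units that fall inside a given hyper-box. It assumes the box is already known: the abstract says the regions come from a mixed integer program or an approximation algorithm, neither of which is reproduced here. The names `in_box` and `box_effect` and the tuple-based data layout are hypothetical.

```python
from statistics import mean


def in_box(x, lower, upper):
    """True if every coordinate of x lies within [lower, upper]."""
    return all(lo <= v <= hi for v, lo, hi in zip(x, lower, upper))


def box_effect(X, t, y, lower, upper):
    """Mean treated outcome minus mean control outcome inside the box.

    X: list of covariate tuples; t: 0/1 treatment flags; y: outcomes.
    lower, upper: per-covariate bounds of an (assumed, precomputed) box.
    Returns None when the box lacks treated or control units.
    """
    treated = [yi for xi, ti, yi in zip(X, t, y)
               if ti == 1 and in_box(xi, lower, upper)]
    control = [yi for xi, ti, yi in zip(X, t, y)
               if ti == 0 and in_box(xi, lower, upper)]
    if not treated or not control:
        return None
    return mean(treated) - mean(control)


X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.9), (0.15, 0.25)]
t = [1, 0, 1, 0]
y = [3.0, 1.0, 10.0, 2.0]
effect = box_effect(X, t, y, lower=(0.0, 0.0), upper=(0.3, 0.3))
```

The estimate is interpretable in the sense the abstract describes: the box bounds state exactly which covariate ranges define the matched units.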
Full Text Available
iQCAR: inter-Query Contention Analyzer for Data Analytics Frameworks

https://doi.org/10.1145/3299869.3319904

Kalmegh, Prajakta; Babu, Shivnath; Roy, Sudeepa (January 2019, ACM SIGMOD International Conference on Management of Data (SIGMOD))

Full Text Available
Interpretable Almost-Exact Matching for Causal Inference

Dieng, Awa; Liu, Yameng; Roy, Sudeepa; Rudin, Cynthia; Volfovsky, Alexander (January 2019, International Conference on Artificial Intelligence and Statistics (AISTATS))

Full Text Available
Interpretable Almost Matching Exactly With Instrumental Variables

Awan, M Usaid; Liu, Yameng; Morucci, Marco; Roy, Sudeepa; Rudin, Cynthia; Volfovsky, Alexander (January 2019, Conference on Uncertainty in Artificial Intelligence (UAI))

Full Text Available
iQCAR: A Demonstration of an Inter-Query Contention Analyzer for Cluster Computing Frameworks

https://doi.org/10.1145/3183713.3193567

Kalmegh, Prajakta; Lundberg, Harrison; Xu, Frederick; Babu, Shivnath; Roy, Sudeepa (January 2018, Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018)

Full Text Available
