- Editors:
- Scholkopf, Bernhard; Uhler, Caroline; Zhang, Kun
- Award ID(s):
- 1945266
- Publication Date:
- NSF-PAR ID:
- 10334958
- Journal Name:
- First Conference on Causal Learning and Reasoning, PMLR
- Volume:
- 140
- Page Range or eLocation-ID:
- 1-35
- Sponsoring Org:
- National Science Foundation
More Like this
-
Windecker, Saras (Ed.)1. The ecological and environmental science communities have embraced machine learning (ML) for empirical modelling and prediction. However, going beyond prediction to draw insights into underlying functional relationships between response variables and environmental ‘drivers’ is less straightforward. Deriving ecological insights from fitted ML models requires techniques to extract the ‘learning’ hidden in the ML models. 2. We revisit the theoretical background and effectiveness of four approaches for deriving insights from ML: ranking independent variable importance (Gini importance, GI; permutation importance, PI; split importance, SI; and conditional permutation importance, CPI), and two approaches for inference of bivariate functional relationships (partial dependence plots, PDP; and accumulated local effect plots, ALE). We also explore the use of a surrogate model for visualization and interpretation of complex multi-variate relationships between response variables and environmental drivers. We examine the challenges and opportunities for extracting ecological insights with these interpretation approaches. Specifically, we aim to improve interpretation of ML models by investigating how effectiveness relates to (a) interpretation algorithm, (b) sample size and (c) the presence of spurious explanatory variables. 3. We base the analysis on simulations with known underlying functional relationships between response and predictor variables, with added white noise and the presence of correlatedmore »
-
Tests of conditional independence (CI) of ran- dom variables play an important role in ma- chine learning and causal inference. Of partic- ular interest are kernel-based CI tests which allow us to test for independence among ran- dom variables with complex distribution func- tions. The efficacy of a CI test is measured in terms of its power and its calibratedness. We show that the Kernel CI Permutation Test (KCIPT) suffers from a loss of calibratedness as its power is increased by increasing the number of bootstraps. To address this limita- tion, we propose a novel CI test, called Self- Discrepancy Conditional Independence Test (SDCIT). SDCIT uses a test statistic that is a modified unbiased estimate of maximum mean discrepancy (MMD), the largest difference in the means of features of the given sample and its permuted counterpart in the kernel-induced Hilbert space. We present results of experi- ments that demonstrate SDCIT is, relative to the other methods: (i) competitive in terms of its power and calibratedness, outperforming other methods when the number of condition- ing variables is large; (ii) more robust with re- spect to the choice of the kernel function; and (iii) competitive in run time.
-
Abstract Randomized experiments are the gold standard for causal inference and enable unbiased estimation of treatment effects. Regression adjustment provides a convenient way to incorporate covariate information for additional efficiency. This article provides a unified account of its utility for improving estimation efficiency in multiarmed experiments. We start with the commonly used additive and fully interacted models for regression adjustment in estimating average treatment effects (ATE), and clarify the trade-offs between the resulting ordinary least squares (OLS) estimators in terms of finite sample performance and asymptotic efficiency. We then move on to regression adjustment based on restricted least squares (RLS), and establish for the first time its properties for inferring ATE from the design-based perspective. The resulting inference has multiple guarantees. First, it is asymptotically efficient when the restriction is correctly specified. Second, it remains consistent as long as the restriction on the coefficients of the treatment indicators, if any, is correctly specified and separate from that on the coefficients of the treatment-covariate interactions. Third, it can have better finite sample performance than the unrestricted counterpart even when the restriction is moderately misspecified. It is thus our recommendation when the OLS fit of the fully interacted regression risks large finitemore »
-
We propose a general method for constructing confidence sets and hypothesis tests that have finite-sample guarantees without regularity conditions. We refer to such procedures as “universal.” The method is very simple and is based on a modified version of the usual likelihood-ratio statistic that we call “the split likelihood-ratio test” (split LRT) statistic. The (limiting) null distribution of the classical likelihood-ratio statistic is often intractable when used to test composite null hypotheses in irregular statistical models. Our method is especially appealing for statistical inference in these complex setups. The method we suggest works for any parametric model and also for some nonparametric models, as long as computing a maximum-likelihood estimator (MLE) is feasible under the null. Canonical examples arise in mixture modeling and shape-constrained inference, for which constructing tests and confidence sets has been notoriously difficult. We also develop various extensions of our basic methods. We show that in settings when computing the MLE is hard, for the purpose of constructing valid tests and intervals, it is sufficient to upper bound the maximum likelihood. We investigate some conditions under which our methods yield valid inferences under model misspecification. Further, the split LRT can be used with profile likelihoods to dealmore »
-
Summary In many biomedical studies, we are interested in comparing treatment effects with an inherent ordering. We propose a quadratic score test (QST) based on a quadratic inference function for detecting an order in treatment effects for correlated data. The quadratic inference function is similar to the negative of a log-likelihood, and it provides test statistics in the spirit of a χ2-test for testing nested hypotheses as well as for assessing the goodness of fit of model assumptions. Under the null hypothesis of no order restriction, it is shown that the QST statistic has a Wald-type asymptotic representation and that the asymptotic distribution of the QST statistic is a weighted χ2-distribution. Furthermore, an asymptotic distribution of the QST statistic under an arbitrary convex cone alternative is provided. The performance of the QST is investigated through Monte Carlo simulation experiments. Analysis of the polyposis data demonstrates that the QST outperforms the Wald test when data are highly correlated with a small sample size and there is a significant amount of missing data with a small number of clusters. The proposed test statistic accommodates both time-dependent and time-independent covariates in a model.