Many causal and structural effects depend on regressions. Examples include policy effects, average derivatives, regression decompositions, average treatment effects, causal mediation, and parameters of economic structural models. The regressions may be high‐dimensional, making machine learning useful. Plugging machine learners into identifying equations can lead to poor inference due to bias from regularization and/or model selection. This paper gives automatic debiasing for linear and nonlinear functions of regressions. The debiasing is automatic in that it uses only Lasso and the function of interest, without requiring the full form of the bias correction. The debiasing can be applied to any regression learner, including neural nets, random forests, Lasso, boosting, and other high‐dimensional methods. In addition to providing the bias correction, we give standard errors that are robust to misspecification, convergence rates for the bias correction, and primitive conditions for asymptotic inference for a variety of estimators of structural and causal effects. The automatic debiased machine learning is used to estimate the average treatment effect on the treated for the NSW job training data and to estimate demand elasticities from Nielsen scanner data while allowing preferences to be correlated with prices and income.
L1‐Penalized Multinomial Regression: Estimation, Inference, and Prediction, With an Application to Risk Factor Identification for Different Dementia Subtypes
High‐dimensional multinomial regression models are very useful in practice but have received less research attention than logistic regression models, especially from the perspective of statistical inference. In this work, we analyze the estimation and prediction error of the contrast‐based L1‐penalized multinomial regression model and extend the debiasing method to the multinomial case, providing a valid confidence interval for each coefficient and a p‐value for each individual hypothesis test. We also examine cases of model misspecification and non‐identically distributed data to demonstrate the robustness of our method when some assumptions are violated. We apply the debiasing method to identify important predictors in the progression into dementia of different subtypes. Results from extensive simulations show the superiority of the debiasing method compared to other inference methods.
- Award ID(s): 2324489
- PAR ID: 10609614
- Publisher / Repository: Wiley
- Date Published:
- Journal Name: Statistics in Medicine
- Volume: 43
- Issue: 30
- ISSN: 0277-6715
- Page Range / eLocation ID: 5711 to 5747
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
When trained on biased datasets, Deep Neural Networks (DNNs) often make predictions based on attributes derived from features spuriously correlated with the target labels. This is especially problematic if these irrelevant features are easier for the model to learn than the truly relevant ones. Many existing approaches, called debiasing methods, have been proposed to address this issue, but they often require predefined bias labels and entail significantly increased computational complexity by incorporating extra auxiliary models. Instead, we offer a perspective orthogonal to existing approaches, inspired by cognitive science, specifically Global Workspace Theory (GWT). Our method, Debiasing Global Workspace (DGW), is a novel debiasing framework that consists of specialized modules and a shared workspace, allowing for increased modularity and improved debiasing performance. Additionally, DGW enhances the transparency of decision-making processes by visualizing which features of the inputs the model focuses on during training and inference through attention masks. We begin by proposing an instantiation of GWT for the debiasing method. We then outline the implementation of each component within DGW. Finally, we validate our method across various biased datasets, demonstrating its effectiveness in mitigating biases and improving model performance.
-
Statistical inference for high-dimensional regression models has been extensively studied because of its wide applications, ranging from genomics and neuroscience to economics. However, in practice, there are often potential unmeasured confounders associated with both the response and covariates, which can invalidate standard debiasing methods. This paper focuses on a generalized linear regression framework with hidden confounding and proposes a debiasing approach to address this high-dimensional problem by adjusting for the effects induced by the unmeasured confounders. We establish consistency and asymptotic normality for the proposed debiased estimator. The finite sample performance of the proposed method is demonstrated through extensive numerical studies and an application to a genetic data set.
-
Many causal and policy effects of interest are defined by linear functionals of high-dimensional or non-parametric regression functions. Root-n consistent and asymptotically normal estimation of the object of interest requires debiasing to reduce the effects of regularization and/or model selection on the object of interest. Debiasing is typically achieved by adding a correction term to the plug-in estimator of the functional, which leads to properties such as semi-parametric efficiency, double robustness, and Neyman orthogonality. We implement an automatic debiasing procedure based on automatically learning the Riesz representation of the linear functional using Neural Nets and Random Forests. Our method relies only on black-box evaluation oracle access to the linear functional and does not require knowledge of its analytic form. We propose a multitasking Neural Net debiasing method with stochastic gradient descent minimization of a combined Riesz representer and regression loss, while sharing representation layers for the two functions. We also propose a Random Forest method which learns a locally linear representation of the Riesz function. Even though our method applies to arbitrary functionals, we experimentally find that it performs well compared to the state-of-the-art neural-net-based algorithm of Shi et al. (2019) for the case of the average treatment effect functional. We also evaluate our method on the problem of estimating average marginal effects with continuous treatments, using semi-synthetic data on the effect of gasoline price changes on gasoline demand. Code available at github.com/victor5as/RieszLearning.
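The correction-term idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' Neural Net or Random Forest method, and it is not code from the linked repository: as a simplifying assumption, both the regression and the Riesz representer are taken to be linear in a small hypothetical dictionary b(d, x) = [1, d, x, d*x], the regression learner is a ridge fit (standing in for any regularized learner), and the data are simulated with a true average treatment effect of 2. The function names `dictionary` and `auto_debiased_ate` are illustrative only. The functional enters solely through evaluations m(W; b) = b(1, X) - b(0, X) on the dictionary, which is the sense in which the procedure is automatic.

```python
import numpy as np


def dictionary(d, x):
    """Hypothetical basis b(d, x) = [1, d, x, d*x] (an assumption of this sketch)."""
    return np.column_stack([np.ones_like(x), d, x, d * x])


def auto_debiased_ate(y, d, x, ridge=0.5):
    """Plug-in and debiased ATE estimates using a linear Riesz representer."""
    n = len(y)
    B = dictionary(d, x)
    # Functional applied to each basis function: m(W; b) = b(1, X) - b(0, X).
    m_B = dictionary(np.ones_like(x), x) - dictionary(np.zeros_like(x), x)
    M = m_B.mean(axis=0)
    G = B.T @ B / n

    # Regularized regression learner (ridge), deliberately biased by shrinkage.
    beta = np.linalg.solve(G + ridge * np.eye(B.shape[1]), B.T @ y / n)

    # Riesz loss: minimize rho' G rho - 2 M' rho, whose minimizer is G^{-1} M.
    rho = np.linalg.solve(G, M)

    gamma_hat = B @ beta   # fitted regression at observed (d, x)
    alpha_hat = B @ rho    # estimated Riesz representer at observed (d, x)

    plug_in = M @ beta
    # Plug-in term plus correction term alpha_hat * (y - gamma_hat).
    psi = m_B @ beta + alpha_hat * (y - gamma_hat)
    debiased = psi.mean()
    # Rough standard error from the sample variance of the summands.
    se = psi.std(ddof=1) / np.sqrt(n)
    return plug_in, debiased, se


# Simulated data (assumption): confounded treatment, true ATE equal to 2.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
propensity = 1.0 / (1.0 + np.exp(-0.5 * x))
d = (rng.uniform(size=n) < propensity).astype(float)
y = 2.0 * d + x + rng.normal(size=n)

plug_in, debiased, se = auto_debiased_ate(y, d, x)
print(f"plug-in: {plug_in:.3f}  debiased: {debiased:.3f}  (se {se:.3f})")
```

In this simplified setting the ridge plug-in is pulled toward zero by shrinkage, while the correction term removes that regularization bias. A penalized or nonlinear Riesz learner, as in the abstract, would replace the closed-form solve for `rho`.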
-
We propose a semiparametric Bayesian methodology for estimating the average treatment effect (ATE) within the potential outcomes framework using observational data with high-dimensional nuisance parameters. Our method introduces a Bayesian debiasing procedure that corrects for bias arising from nuisance estimation and employs a targeted modeling strategy based on summary statistics rather than the full data. These summary statistics are identified in a debiased manner, enabling the estimation of nuisance bias via weighted observables and facilitating hierarchical learning of the ATE. By combining debiasing with sample splitting, our approach separates nuisance estimation from inference on the target parameter, reducing sensitivity to nuisance model specification. We establish that, under mild conditions, the marginal posterior for the ATE satisfies a Bernstein-von Mises theorem when both nuisance models are correctly specified and remains consistent and robust when only one is correct, achieving Bayesian double robustness. This ensures asymptotic efficiency and frequentist validity. Extensive simulations confirm the theoretical results, demonstrating accurate point estimation and credible intervals with nominal coverage, even in high-dimensional settings. The proposed framework can also be extended to other causal estimands, and its key principles offer a general foundation for advancing Bayesian semiparametric inference more broadly.

