We provide adaptive inference methods, based on $\ell _1$ regularization, for regular (semiparametric) and nonregular (nonparametric) linear functionals of the conditional expectation function. Examples of regular functionals include average treatment effects, policy effects, and derivatives. Examples of nonregular functionals include average treatment effects, policy effects, and derivatives conditional on a covariate subvector fixed at a point. We construct a Neyman orthogonal equation for the target parameter that is approximately invariant to small perturbations of the nuisance parameters. To achieve this property, we include the Riesz representer for the functional as an additional nuisance parameter. Our analysis yields weak ‘double sparsity robustness’: either the approximation to the regression or the approximation to the representer can be ‘completely dense’ as long as the other is sufficiently ‘sparse’. Our main results are nonasymptotic and imply asymptotic uniform validity over large classes of models, translating into honest confidence bands for both global and local parameters.
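The orthogonal score described above, which combines the regression with the Riesz representer of the functional, can be illustrated for the average treatment effect. The following is a minimal numpy sketch on synthetic data with the nuisances taken as known; the data-generating process and all names are hypothetical, not the paper's procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
X = rng.normal(size=n)
m = 1 / (1 + np.exp(-X))           # true propensity score
D = rng.binomial(1, m)
tau = 2.0                          # true average treatment effect
Y = tau * D + X + rng.normal(size=n)

# Conditional expectation g(d, x) = E[Y | D=d, X=x] = tau*d + x (known here)
g1, g0 = tau + X, X
# Riesz representer of the ATE functional of the regression
alpha = D / m - (1 - D) / (1 - m)

# Neyman-orthogonal score: plug-in term plus representer-weighted residual
psi = (g1 - g0) + alpha * (Y - np.where(D == 1, g1, g0))
ate_hat = psi.mean()
```

The score is invariant to first-order perturbations of either nuisance, which is what delivers the double robustness described in the abstract.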

This article surveys the development of nonparametric models and methods for estimation of choice models with nonlinear budget sets. The discussion focuses on the budget set regression, that is, the conditional expectation of a choice variable given the budget set. Utility maximization in a nonparametric model with general heterogeneity reduces the curse of dimensionality in this regression. Empirical results using this regression are different from maximum likelihood and give informative inference. The article also considers the information provided by kink probabilities for nonparametric utility with general heterogeneity. Instrumental variable estimation and the evidence it provides of heterogeneity in preferences are also discussed.

Shape restrictions have played a central role in economics as both testable implications of theory and sufficient conditions for obtaining informative counterfactual predictions. In this paper, we provide a general procedure for inference under shape restrictions in identified and partially identified models defined by conditional moment restrictions. Our test statistics and proposed inference methods are based on the minimum of the generalized method of moments (GMM) objective function with and without shape restrictions. Uniformly valid critical values are obtained through a bootstrap procedure that approximates a subset of the true local parameter space. In an empirical analysis of the effect of childbearing on female labor supply, we show that employing shape restrictions in linear instrumental variables (IV) models can lead to shorter confidence regions for both local and average treatment effects. Other applications we discuss include inference for the variability of quantile IV treatment effects and for bounds on average equivalent variation in a demand model with general heterogeneity.
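The restricted-minus-unrestricted GMM objective can be sketched in a one-parameter linear IV model with the sign restriction theta >= 0 standing in for a shape restriction. This is an illustrative toy, not the paper's test; the data-generating process is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
Z = rng.normal(size=n)                  # instrument
D = Z + rng.normal(size=n)              # endogenous regressor
theta0 = 0.5                            # true (nonnegative) effect
Y = theta0 * D + rng.normal(size=n)

def gmm_objective(theta):
    g = Z * (Y - theta * D)             # moment function: instrument * residual
    return n * g.mean() ** 2 / g.var()

theta_unres = (Z @ Y) / (Z @ D)         # unrestricted GMM / IV estimate
theta_res = max(theta_unres, 0.0)       # impose the restriction theta >= 0
test_stat = gmm_objective(theta_res) - gmm_objective(theta_unres)
```

The statistic is zero whenever the restriction does not bind; its critical values in the paper come from a bootstrap that approximates the local parameter space, which this sketch does not implement.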

Many causal and policy effects of interest are defined by linear functionals of high-dimensional or nonparametric regression functions. Root-n consistent and asymptotically normal estimation of the object of interest requires debiasing to reduce the effects of regularization and/or model selection on the object of interest. Debiasing is typically achieved by adding a correction term to the plug-in estimator of the functional, which leads to properties such as semiparametric efficiency, double robustness, and Neyman orthogonality. We implement an automatic debiasing procedure based on automatically learning the Riesz representation of the linear functional using Neural Nets and Random Forests. Our method only relies on black-box evaluation oracle access to the linear functional and does not require knowledge of its analytic form. We propose a multitasking Neural Net debiasing method with stochastic gradient descent minimization of a combined Riesz representer and regression loss, while sharing representation layers for the two functions. We also propose a Random Forest method which learns a locally linear representation of the Riesz function. Even though our method applies to arbitrary functionals, we experimentally find that it performs well compared to the state-of-the-art neural net based algorithm of Shi et al. (2019) for the case of the average treatment effect functional. We also evaluate our method on the problem of estimating average marginal effects with continuous treatments, using semi-synthetic data of gasoline price changes on gasoline demand. Code available at github.com/victor5as/RieszLearning.
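The automatic Riesz loss that the neural-net and forest learners minimize can be written as E[a(W)^2] - 2 E[m(W; a)], whose population minimizer is the representer itself. A minimal numpy sketch for the ATE functional with a linear function class in place of the paper's neural nets (constant propensity 0.5 so the truth lies in the class; all names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50000
X = rng.normal(size=n)
D = rng.binomial(1, 0.5, size=n).astype(float)   # constant propensity 0.5

def phi(d, x):
    # representer features: intercept, treatment, covariate, interaction
    return np.column_stack([np.ones_like(x), d, x, d * x])

# Automatic Riesz loss for the ATE functional m(W; a) = a(1, X) - a(0, X):
#   L(a) = E[a(D, X)^2] - 2 E[a(1, X) - a(0, X)]
A = phi(D, X)
G = A.T @ A / n
M = (phi(np.ones(n), X) - phi(np.zeros(n), X)).mean(axis=0)
b = np.linalg.solve(G, M)      # loss minimizer over the linear class
alpha_hat = A @ b              # learned representer
# with m(X) = 0.5 the true representer is D/0.5 - (1-D)/0.5 = 4*D - 2
```

Only evaluation of the functional on the features is needed, which is the black-box oracle property the abstract emphasizes.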

Many economic and causal parameters depend on nonparametric or high-dimensional first steps. We give a general construction of locally robust/orthogonal moment functions for GMM, where first steps have no effect, locally, on average moment functions. Using these orthogonal moments reduces model selection and regularization bias, as is important in many applications, especially for machine learning first steps. Also, associated standard errors are robust to misspecification when there is the same number of moment functions as parameters of interest. We use these orthogonal moments and cross-fitting to construct debiased machine learning estimators of functions of high-dimensional conditional quantiles and of dynamic discrete choice parameters with high-dimensional state variables. We show that additional first steps needed for the orthogonal moment functions have no effect, globally, on average orthogonal moment functions. We give a general approach to estimating those additional first steps. We characterize double robustness and give a variety of new doubly robust moment functions. We give general and simple regularity conditions for asymptotic theory.
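The combination of orthogonal moments with cross-fitting can be sketched for the ATE: nuisances are fit on one fold and the orthogonal score is evaluated on the held-out fold. This is a toy with OLS and a constant propensity model standing in for machine learning first steps; the data-generating process and names are hypothetical:

```python
import numpy as np

def cross_fit_ate(Y, D, X, n_folds=2, seed=0):
    """Cross-fitted orthogonal-moment estimate of the ATE; nuisances are
    fit by OLS / fold means as simple stand-ins for ML first steps."""
    n = len(Y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    psi = np.empty(n)
    for k in range(n_folds):
        tr, te = folds != k, folds == k

        def ols_predict(mask, x_new):
            A = np.column_stack([np.ones(mask.sum()), X[mask]])
            coef = np.linalg.lstsq(A, Y[mask], rcond=None)[0]
            return coef[0] + coef[1] * x_new

        g1 = ols_predict(tr & (D == 1), X[te])   # outcome regression, treated
        g0 = ols_predict(tr & (D == 0), X[te])   # outcome regression, control
        m = D[tr].mean()                          # propensity (constant model)
        psi[te] = (g1 - g0
                   + D[te] / m * (Y[te] - g1)
                   - (1 - D[te]) / (1 - m) * (Y[te] - g0))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n)

rng = np.random.default_rng(3)
n = 20000
X = rng.normal(size=n)
D = rng.binomial(1, 0.5, size=n)
Y = 1.5 * D + X + rng.normal(size=n)
ate_hat, se = cross_fit_ate(Y, D, X)
```

Because each score is evaluated on data not used to fit its nuisances, first-step overfitting does not bias the average moment.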

Debiased machine learning is a meta-algorithm based on bias correction and sample splitting to calculate confidence intervals for functionals, i.e., scalar summaries, of machine learning algorithms. For example, an analyst may seek the confidence interval for a treatment effect estimated with a neural network. We present a nonasymptotic debiased machine learning theorem that encompasses any global or local functional of any machine learning algorithm that satisfies a few simple, interpretable conditions. Formally, we prove consistency, Gaussian approximation, and semiparametric efficiency by finite-sample arguments. The rate of convergence is $n^{-1/2}$ for global functionals, and it degrades gracefully for local functionals. Our results culminate in a simple set of conditions that an analyst can use to translate modern learning theory rates into traditional statistical inference. The conditions reveal a general double robustness property for ill-posed inverse problems.
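Once the debiased scores are in hand, the confidence interval itself is elementary: mean of the scores plus or minus a normal critical value times the estimated standard error. A minimal sketch (the scores here are simulated placeholders, not output of any particular learner):

```python
import numpy as np

def dml_ci(psi, z=1.96):
    """Two-sided 95% confidence interval for a functional from its
    orthogonal scores: mean(psi) +/- z * sd(psi) / sqrt(n)."""
    n = len(psi)
    se = psi.std(ddof=1) / np.sqrt(n)
    return psi.mean() - z * se, psi.mean() + z * se

# illustrative unit-variance scores centered at a "true" value 2.0
scores = np.random.default_rng(4).normal(2.0, 1.0, size=100000)
lo, hi = dml_ci(scores)
```

The Gaussian approximation proved in the paper is what licenses the normal critical value at finite samples.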

Many causal and structural effects depend on regressions. Examples include policy effects, average derivatives, regression decompositions, average treatment effects, causal mediation, and parameters of economic structural models. The regressions may be high‐dimensional, making machine learning useful. Plugging machine learners into identifying equations can lead to poor inference due to bias from regularization and/or model selection. This paper gives automatic debiasing for linear and nonlinear functions of regressions. The debiasing is automatic in using Lasso and the function of interest without the full form of the bias correction. The debiasing can be applied to any regression learner, including neural nets, random forests, Lasso, boosting, and other high‐dimensional methods. In addition to providing the bias correction, we give standard errors that are robust to misspecification, convergence rates for the bias correction, and primitive conditions for asymptotic inference for a variety of estimators of structural and causal effects. Automatic debiased machine learning is used to estimate the average treatment effect on the treated for the NSW job training data and to estimate demand elasticities from Nielsen scanner data while allowing preferences to be correlated with prices and income.
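The "automatic with Lasso" idea can be sketched as an l1-penalized version of the Riesz loss, solved by cyclic coordinate descent with soft-thresholding. This is an illustrative toy in the ATE setting with constant propensity 0.5, not the paper's implementation; all names are hypothetical:

```python
import numpy as np

def auto_riesz_lasso(A, M, lam, n_iter=1000):
    """Minimize b'Gb - 2 M'b + 2*lam*||b||_1 with G = A'A/n by cyclic
    coordinate descent; A holds representer features, M the sample mean
    of the functional applied to each feature."""
    n, p = A.shape
    G = A.T @ A / n
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = M[j] - G[j] @ b + G[j, j] * b[j]   # partial residual
            b[j] = np.sign(r) * max(abs(r) - lam, 0.0) / G[j, j]
    return b

rng = np.random.default_rng(5)
n = 50000
X = rng.normal(size=n)
D = rng.binomial(1, 0.5, size=n).astype(float)
A = np.column_stack([np.ones(n), D, X, D * X])
# for the ATE, the functional applied to feature phi_j is phi_j(1,X) - phi_j(0,X)
M = (np.column_stack([np.ones(n), np.ones(n), X, X])
     - np.column_stack([np.ones(n), np.zeros(n), X, np.zeros(n)])).mean(axis=0)
b = auto_riesz_lasso(A, M, lam=0.01)
# true representer with propensity 0.5 is 4*D - 2, i.e. b close to (-2, 4, 0, 0)
```

Only the features and the functional evaluated on them enter the objective, so no analytic form of the bias correction is needed.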

High-dimensional linear models with endogenous variables play an increasingly important role in the recent econometric literature. In this work, we allow for models with many endogenous variables and make use of many instrumental variables to achieve identification. Because of the high-dimensionality in the structural equation, constructing honest confidence regions with asymptotically correct coverage is nontrivial. Our main contribution is to propose estimators and confidence regions that achieve this goal. Our approach relies on moment conditions that satisfy the usual instrument orthogonality condition but also have an additional orthogonality property with respect to specific linear combinations of the endogenous variables which are treated as nuisance parameters. We propose new pivotal procedures for estimating the high-dimensional nuisance parameters which appear in our formulation. We use a multiplier bootstrap procedure to compute critical values and establish its validity for achieving simultaneously valid confidence regions for a potentially high-dimensional set of endogenous variable coefficients.
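A common form of the multiplier bootstrap simulates the sup of Gaussian-multiplier-weighted score averages to obtain a simultaneous critical value. The following is a generic sketch of that form, not the paper's exact procedure; names and the example scores are hypothetical:

```python
import numpy as np

def multiplier_bootstrap_cv(scores, level=0.95, B=2000, seed=0):
    """Critical value for sup_j |n^{-1/2} sum_i e_i (psi_ij - mean_j)|
    with i.i.d. standard Gaussian multipliers e_i; scores is (n, p)."""
    rng = np.random.default_rng(seed)
    n, p = scores.shape
    centered = scores - scores.mean(axis=0)
    E = rng.standard_normal((B, n))                  # multiplier draws
    sups = np.abs(E @ centered / np.sqrt(n)).max(axis=1)
    return np.quantile(sups, level)

# sanity check: with a single unit-variance score (p = 1) the sup statistic
# is |N(0, 1)|, so the 95% critical value should be near 1.96
scores = np.random.default_rng(6).normal(size=(2000, 1))
cv = multiplier_bootstrap_cv(scores)
```

Simultaneity over many coefficients comes from taking the max over coordinates before the quantile, which is what yields uniformly valid regions.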

There are many economic parameters that depend on nonparametric first steps. Examples include games, dynamic discrete choice, average exact consumer surplus, and treatment effects. Often estimators of these parameters are asymptotically equivalent to a sample average of an object referred to as the influence function. The influence function is useful in local policy analysis, in evaluating local sensitivity of estimators, and in constructing debiased machine learning estimators. We show that the influence function is a Gateaux derivative with respect to a smooth deviation evaluated at a point mass. This result generalizes the classic Von Mises (1947) and Hampel (1974) calculation to estimators that depend on smooth nonparametric first steps. We give explicit influence functions for first steps that satisfy exogenous or endogenous orthogonality conditions. We use these results to generalize the omitted variable bias formula for regression to policy analysis for, and sensitivity to, structural changes. We apply this analysis and find no sensitivity to endogeneity of average equivalent variation estimates in a gasoline demand application.
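The Gateaux-derivative characterization can be checked numerically on a simple functional: mix the empirical distribution with a point mass at x and differentiate. A toy sketch for theta(F) = (E_F X)^2, whose influence function is 2*mu*(x - mu); all names are hypothetical:

```python
import numpy as np

def gateaux_if(theta, sample, x, t=1e-6):
    """Numerical Gateaux derivative of a functional theta(F) in the
    direction of a point mass at x, approximating the influence function:
    [theta((1-t) F_n + t delta_x) - theta(F_n)] / t."""
    mixed = np.concatenate([sample, [x]])
    w_mix = np.concatenate([np.full(len(sample), (1 - t) / len(sample)), [t]])
    w_base = np.concatenate([np.full(len(sample), 1 / len(sample)), [0.0]])
    return (theta(mixed, w_mix) - theta(mixed, w_base)) / t

# theta(F) = (E_F X)^2 as a weighted functional of the data
theta = lambda x, w: (w @ x) ** 2
sample = np.arange(1.0, 6.0)          # empirical mean mu = 3
if_at_5 = gateaux_if(theta, sample, 5.0)
# analytic influence function: 2 * mu * (x - mu) = 2 * 3 * (5 - 3) = 12
```

The paper's contribution is extending this calculation to estimators with smooth nonparametric first steps, where the point mass must itself be smoothed.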

Multidimensional heterogeneity and endogeneity are important features of models with multiple treatments. We consider a heterogeneous coefficients model where the outcome is a linear combination of dummy treatment variables, with each variable representing a different kind of treatment. We use control variables to give necessary and sufficient conditions for identification of average treatment effects. With mutually exclusive treatments we find that, provided the heterogeneous coefficients are mean independent from treatments given the controls, a simple identification condition is that the generalized propensity scores (Imbens, 2000) be bounded away from zero and that their sum be bounded away from one, with probability one. Our analysis extends to distributional and quantile treatment effects, as well as corresponding treatment effects on the treated. These results generalize the classical identification result of Rosenbaum & Rubin (1983) for binary treatments.
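The identification condition on the generalized propensity scores is simple enough to check directly on estimated scores. A minimal sketch (the function name, the tolerance eps, and the example arrays are hypothetical illustrations, not from the paper):

```python
import numpy as np

def gps_identification_holds(gps, eps=0.01):
    """Check the identification condition: every generalized propensity
    score at least eps and their sum at most 1 - eps, for every
    observation. gps is an (n, K) array of P(T_k = 1 | controls) for K
    mutually exclusive treatments, with the untreated state excluded."""
    gps = np.asarray(gps)
    return bool((gps >= eps).all() and (gps.sum(axis=1) <= 1 - eps).all())
```

For example, scores of (0.3, 0.4) satisfy the condition, a zero score fails the first bound, and scores summing above one fail the second.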