skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: The influence function of semiparametric estimators
There are many economic parameters that depend on nonparametric first steps. Examples include games, dynamic discrete choice, average exact consumer surplus, and treatment effects. Often estimators of these parameters are asymptotically equivalent to a sample average of an object referred to as the influence function. The influence function is useful in local policy analysis, in evaluating local sensitivity of estimators, and constructing debiased machine learning estimators. We show that the influence function is a Gateaux derivative with respect to a smooth deviation evaluated at a point mass. This result generalizes the classic Von Mises (1947) and Hampel (1974) calculation to estimators that depend on smooth nonparametric first steps. We give explicit influence functions for first steps that satisfy exogenous or endogenous orthogonality conditions. We use these results to generalize the omitted variable bias formula for regression to policy analysis for and sensitivity to structural changes. We apply this analysis and find no sensitivity to endogeneity of average equivalent variation estimates in a gasoline demand application.  more » « less
Award ID(s):
1757140
PAR ID:
10469513
Author(s) / Creator(s):
;
Publisher / Repository:
Journal of the Econometric Society
Date Published:
Journal Name:
Quantitative Economics
Volume:
13
Issue:
1
ISSN:
1759-7323
Page Range / eLocation ID:
29 to 61
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Imbens, G. (Ed.)
    Many economic and causal parameters depend on nonparametric or high dimensional first steps. We give a general construction of locally robust/orthogonal moment functions for GMM, where first steps have no effect, locally, on average moment functions. Using these orthogonal moments reduces model selection and regularization bias, as is important in many applications, especially for machine learning first steps. Also, associated standard errors are robust to misspecification when there is the same number of moment functions as parameters of interest. We use these orthogonal moments and cross-fitting to construct debiased machine learning estimators of functions of high dimensional conditional quantiles and of dynamic discrete choice parameters with high dimensional state variables. We show that additional first steps needed for the orthogonal moment functions have no effect, globally, on average orthogonal moment functions. We give a general approach to estimating those additional first steps.We characterize double robustness and give a variety of new doubly robust moment functions.We give general and simple regularity conditions for asymptotic theory. 
    more » « less
  2. Estimation of heterogeneous causal effects—that is, how effects of policies and treatments vary across subjects—is a fundamental task in causal inference. Many methods for estimating conditional average treatment effects (CATEs) have been proposed in recent years, but questions surrounding optimality have remained largely unanswered. In particular, a minimax theory of optimality has yet to be developed, with the minimax rate of convergence and construction of rate-optimal estimators remaining open problems. In this paper, we derive the minimax rate for CATE estimation, in a Hölder-smooth nonparametric model, and present a new local polynomial estimator, giving high-level conditions under which it is minimax optimal. Our minimax lower bound is derived via a localized version of the method of fuzzy hypotheses, combining lower bound constructions for nonparametric regression and functional estimation. Our proposed estimator can be viewed as a local polynomial R-Learner, based on a localized modification of higher-order influence function methods. The minimax rate we find exhibits several interesting features, including a nonstandard elbow phenomenon and an unusual interpolation between nonparametric regression and functional estimation rates. The latter quantifies how the CATE, as an estimand, can be viewed as a regression/functional hybrid. 
    more » « less
  3. We study Markov potential games under the infinite horizon average reward criterion. Most previous studies have been for discounted rewards. We prove that both algorithms based on independent policy gradient and independent natural policy gradient converge globally to a Nash equilibrium for the average reward criterion. To set the stage for gradient-based methods, we first establish that the average reward is a smooth function of policies and provide sensitivity bounds for the differential value functions, under certain conditions on ergodicity and the second largest eigenvalue of the underlying Markov decision process (MDP). We prove that three algorithms, policy gradient, proximal-Q, and natural policy gradient (NPG), converge to an ϵ-Nash equilibrium with time complexity O(1ϵ2), given a gradient/differential Q function oracle. When policy gradients have to be estimated, we propose an algorithm with ~O(1mins,aπ(a|s)δ) sample complexity to achieve δ approximation error w.r.t~the ℓ2 norm. Equipped with the estimator, we derive the first sample complexity analysis for a policy gradient ascent algorithm, featuring a sample complexity of ~O(1/ϵ5). Simulation studies are presented. 
    more » « less
  4. Baseband processing algorithms often require knowledge of the noise power, signal power, or signal-to-noise ratio (SNR). In practice, these parameters are typically unknown and must be estimated. Furthermore, the mean-square error (MSE) is a desirable metric to be minimized in a variety of estimation and signal recovery algorithms. However, the MSE cannot directly be used as it depends on the true signal that is generally unknown to the estimator. In this paper, we propose novel blind estimators for the average noise power, average receive signal power, SNR, and MSE. The proposed estimators can be computed at low complexity and solely rely on the large-dimensional and sparse nature of the processed data. Our estimators can be used (i) to quickly track some of the key system parameters while avoiding additional pilot overhead, (ii) to design low-complexity nonparametric algorithms that require such quantities, and (iii) to accelerate more sophisticated estimation or recovery algorithms. We conduct a theoretical analysis of the proposed estimators for a Bernoulli complex Gaussian (BCG) prior, and we demonstrate their efficacy via synthetic experiments. We also provide three application examples that deviate from the BCG prior in millimeter-wave multi-antenna and cell-free wireless systems for which we develop nonparametric denoising algorithms that improve channel-estimation accuracy with a performance comparable to denoisers that assume perfect knowledge of the system parameters. 
    more » « less
  5. We discuss a broad class of difference‐based estimators of the autocovariance function in a semiparametric regression model where the signal consists of the sum of a smooth function and another stepwise function whose number of jumps and locations are unknown (change points) while the errors are stationary and ‐dependent. We establish that the influence of the smooth part of the signal over the bias of our estimators is negligible; this is a general result as it does not depend on the distribution of the errors. We show that the influence of the unknown smooth function is negligible also in the mean squared error (MSE) of our estimators. Although we assumed Gaussian errors to derive the latter result, our finite sample studies suggest that the class of proposed estimators still show small MSE when the errors are not Gaussian. Our simulation study also demonstrates that, when the error process is mis‐specified as an AR instead of an ‐dependent process, our proposed method can estimate autocovariances about as well as some methods specifically designed for the AR(1) case, and sometimes even better than them. We also allow both the number of change points and the magnitude of the largest jump grow with the sample size . In this case, we provide conditions on the interplay between the growth rate of these two quantities as well as the vanishing rate of the modulus of continuity (of the signal's smooth part) that ensure consistency of our autocovariance estimators. As an application, we use our approach to provide a better understanding of the possible autocovariance structure of a time series of global averaged annual temperature anomalies. Finally, the R package dbacf complements this article. 
    more » « less