This paper presents finite‐sample efficiency bounds for the core econometric problem of estimating linear regression coefficients. We show that the classical Gauss–Markov theorem can be restated omitting the unnatural restriction to linear estimators, without adding any extra conditions. Our results are lower bounds on the variances of unbiased estimators. These lower bounds correspond to the variances of the least squares estimator and the generalized least squares estimator, depending on the assumption on the error covariances. These results show that we can drop the label “linear estimator” from the pedagogy of the Gauss–Markov theorem. Instead of referring to these estimators as BLUE, they can legitimately be called BUE (best unbiased estimators).
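As a rough illustration of the kind of bound described above, the statement can be written out in standard regression notation. The symbols below ($Y$, $X$, $\beta$, $e$, $\sigma^2$, $\Sigma$, $\tilde\beta$) are assumed here for exposition and are not taken from the abstract; the exact class of error distributions over which unbiasedness is required is specified in the paper itself.

```latex
% Assumed notation: linear model with conditional-mean-zero errors.
Y = X\beta + e, \qquad
\mathbb{E}[e \mid X] = 0, \qquad
\operatorname{var}(e \mid X) = \sigma^{2}\Sigma, \quad \Sigma \text{ known and positive definite}.

% Lower bound for any unbiased (not necessarily linear) estimator \tilde\beta,
% in the positive semidefinite ordering; the bound is the GLS variance.
\operatorname{var}(\tilde\beta \mid X) \;\succeq\; \sigma^{2}\,\bigl(X^{\top}\Sigma^{-1}X\bigr)^{-1}.

% Special case \Sigma = I_n: the bound is the least squares variance.
\operatorname{var}(\tilde\beta \mid X) \;\succeq\; \sigma^{2}\,\bigl(X^{\top}X\bigr)^{-1}.
```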
Covariate adjustment in multiarmed, possibly factorial experiments
Abstract Randomized experiments are the gold standard for causal inference and enable unbiased estimation of treatment effects. Regression adjustment provides a convenient way to incorporate covariate information for additional efficiency. This article provides a unified account of its utility for improving estimation efficiency in multiarmed experiments. We start with the commonly used additive and fully interacted models for regression adjustment in estimating average treatment effects (ATE), and clarify the trade-offs between the resulting ordinary least squares (OLS) estimators in terms of finite sample performance and asymptotic efficiency. We then move on to regression adjustment based on restricted least squares (RLS), and establish for the first time its properties for inferring ATE from the design-based perspective. The resulting inference has multiple guarantees. First, it is asymptotically efficient when the restriction is correctly specified. Second, it remains consistent as long as the restriction on the coefficients of the treatment indicators, if any, is correctly specified and separate from that on the coefficients of the treatment-covariate interactions. Third, it can have better finite sample performance than the unrestricted counterpart even when the restriction is moderately misspecified. It is thus our recommendation when the OLS fit of the fully interacted regression risks large finite sample variability in case of many covariates, many treatments, yet a moderate sample size. In addition, the newly established theory of RLS also provides a unified way of studying OLS-based inference from general regression specifications. As an illustration, we demonstrate its value for studying OLS-based regression adjustment in factorial experiments. Importantly, although we analyse inferential procedures that are motivated by OLS, we do not invoke any assumptions required by the underlying linear models.
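The additive and fully interacted specifications mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `adjusted_arm_means` and its arguments are hypothetical, and it assumes the common formulation in which covariates are centered at their full-sample means and the regression has one indicator per arm and no intercept, so that the indicator coefficients estimate the arm-level mean outcomes. Contrasts of those coefficients then estimate average treatment effects.

```python
import numpy as np


def adjusted_arm_means(y, arm, X, interacted=True):
    """Regression-adjusted estimates of arm-level mean outcomes (hypothetical helper).

    y          : (n,) outcomes
    arm        : (n,) treatment-arm labels
    X          : (n, p) covariates
    interacted : True  -> arm indicators plus arm-by-covariate interactions
                 False -> arm indicators plus covariates with common slopes
    """
    y = np.asarray(y, dtype=float)
    arm = np.asarray(arm)
    X = np.asarray(X, dtype=float)

    # Center covariates at full-sample means (assumed convention).
    Xc = X - X.mean(axis=0)

    # One indicator column per arm; no separate intercept.
    levels = np.unique(arm)
    D = (arm[:, None] == levels[None, :]).astype(float)

    if interacted:
        # Arm-specific slopes: each indicator multiplied by the centered covariates.
        inter = np.concatenate(
            [D[:, [q]] * Xc for q in range(len(levels))], axis=1
        )
        design = np.hstack([D, inter])
    else:
        # Additive model: slopes shared across arms.
        design = np.hstack([D, Xc])

    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    # The first len(levels) coefficients are the adjusted arm means.
    return levels, coef[: len(levels)]
```

For example, `levels, means = adjusted_arm_means(y, arm, X)` followed by `means[1] - means[0]` gives the adjusted contrast between the second and first arms. The interacted fit estimates one slope vector per arm, which is the source of the finite-sample variability the abstract warns about when there are many covariates and many arms.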
- Award ID(s):
- 1945136
- PAR ID:
- 10393773
- Publisher / Repository:
- Oxford University Press
- Date Published:
- Journal Name:
- Journal of the Royal Statistical Society Series B: Statistical Methodology
- Volume:
- 85
- Issue:
- 1
- ISSN:
- 1369-7412
- Page Range / eLocation ID:
- p. 1-23
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
In this article, we introduce the package binsreg, which implements the binscatter methods developed by Cattaneo et al. (2024a, arXiv:2407.15276 [stat.EM]; 2024b, American Economic Review 114: 1488–1514). The package comprises seven commands: binsreg, binslogit, binsprobit, binsqreg, binstest, binspwc, and binsregselect. The first four commands implement binscatter plotting, point estimation, and uncertainty quantification (confidence intervals and confidence bands) for least-squares linear binscatter regression (binsreg) and for nonlinear binscatter regression (binslogit for logit regression, binsprobit for probit regression, and binsqreg for quantile regression). The next two commands focus on pointwise and uniform inference: binstest implements hypothesis testing procedures for parametric specifications and for nonparametric shape restrictions of the unknown regression function, while binspwc implements multigroup pairwise statistical comparisons. The last command, binsregselect, implements data-driven number-of-bins selectors. The commands offer binned scatterplots and allow for covariate adjustment, weighting, clustering, and multisample analysis, which is useful when studying treatment-effect heterogeneity in randomized and observational studies, among many other features.
-
Summary In two influential contributions, Rosenbaum (2005, 2020a) advocated for using the distances between componentwise ranks, instead of the original data values, to measure covariate similarity when constructing matching estimators of average treatment effects. While the intuitive benefits of using covariate ranks for matching estimation are apparent, there is no theoretical understanding of such procedures in the literature. We fill this gap by demonstrating that Rosenbaum’s rank-based matching estimator, when coupled with a regression adjustment, enjoys the properties of double robustness and semiparametric efficiency without the need to enforce restrictive covariate moment assumptions. Our theoretical findings further emphasize the statistical virtues of employing ranks for estimation and inference, more broadly aligning with the insights put forth by Peter Bickel in his 2004 Rietz lecture.
-
Abstract Understanding treatment effect heterogeneity is vital to many scientific fields because the same treatment may affect different individuals differently. Quantile regression provides a natural framework for modelling such heterogeneity. We propose a new method for inference on heterogeneous quantile treatment effects (HQTE) in the presence of high-dimensional covariates. Our estimator combines an ℓ1-penalised regression adjustment with a quantile-specific bias correction scheme based on rank scores. We study the theoretical properties of this estimator, including weak convergence and semi-parametric efficiency of the estimated HQTE process. We illustrate the finite-sample performance of our approach through simulations and an empirical example, dealing with the differential effect of statin usage for lowering low-density lipoprotein cholesterol levels for the Alzheimer’s disease patients who participated in the UK Biobank study.
-
Abstract Cluster-randomized experiments are widely used due to their logistical convenience and policy relevance. To analyse them properly, we must address the fact that the treatment is assigned at the cluster level instead of the individual level. Standard analytic strategies are regressions based on individual data, cluster averages and cluster totals, which differ when the cluster sizes vary. These methods are often motivated by models with strong and unverifiable assumptions, and the choice among them can be subjective. Without any outcome modelling assumption, we evaluate these regression estimators and the associated robust standard errors from the design-based perspective where only the treatment assignment itself is random and controlled by the experimenter. We demonstrate that regression based on cluster averages targets a weighted average treatment effect, regression based on individual data is suboptimal in terms of efficiency and regression based on cluster totals is consistent and more efficient with a large number of clusters. We highlight the critical role of covariates in improving estimation efficiency and illustrate the efficiency gain via both simulation studies and data analysis. The asymptotic analysis also reveals the efficiency-robustness trade-off by comparing the properties of various estimators using data at different levels with and without covariate adjustment. Moreover, we show that the robust standard errors are convenient approximations to the true asymptotic standard errors under the design-based perspective. Our theory holds even when the outcome models are misspecified, so it is model-assisted rather than model-based. We also extend the theory to a wider class of weighted average treatment effects.
