Centring is a commonly used technique in linear regression analysis. With both the responses and the covariates centred, the ordinary least squares estimator of the slope parameter can be computed from a model without an intercept. If a subsample is selected from centred full data, however, the subsample is typically no longer centred. Is it still appropriate, in this case, to fit a model without an intercept? The answer is yes: we show that the least squares estimator of the slope parameter obtained from a model without an intercept is unbiased, and that its variance-covariance matrix is smaller, in the Loewner order, than that of the estimator obtained from a model with an intercept. We further show that, for noninformative weighted subsampling with a weighted least squares estimator, relocating the subsample by the full-data weighted means improves estimation efficiency.
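A minimal simulation sketch of the uniform-subsampling claim (not the authors' code; the data-generating model, sample sizes, and seed below are illustrative assumptions): centre the full data, draw a uniform subsample without recentring it, and compare the no-intercept slope estimator with the with-intercept one over Monte Carlo replications.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, reps = 2_000, 100, 5_000   # full-data size, subsample size, replications

slopes_no_int, slopes_int = [], []
for _ in range(reps):
    x = rng.normal(2.0, 1.0, n)
    y = 1.0 + 3.0 * x + rng.normal(0.0, 1.0, n)
    xc, yc = x - x.mean(), y - y.mean()       # centre the full data
    idx = rng.choice(n, r, replace=False)     # uniform subsample (not centred)
    xs, ys = xc[idx], yc[idx]
    # Model without intercept: slope = (x'y) / (x'x) on the raw subsample
    slopes_no_int.append(xs @ ys / (xs @ xs))
    # Model with intercept: equivalent to recentring the subsample first
    xss, yss = xs - xs.mean(), ys - ys.mean()
    slopes_int.append(xss @ yss / (xss @ xss))

print("mean (no intercept):", np.mean(slopes_no_int))  # both close to 3: unbiased
print("mean (intercept):   ", np.mean(slopes_int))
print("var  (no intercept):", np.var(slopes_no_int))   # no-intercept variance is
print("var  (intercept):   ", np.var(slopes_int))      # smaller, as the paper claims
```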
- Award ID(s): 1656123
- PAR ID: 10337626
- Date Published:
- Journal Name: Econometrica
- Volume: 90
- Issue: 3
- ISSN: 0012-9682
- Page Range / eLocation ID: 1283 to 1294
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Least squares estimators, when trained on a few target domain samples, may predict poorly. Supervised domain adaptation aims to improve the predictive accuracy by exploiting additional labeled training samples from a source distribution that is close to the target distribution. Given available data, we investigate novel strategies to synthesize a family of least squares estimator experts that are robust with regard to moment conditions. When these moment conditions are specified using Kullback-Leibler or Wasserstein-type divergences, we can find the robust estimators efficiently using convex optimization. We use the Bernstein online aggregation algorithm on the proposed family of robust experts to generate predictions for the sequential stream of target test samples. Numerical experiments on real data show that the robust strategies systematically outperform non-robust interpolations of the empirical least squares estimators.
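The paper's experts solve KL/Wasserstein-robust convex programs; as a stand-in, the sketch below aggregates ridge estimators with different regularization strengths, using a simplified exponential-weighting step in place of Bernstein online aggregation. All names, constants, and the data model are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def ridge(X, y, lam):
    """Ridge fit: stand-in for the paper's robust experts."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

X_train = rng.normal(size=(50, 5))
beta_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y_train = X_train @ beta_true + rng.normal(size=50)

lams = [0.01, 0.1, 1.0, 10.0]
experts = [ridge(X_train, y_train, lam) for lam in lams]

# Exponentially weighted aggregation over the sequential test stream
# (a simplified form of online aggregation; BOA refines the weight update).
eta = 0.1
w = np.ones(len(experts)) / len(experts)
for _ in range(200):
    x = rng.normal(size=5)
    y = x @ beta_true + rng.normal()
    preds = np.array([x @ b for b in experts])
    y_hat = w @ preds                      # aggregated prediction
    w *= np.exp(-eta * (preds - y) ** 2)   # penalize each expert's square loss
    w /= w.sum()

print("final expert weights:", np.round(w, 3))
```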
-
We consider the problem of recovering a rank-one matrix when a perturbed subset of its entries is revealed. We propose a method based on least squares in the log-space and show its performance matches the lower bounds that we derive for this problem in the small-perturbation regime, which are related to the spectral gap of a graph representing the revealed entries. Unfortunately, we show that for larger disturbances, potentially exponentially growing errors are unavoidable for any consistent recovery method. We then propose a second algorithm relying on encoding the matrix factorization in the stationary distribution of a certain Markov chain. We show that, under the stronger assumption of known upper and lower bounds on the entries of the true matrix, this second method does not have exponential error growth for large disturbances. Both algorithms can be implemented in nearly linear time.
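A minimal sketch of the log-space least squares idea under illustrative assumptions (positive entries, small multiplicative noise; dimensions and noise level are made up): each revealed entry M_ij ≈ u_i v_j gives one linear equation log M_ij = log u_i + log v_j, yielding a sparse least squares problem. The additive gauge ambiguity (a_i + c, b_j − c) cancels in the reconstructed product.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(2)

# Illustrative instance: positive rank-one matrix with multiplicative noise
m, p = 30, 40
u, v = rng.uniform(0.5, 2.0, m), rng.uniform(0.5, 2.0, p)
M = np.outer(u, v)

# Reveal a random subset of entries, each slightly perturbed
mask = rng.random((m, p)) < 0.3
rows, cols = np.nonzero(mask)
obs = M[rows, cols] * np.exp(0.01 * rng.normal(size=rows.size))

# Least squares in log-space: log M_ij = a_i + b_j, a = log u, b = log v.
# Each observation contributes a row with two ones in the design matrix.
k = rows.size
data = np.ones(2 * k)
r_idx = np.repeat(np.arange(k), 2)
c_idx = np.column_stack([rows, m + cols]).ravel()
A = coo_matrix((data, (r_idx, c_idx)), shape=(k, m + p))
sol = lsqr(A, np.log(obs))[0]   # minimum-norm solution fixes the gauge

M_hat = np.outer(np.exp(sol[:m]), np.exp(sol[m:]))
print("relative error:", np.linalg.norm(M_hat - M) / np.linalg.norm(M))
```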
-
We study the non-negative garrotte estimator from three different aspects: consistency, computation and flexibility. We argue that the non-negative garrotte is a general procedure that can be combined with initial estimators other than the ordinary least squares estimator used in its original form. In particular, we consider using the lasso, the elastic net and ridge regression, along with ordinary least squares, as the initial estimate in the non-negative garrotte. We prove that the non-negative garrotte has the nice property that, with probability tending to 1, the solution path contains an estimate that correctly identifies the set of important variables and is consistent for the coefficients of the important variables, whereas such a property may not hold for the initial estimators. In general, we show that the non-negative garrotte can turn an estimate that is consistent only in terms of estimation into one that is consistent in terms of both estimation and variable selection. We also show that the non-negative garrotte has a piecewise-linear solution path. Using this fact, we propose an efficient algorithm for computing the whole solution path of the non-negative garrotte. Simulations and a real example demonstrate that the non-negative garrotte is very effective in improving on the initial estimator in terms of variable selection and estimation accuracy.
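A minimal sketch of the garrotte itself, assuming the Lagrangian form solved by plain coordinate descent (the paper instead exploits the piecewise-linear path for efficiency; the function name, penalty level, and data below are illustrative):

```python
import numpy as np

def nn_garrotte(X, y, beta_init, lam, n_iter=200):
    """Non-negative garrotte: shrink an initial estimate beta_init by
    non-negative factors d, solving
        min_d 0.5 * ||y - Z d||^2 + lam * sum(d),  d >= 0,
    where Z[:, j] = X[:, j] * beta_init[j], via coordinate descent."""
    Z = X * beta_init                 # scale each column by its initial coefficient
    d = np.ones(X.shape[1])
    col_sq = (Z ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            r = y - Z @ d + Z[:, j] * d[j]           # partial residual excluding j
            d[j] = max(0.0, (Z[:, j] @ r - lam) / col_sq[j])
    return d * beta_init              # garrotte estimate of the coefficients

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 6))
beta = np.array([3.0, 0.0, -2.0, 0.0, 1.5, 0.0])
y = X @ beta + rng.normal(size=100)

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)     # OLS as the initial estimator
print(np.round(nn_garrotte(X, y, beta_ols, lam=5.0), 2))  # zeroes the noise coefs
```

With a lasso, elastic-net, or ridge fit substituted for `beta_ols`, the same shrinkage step yields the other initial-estimator variants the abstract discusses.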