Optimal subsampling for quantile regression in big data

Wang, Haiying; Ma, Yanyuan

doi:10.1093/biomet/asaa043

Summary: We investigate optimal subsampling for quantile regression. We derive the asymptotic distribution of a general subsampling estimator and then derive two versions of optimal subsampling probabilities. One version minimizes the trace of the asymptotic variance-covariance matrix for a linearly transformed parameter estimator, and the other minimizes that of the original parameter estimator. The former does not depend on the densities of the responses given the covariates and is easy to implement. Algorithms based on the optimal subsampling probabilities are proposed, and the asymptotic distributions and asymptotic optimality of the resulting estimators are established. Furthermore, we propose an iterative subsampling procedure, based on the optimal subsampling probabilities for the linearly transformed parameter estimator, that scales well and can make use of available computational resources. In addition, this procedure yields standard errors for the parameter estimators without estimating the densities of the responses given the covariates. We provide numerical examples based on both simulated and real data to illustrate the proposed method.
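The two-step idea described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the density-free probabilities take the form π_i ∝ |τ − I(y_i − x_iᵀβ < 0)| ‖x_i‖, evaluated at a pilot estimate from a uniform subsample, and it uses only the second-stage subsample for the final fit. The quantile regression solver is a simple iteratively reweighted least squares (IRLS) approximation, swapped in here for a proper linear-programming solver. The function names (`irls_quantile_fit`, `two_step_subsample`), the sample sizes, and the simulated data are all hypothetical.

```python
import numpy as np

def irls_quantile_fit(X, y, tau, weights=None, n_iter=100, eps=1e-6):
    """Approximate (weighted) quantile regression by IRLS.

    This is a stand-in for an exact LP-based solver, not the paper's method.
    """
    n = X.shape[0]
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.mean()  # rescale for numerical stability; the minimizer is unchanged
    sw = np.sqrt(w)
    # Start from the weighted least-squares fit.
    beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    for _ in range(n_iter):
        resid = y - X @ beta
        # Check loss |tau - 1{r<0}| * |r| written as a weighted squared residual.
        check = np.abs(tau - (resid < 0).astype(float))
        irls_w = w * check / np.maximum(np.abs(resid), eps)
        XtW = X.T * irls_w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < 1e-8
        beta = beta_new
        if converged:
            break
    return beta

def two_step_subsample(X, y, tau, r0, r, rng):
    """Pilot fit on a uniform subsample, then a weighted fit on a
    subsample drawn with the assumed density-free probabilities."""
    n = X.shape[0]
    # Step 1: uniform pilot subsample of size r0.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta_pilot = irls_quantile_fit(X[idx0], y[idx0], tau)
    # Step 2: probabilities proportional to |tau - 1{resid<0}| * ||x_i||.
    resid = y - X @ beta_pilot
    score = np.abs(tau - (resid < 0).astype(float)) * np.linalg.norm(X, axis=1)
    probs = score / score.sum()
    # Step 3: sample with replacement and fit with inverse-probability weights.
    idx = rng.choice(n, size=r, replace=True, p=probs)
    beta_hat = irls_quantile_fit(X[idx], y[idx], tau, weights=1.0 / probs[idx])
    return beta_hat, probs

# Simulated data (hypothetical): median regression with true coefficients (1, 2).
rng = np.random.default_rng(0)
n = 20000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(size=n)

beta_hat, probs = two_step_subsample(X, y, tau=0.5, r0=500, r=2000, rng=rng)
print(beta_hat)
```

The inverse-probability weights are what keep the subsample estimator targeting the full-data quantile regression estimator even though the sampling is non-uniform. A production implementation would replace the IRLS routine with an exact quantile regression solver.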

Award ID(s): 1812013

PAR ID: 10274019

Author(s) / Creator(s): Wang, Haiying; Ma, Yanyuan

Date Published: 2020-07-21

Journal Name: Biometrika

Volume: 108

Issue: 1

ISSN: 0006-3444

Page Range / eLocation ID: 99-112

Sponsoring Org: National Science Foundation

Journal Article: https://doi.org/10.1093/biomet/asaa043