skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: High-Dimensional Quantile Regression: Convolution Smoothing and Concave Regularization
Abstract ℓ 1 -penalized quantile regression (QR) is widely used for analysing high-dimensional data with heterogeneity. It is now recognized that the ℓ1-penalty introduces non-negligible estimation bias, while a proper use of concave regularization may lead to estimators with refined convergence rates and oracle properties as the signal strengthens. Although folded concave penalized M-estimation with strongly convex loss functions have been well studied, the extant literature on QR is relatively silent. The main difficulty is that the quantile loss is piecewise linear: it is non-smooth and has curvature concentrated at a single point. To overcome the lack of smoothness and strong convexity, we propose and study a convolution-type smoothed QR with iteratively reweighted ℓ1-regularization. The resulting smoothed empirical loss is twice continuously differentiable and (provably) locally strongly convex with high probability. We show that the iteratively reweighted ℓ1-penalized smoothed QR estimator, after a few iterations, achieves the optimal rate of convergence, and moreover, the oracle rate and the strong oracle property under an almost necessary and sufficient minimum signal strength condition. Extensive numerical studies corroborate our theoretical results.  more » « less
Award ID(s):
2113409 1952373
PAR ID:
10398632
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Journal of the Royal Statistical Society Series B: Statistical Methodology
Volume:
84
Issue:
1
ISSN:
1369-7412
Format(s):
Medium: X Size: p. 205-233
Size(s):
p. 205-233
Sponsoring Org:
National Science Foundation
More Like this
  1. Regularized quantile regression (QR) is a useful technique for analyzing heterogeneous data under potentially heavy-tailed error contamination in high dimensions. This paper provides a new analysis of the estimation/prediction error bounds of the global solution of$$L_1$$-regularized QR (QR-LASSO) and the local solutions of nonconvex regularized QR (QR-NCP) when the number of covariates is greater than the sample size. Our results build upon and significantly generalize the earlier work in the literature. For certain heavy-tailed error distributions and a general class of design matrices, the least-squares-based LASSO cannot achieve the near-oracle rate derived under the normality assumption no matter the choice of the tuning parameter. In contrast, we establish that QR-LASSO achieves the near-oracle estimation error rate for a broad class of models under conditions weaker than those in the literature. For QR-NCP, we establish the novel results that all local optima within a feasible region have desirable estimation accuracy. Our analysis applies to not just the hard sparsity setting commonly used in the literature, but also to the soft sparsity setting which permits many small coefficients. Our approach relies on a unified characterization of the global/local solutions of regularized QR via subgradients using a generalized Karush–Kuhn–Tucker condition. The theory of the paper establishes a key property of the subdifferential of the quantile loss function in high dimensions, which is of independent interest for analyzing other high-dimensional nonsmooth problems. 
    more » « less
  2. Abstract A transformed primal-dual (TPD) flow is developed for a class of nonlinear smooth saddle point system. The flow for the dual variable contains a Schur complement which is strongly convex. Exponential stability of the saddle point is obtained by showing the strong Lyapunov property. Several TPD iterations are derived by implicit Euler, explicit Euler, implicit-explicit and Gauss-Seidel methods with accelerated overrelaxation of the TPD flow. Generalized to the symmetric TPD iterations, linear convergence rate is preserved for convex-concave saddle point systems under assumptions that the regularized functions are strongly convex. The effectiveness of augmented Lagrangian methods can be explained as a regularization of the non-strongly convexity and a preconditioning for the Schur complement. The algorithm and convergence analysis depends crucially on appropriate inner products of the spaces for the primal variable and dual variable. A clear convergence analysis with nonlinear inexact inner solvers is also developed. 
    more » « less
  3. Edge machine learning can deliver low-latency and private artificial intelligent (AI) services for mobile devices by leveraging computation and storage resources at the network edge. This paper presents an energy-efficient edge processing framework to execute deep learning inference tasks at the edge computing nodes whose wireless connections to mobile devices are prone to channel uncertainties. Aimed at minimizing the sum of computation and transmission power consumption with probabilistic quality-of-service (QoS) constraints, we formulate a joint inference tasking and downlink beamforming problem that is characterized by a group sparse objective function. We provide a statistical learning based robust optimization approach to approximate the highly intractable probabilistic-QoS constraints by nonconvex quadratic constraints, which are further reformulated as matrix inequalities with a rank-one constraint via matrix lifting. We design a reweighted power minimization approach by iteratively reweighted ℓ1 minimization with difference-of-convex-functions (DC) regularization and updating weights, where the reweighted approach is adopted for enhancing group sparsity whereas the DC regularization is designed for inducing rank-one solutions. Numerical results demonstrate that the proposed approach outperforms other state-of-the-art approaches. 
    more » « less
  4. We consider the problem of inferring the conditional independence graph (CIG) of high-dimensional Gaussian vectors from multi-attribute data. Most existing methods for graph estimation are based on single-attribute models where one associates a scalar random variable with each node. In multi-attribute graphical models, each node represents a random vector. In this paper we provide a unified theoretical analysis of multi-attribute graph learning using a penalized log-likelihood objective function. We consider both convex (sparse-group lasso) and non-convex (log-sum and SCAD group penalties) penalty/regularization functions. We establish sufficient conditions in a high-dimensional setting for consistency (convergence of the precision matrix to true value in the Frobenius norm), local convexity when using non-convex penalties, and graph recovery. We do not impose any incoherence or irrepresentability condition for our convergence results. 
    more » « less
  5. Estimation of the conditional independence graph (CIG) of high-dimensional multivariate Gaussian time series from multi-attribute data is considered. All existing methods for graph estimation for such data are based on single-attribute models where one associates a scalar time series with each node. In multi-attribute graphical models, each node represents a random vector or vector time series. In this paper we provide a unified theoretical analysis of multi-attribute graph learning for dependent time series using a penalized log-likelihood objective function. We consider both convex (sparse-group lasso) and non-convex (log-sum and SCAD group penalties) penalty/regularization functions. We establish sufficient conditions in a high-dimensional setting for consistency (convergence of the inverse power spectral density to true value in the Frobenius norm), local convexity when using non-convex penalties, and graph recovery. We illustrate our approach using numerical examples utilizing both synthetic and real data. 
    more » « less