

Title: Non‐convex penalized multitask regression using data depth‐based penalties

We propose a new class of non‐convex penalties based on data depth functions for multitask sparse penalized regression. These penalties quantify the relative position of rows of the coefficient matrix with respect to a fixed distribution centred at the origin. We derive the theoretical properties of an approximate one‐step sparse estimator of the coefficient matrix using local linear approximation of the penalty function and provide an algorithm for its computation. For an orthogonal design and independent responses, the resulting thresholding rule enjoys near‐minimax optimal risk performance, similar to the adaptive lasso (Zou, H. (2006), ‘The adaptive lasso and its oracle properties’, Journal of the American Statistical Association, 101, 1418–1429). A simulation study and real data analysis demonstrate its effectiveness compared with some existing methods that provide sparse solutions in multitask regression. Copyright © 2018 John Wiley & Sons, Ltd.

 
NSF-PAR ID: 10053612
Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
Journal Name: Stat
Volume: 7
Issue: 1
ISSN: 2049-1573
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. We develop a Bayesian methodology aimed at simultaneously estimating low-rank and row-sparse matrices in a high-dimensional multiple-response linear regression model. We consider a carefully devised shrinkage prior on the matrix of regression coefficients which obviates the need to specify a prior on the rank, and shrinks the regression matrix towards low-rank and row-sparse structures. We provide theoretical support to the proposed methodology by proving minimax optimality of the posterior mean under the prediction risk in ultra-high-dimensional settings where the number of predictors can grow subexponentially relative to the sample size. A one-step post-processing scheme induced by group lasso penalties on the rows of the estimated coefficient matrix is proposed for variable selection, with default choices of tuning parameters. We additionally provide an estimate of the rank using a novel optimization function achieving dimension reduction in the covariate space. We exhibit the performance of the proposed methodology in an extensive simulation study and a real data example.
  2. Quantiles and expectiles have been receiving much attention in many areas such as economics, ecology, and finance. By means of Lp optimization, both quantiles and expectiles can be embedded in a more general class of M‐quantiles. Inspired by this point of view, we propose a generalized regression called Lp‐quantile regression to study the whole conditional distribution of a response variable given predictors in a heterogeneous regression setting. In this article, we focus on the variable selection aspect of high‐dimensional penalized Lp‐quantile regression, which provides a flexible application and complements penalized quantile and expectile regressions. This generalized penalized Lp‐quantile regression steers an advantageous middle course between ordinary penalized quantile and expectile regressions without sacrificing their virtues too much when 1 < p < 2, that is, it offers versatility and flexibility with these ‘quantile‐like’ and robustness properties. We develop the penalized Lp‐quantile regression with SCAD and adaptive lasso penalties. With properly chosen tuning parameters, we show that the proposed estimators display oracle properties. Numerical studies and real data analysis demonstrate the competitive performance of the proposed penalized Lp‐quantile regression when 1 < p < 2, and they combine the robustness properties of quantile regression with the efficiency of penalized expectile regression. These properties would be helpful for practitioners.
  3. We consider the problem of estimating the structure of an undirected weighted sparse graphical model of multivariate data under the assumption that the underlying distribution is multivariate totally positive of order 2, or equivalently, all partial correlations are non-negative. Total positivity holds in several applications. The problem of Gaussian graphical model learning has been widely studied without the total positivity assumption where the problem can be formulated as estimation of the sparse precision matrix that encodes conditional dependence between random variables associated with the graph nodes. An approach that imposes total positivity is to assume that the precision matrix obeys the Laplacian constraints which include constraining the off-diagonal elements of the precision matrix to be non-positive. In this paper we investigate modifications to widely used penalized log-likelihood approaches to enforce total positivity but not the Laplacian structure. An alternating direction method of multipliers (ADMM) algorithm is presented for constrained optimization under total positivity and lasso as well as adaptive lasso penalties. Numerical results based on synthetic data show that the proposed constrained adaptive lasso approach significantly outperforms existing Laplacian-based approaches, both statistical and smoothness-based non-statistical. 
  4. Multi-view data have been routinely collected in various fields of science and engineering. A general problem is to study the predictive association between multivariate responses and multi-view predictor sets, all of which can be of high dimensionality. It is likely that only a few views are relevant to prediction, and the predictors within each relevant view contribute to the prediction collectively rather than sparsely. We cast this new problem under the familiar multivariate regression framework and propose an integrative reduced-rank regression (iRRR), where each view has its own low-rank coefficient matrix. As such, latent features are extracted from each view in a supervised fashion. For model estimation, we develop a convex composite nuclear norm penalization approach, which admits an efficient algorithm via alternating direction method of multipliers. Extensions to non-Gaussian and incomplete data are discussed. Theoretically, we derive non-asymptotic oracle bounds of iRRR under a restricted eigenvalue condition. Our results recover oracle bounds of several special cases of iRRR including Lasso, group Lasso, and nuclear norm penalized regression. Therefore, iRRR seamlessly bridges group-sparse and low-rank methods and can achieve substantially faster convergence rate under realistic settings of multi-view learning. Simulation studies and an application in the Longitudinal Studies of Aging further showcase the efficacy of the proposed methods.