skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Adaptive Reduced Rank Regression.
We study the low rank regression problem $$\my = M\mx + \epsilon$$, where $$\mx$$ and $$\my$$ are d1 and d2 dimensional vectors respectively. We consider the extreme high-dimensional setting where the number of observations n is less than d1+d2. Existing algorithms are designed for settings where n is typically as large as $$\Rank(M)(d_1+d_2)$$. This work provides an efficient algorithm which only involves two SVD, and establishes statistical guarantees on its performance. The algorithm decouples the problem by first estimating the precision matrix of the features, and then solving the matrix denoising problem. To complement the upper bound, we introduce new techniques for establishing lower bounds on the performance of any algorithm for this problem. Our preliminary experiments confirm that our algorithm often out-performs existing baselines, and is always at least competitive.  more » « less
Award ID(s):
1835821 1755769
PAR ID:
10212766
Author(s) / Creator(s):
Date Published:
Journal Name:
Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We study the low rank regression problem y = Mx + ε, where x and y are d1 and d2 dimensional vectors respectively. We consider the extreme high-dimensional setting where the number of observations n is less than d1 + d2. Existing algorithms are designed for settings where n is typically as large as rank(M)(d1+d2). This work provides an efficient algorithm which only involves two SVD, and establishes statistical guarantees on its performance. The algorithm decouples the problem by first estimating the precision matrix of the features, and then solving the matrix de-noising problem. To complement the upper bound, we introduce new techniques for establishing lower bounds on the performance of any algorithm for this problem. Our preliminary experiments confirm that our algorithm often out-performs existing baseline, and is always at least competitive. 
    more » « less
  2. In this paper, we consider the Collaborative Ranking (CR) problem for recommendation systems. Given a set of pairwise preferences between items for each user, collaborative ranking can be used to rank un-rated items for each user, and this ranking can be naturally used for recommendation. It is observed that collaborative ranking algorithms usually achieve better performance since they directly minimize the ranking loss; however, they are rarely used in practice due to the poor scalability. All the existing CR algorithms have time complexity at least O(|Ω|r) per iteration, where r is the target rank and |Ω| is number of pairs which grows quadratically with number of ratings per user. For example, the Netflix data contains totally 20 billion rating pairs, and at this scale all the current algorithms have to work with significant subsampling, resulting in poor prediction on testing data. In this paper, we propose a new collaborative ranking algorithm called Primal-CR that reduces the time complexity toO(|Ω|+d1d2r), where d1 is number of users and d2 is the averaged number of items rated by a user. Note that d1, d2 is strictly smaller and open much smaller than |Ω|. Furthermore, by exploiting the fact that most data is in the form of numerical ratings instead of pairwise comparisons, we propose Primal-CR++ with O(d1d2(r + log d2)) time complexity. Both algorithms have better theoretical time complexity than existing approaches and also outperform existing approaches in terms of NDCG and pairwise error on real data sets. To the best of our knowledge, this is the first collaborative ranking algorithm capable of working on the full Netflix dataset using all the 20 billion rating pairs, and this leads to a model with much better recommendation compared with previous models trained on subsamples. Finally, compared with classical matrix factorization algorithm which also requires O(d1 d2r) time, our algorithm has almost the same efficiency while making much better recommendations since we consider the ranking loss. 
    more » « less
  3. In the static retrieval problem, a data structure must answer retrieval queries mapping a set of n keys in a universe [U] to v-bit values. Information-theoretically, retrieval data structures can use as little as nv bits of space. For small value sizes v, it is possible to achieve O(1) query time while using space nv+o(n) bits-whether or not such a result is possible for larger values of v (e.g., v=Θ(logn)) has remained open.In this paper, we obtain a tight lower bound (as well as matching upper bounds) for the static retrieval problem. In the case where values are large, we show that there is actually a significant tension between time and space. It is not possible, for example, to get O(1) query time using nv+o(n) bits of space, when v=Θ(logn) (and assuming the word RAM model with O(logn)-bit words)At first glance, our lower bound would seem to render retrieval unusable in many settings that aim to achieve very low redundancy. However, our second result offers a way around this: We show that, whenever a retrieval data structure D1 is stored along with another data structure D2 (whose size is similar to or larger than the size of D1), it is possible to implement the combined data structure D1∪D2 so that queries to D1 take O(1) time, operations on D2 take the same asymptotic time as if D2 were stored on its own, and the total space is nv+Space(D2)+n0.67 bits. 
    more » « less
  4. While matrix-covariate regression models have been studied in many existing works, classical statistical and computational methods for the analysis of the regression coefficient estimation are highly affected by high dimensional matrix-valued covariates. To address these issues, this paper proposes a framework of matrix-covariate regression models based on a low-rank constraint and an additional regularization term for structured signals, with considerations of models of both continuous and binary responses. We propose an efficient Riemannian-steepest-descent algorithm for regression coefficient estimation. We prove that the consistency of the proposed estimator is in the order of O(sqrt{r(q+m)+p}/sqrt{n}), where r is the rank, p x m is the dimension of the coefficient matrix and p is the dimension of the coefficient vector. When the rank r is small, this rate improves over O(sqrt{qm+p}/sqrt{n}), the consistency of the existing work (Li et al. in Electron J Stat 15:1909-1950, 2021) that does not apply a rank constraint. In addition, we prove that all accumulation points of the iterates have similar estimation errors asymptotically and substantially attaining the minimax rate. We validate the proposed method through a simulated dataset on two-dimensional shape images and two real datasets of brain signals and microscopic leucorrhea images. 
    more » « less
  5. null (Ed.)
    In the low-rank matrix completion (LRMC) problem, the low-rank assumption means that the columns (or rows) of the matrix to be completed are points on a low-dimensional linear algebraic variety. This paper extends this thinking to cases where the columns are points on a low-dimensional nonlinear algebraic variety, a problem we call Low Algebraic Dimension Matrix Completion (LADMC). Matrices whose columns belong to a union of subspaces are an important special case. We propose a LADMC algorithm that leverages existing LRMC methods on a tensorized representation of the data. For example, a second-order tensorized representation is formed by taking the Kronecker product of each column with itself, and we consider higher order tensorizations as well. This approach will succeed in many cases where traditional LRMC is guaranteed to fail because the data are low-rank in the tensorized representation but not in the original representation. We also provide a formal mathematical justi cation for the success of our method. In particular, we give bounds of the rank of these data in the tensorized representation, and we prove sampling requirements to guarantee uniqueness of the solution. We also provide experimental results showing that the new approach outperforms existing state-of-the-art methods for matrix completion under a union of subspaces model. 
    more » « less