Title: A Primal-Dual Analysis of Global Optimality in Nonconvex Low-Rank Matrix Recovery
We propose a primal-dual based framework for analyzing the global optimality of nonconvex low-rank matrix recovery. Our analysis is based on restricted strong convexity and smoothness conditions, which can be verified for a broad family of loss functions. In addition, our analytic framework can directly handle the widely used incoherence constraints through the lens of duality. We illustrate the applicability of the proposed framework to matrix completion and one-bit matrix completion, and prove that neither problem has spurious local minima. Our results not only improve the sample complexity required for characterizing the global optimality of matrix completion, but also resolve an open problem in Ge et al. (2017) regarding one-bit matrix completion. Numerical experiments show that the primal-dual based algorithm successfully recovers the global optimum for various low-rank problems.
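The abstract does not reproduce the paper's primal-dual algorithm, but the primal problem it analyzes is the standard factorized nonconvex formulation of matrix completion. The following is a minimal, illustrative sketch of that objective under assumptions not stated in the abstract (a symmetric PSD rank-r ground truth, uniform sampling, plain gradient descent); it is not the paper's method.

```python
import numpy as np

# Illustrative sketch (not the paper's primal-dual method): nonconvex
# matrix completion via the factorized objective
#   min_U  f(U) = 0.5 * || P_Omega(U U^T - M) ||_F^2,
# assuming a symmetric PSD rank-r ground truth M and uniform sampling.

rng = np.random.default_rng(0)
n, r = 50, 2
U_star = rng.standard_normal((n, r))
M = U_star @ U_star.T
mask = rng.random((n, n)) < 0.5
Omega = mask | mask.T                      # symmetric observation mask

U = 0.1 * rng.standard_normal((n, r))      # small random initialization
step = 0.5 / np.linalg.norm(M, 2)
for _ in range(2000):
    R = Omega * (U @ U.T - M)              # residual on observed entries
    U -= step * (R + R.T) @ U              # grad f(U) = (R + R^T) U
print("relative error:", np.linalg.norm(U @ U.T - M) / np.linalg.norm(M))
```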
Award ID(s):
1652539 1618948
PAR ID:
10063547
Author(s) / Creator(s):
Date Published:
Journal Name:
International Conference on Machine Learning
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Aardal, Karen; Sanità, Laura (Ed.)
    This paper considers the interplay between semidefinite programming, matrix rank, and graph coloring. Karger, Motwani, and Sudan [10] give a vector program for which a coloring of the graph can be encoded as a semidefinite matrix of low rank. By complementary slackness conditions of semidefinite programming, if an optimal dual solution has sufficiently high rank, any optimal primal solution must have low rank. We attempt to characterize graphs for which we can show that the corresponding dual optimal solution must have sufficiently high rank. In the case of the original Karger, Motwani, and Sudan vector program, we show that any graph which is a k-tree has sufficiently high dual rank, and we can extract the coloring from the corresponding low-rank primal solution. We can also show that if the graph is not uniquely colorable, then no sufficiently high rank dual optimal solution can exist. This allows us to completely characterize the planar graphs for which dual optimal solutions have sufficiently high dual rank, since it is known that the uniquely colorable planar graphs are precisely the planar 3-trees. We then modify the semidefinite program to have an objective function with costs, and explore when we can create a cost function whose optimal dual solution has sufficiently high rank. We show that it is always possible to construct such a cost function given the graph coloring. The construction of the cost function gives rise to a heuristic for graph coloring, which works well in the case of planar graphs: we enumerated all maximal planar graphs with a K4 of up to 14 vertices, and the heuristic successfully colored 99.75% of them. Our research was motivated by the Colin de Verdière graph invariant [5] (and a corresponding conjecture of Colin de Verdière), in which matrices that share some properties with the dual feasible matrices must have high rank for graphs of a certain type; for instance, for planar graphs the implied rank would yield the 4-colorability of the primal solution. We explore the connection between the conjecture and the rank of the dual solutions.
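As a concrete illustration of the low-rank encoding mentioned above, the following sketch (a hypothetical example, not taken from the paper) builds the Karger-Motwani-Sudan feasible point for a proper 3-coloring: each color maps to one of three planar unit vectors at 120 degrees, so every edge constraint <v_i, v_j> = -1/2 holds and the Gram matrix has rank k - 1 = 2.

```python
import numpy as np

# Hypothetical example (not from the paper): a proper 3-coloring of the
# 5-cycle encoded as a rank-2 feasible point of the Karger-Motwani-Sudan
# vector program.  Colors map to unit vectors at 120 degrees, so every
# edge satisfies <v_i, v_j> = -1/2.

angles = 2 * np.pi * np.arange(3) / 3
color_vecs = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # 3 unit vectors

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]   # the 5-cycle C_5
coloring = np.array([0, 1, 0, 1, 2])               # a proper 3-coloring
V = color_vecs[coloring]                           # one unit vector per vertex
X = V @ V.T                                        # Gram (primal SDP) matrix

assert all(np.isclose(X[i, j], -0.5) for i, j in edges)  # edge constraints tight
print("rank of the primal solution:", np.linalg.matrix_rank(X))  # k - 1 = 2
```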
  2. This paper considers the interplay between semidefinite programming, matrix rank, and graph coloring. Karger, Motwani, and Sudan give a vector program in which a coloring of a graph can be encoded as a semidefinite matrix of low rank. By complementary slackness conditions of semidefinite programming, if an optimal dual solution has high rank, any optimal primal solution must have low rank. We attempt to characterize graphs for which we can show that the corresponding dual optimal solution must have rank high enough that the primal solution encodes a coloring. In the case of the original Karger, Motwani, and Sudan vector program, we show that any graph which is a $$k$$-tree has sufficiently high dual rank, and we can extract the coloring from the corresponding low-rank primal solution. We can also show that if a graph is not uniquely colorable, then no sufficiently high rank dual optimal solution can exist. This allows us to completely characterize the planar graphs for which dual optimal solutions have sufficiently high dual rank, since it is known that the uniquely colorable planar graphs are precisely the planar 3-trees. We then modify the semidefinite program to have an objective function with costs, and explore when we can create an objective function such that the optimal dual solution has sufficiently high rank. We show that it is always possible to construct such an objective function given the graph coloring. The construction of the objective function gives rise to heuristics for 4-coloring planar graphs. We enumerated all maximal planar graphs with an induced $$K_4$$ of up to 14 vertices; the heuristics successfully found a 4-coloring for 99.75% of them. Our research was motivated by trying to use semidefinite programming to prove the four-color theorem, which states that every planar graph can be colored with four colors. There is an intriguing connection of the Karger-Motwani-Sudan semidefinite program with the Colin de Verdière graph invariant (and a corresponding conjecture of Colin de Verdière), in which matrices that have some similarities to the dual feasible matrices of the semidefinite program must have high rank in the case that graphs are of a certain type; for instance, planar graphs have rank that would imply that the primal solution of the semidefinite program encodes a 4-coloring.
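The rank argument via complementary slackness can be checked numerically. In the sketch below (an illustration, not the paper's construction), PSD matrices X and Z with XZ = 0 have orthogonal column spaces, so rank(X) + rank(Z) <= n; a dual optimum of high rank therefore forces any primal optimum to have low rank.

```python
import numpy as np

# Illustration (not the paper's construction) of the complementary
# slackness rank bound: PSD matrices X, Z with X Z = 0 have orthogonal
# column spaces, hence rank(X) + rank(Z) <= n.  A rank-(n-2) dual optimum
# would therefore force every primal optimum to have rank at most 2.

rng = np.random.default_rng(1)
n = 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthonormal basis
X = Q[:, :2] @ Q[:, :2].T                         # rank-2 PSD "primal"
Z = Q[:, 2:] @ Q[:, 2:].T                         # rank-4 PSD "dual"

print("||X Z||_F =", np.linalg.norm(X @ Z))       # ~0: complementary slackness
print(np.linalg.matrix_rank(X) + np.linalg.matrix_rank(Z), "<=", n)
```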
  3. Low-rank matrix recovery is a fundamental problem in machine learning with numerous applications. In practice, the problem can be solved by convex optimization, namely nuclear norm minimization, or by non-convex optimization, since it is well known that for low-rank matrix problems like matrix sensing and matrix completion, all local optima of the natural non-convex objectives are also globally optimal under certain ideal assumptions. In this paper, we study new approaches for matrix sensing in a semi-random model where an adversary can add any number of arbitrary sensing matrices. More precisely, the problem is to recover a low-rank matrix $$X^\star$$ from linear measurements $$b_i = \langle A_i, X^\star \rangle$$, where an unknown subset of the sensing matrices satisfies the Restricted Isometry Property (RIP) and the rest of the $$A_i$$'s are chosen adversarially. It is known that in the semi-random model, existing non-convex objectives can have bad local optima. To fix this, we present a descent-style algorithm that provably recovers the ground-truth matrix $$X^\star$$. For the closely related problem of semi-random matrix completion, prior work [CG18] showed that all bad local optima can be eliminated by reweighting the input data. However, the analogous approach for matrix sensing requires reweighting a set of matrices to satisfy RIP, which is a condition that is NP-hard to check. Instead, we build on the framework proposed in [KLL$^+$23] for semi-random sparse linear regression, where the algorithm in each iteration reweights the input based on the current solution, and then takes a weighted gradient step that is guaranteed to work well locally. Our analysis crucially exploits the connection between sparsity in vector problems and low-rankness in matrix problems, which may have other applications in obtaining robust algorithms for sparse and low-rank problems.
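For orientation, here is a minimal sketch of the matrix sensing model in the ideal, fully-RIP regime (Gaussian sensing matrices, no adversarial measurements), solved by plain factored gradient descent. The paper's iterative reweighting for the semi-random model is more involved and is not reproduced here.

```python
import numpy as np

# Minimal sketch of matrix sensing in the ideal fully-RIP regime
# (Gaussian A_i, no adversarial measurements), not the paper's semi-random
# reweighting algorithm: recover X* from b_i = <A_i, X*> by gradient
# descent on  min_U  0.5 * sum_i (<A_i, U U^T> - b_i)^2.

rng = np.random.default_rng(0)
n, r, m = 20, 2, 600
U_star = rng.standard_normal((n, r))
X_star = U_star @ U_star.T
A = rng.standard_normal((m, n, n)) / np.sqrt(m)       # normalized sensing matrices
b = np.einsum('kij,ij->k', A, X_star)                 # linear measurements

U = 0.1 * rng.standard_normal((n, r))
step = 0.2 / np.linalg.norm(X_star, 2)
for _ in range(1500):
    res = np.einsum('kij,ij->k', A, U @ U.T) - b      # residuals per measurement
    G = np.einsum('k,kij->ij', res, A)                # sum_i res_i * A_i
    U -= step * (G + G.T) @ U                         # gradient w.r.t. U
print("rel. error:", np.linalg.norm(U @ U.T - X_star) / np.linalg.norm(X_star))
```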
  4.
    One of the classical approaches for estimating the frequencies and damping factors in a spectrally sparse signal is the MUltiple SIgnal Classification (MUSIC) algorithm, which exploits the low-rank structure of an autocorrelation matrix. Low-rank matrices have also received considerable attention recently in the context of optimization algorithms with partial observations, and nuclear norm minimization (NNM) has been widely used as a popular heuristic of rank minimization for low-rank matrix recovery problems. On the other hand, it has been shown that NNM can be viewed as a special case of atomic norm minimization (ANM), which has achieved great success in solving line spectrum estimation problems. However, as far as we know, the general ANM (not NNM) considered in many existing works can only handle frequency estimation in undamped sinusoids. In this work, we aim to fill this gap and deal with damped spectrally sparse signal recovery problems. In particular, inspired by the dual analysis used in ANM, we offer a novel optimization-based perspective on the classical MUSIC algorithm and propose an algorithm for spectral estimation that involves searching for the peaks of the dual polynomial corresponding to a certain NNM problem, and we show that this algorithm is in fact equivalent to MUSIC itself. Building on this connection, we also extend the classical MUSIC algorithm to the missing data case. We provide exact recovery guarantees for our proposed algorithms and quantify how the sample complexity depends on the true spectral parameters. In particular, we provide a parameter-specific recovery bound for low-rank matrix recovery of jointly sparse signals rather than relying on certain incoherence properties as in the existing literature. Simulation results also indicate that the proposed algorithms significantly outperform some relevant existing methods (e.g., ANM) in frequency estimation of damped exponentials.
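The classical MUSIC step that the paper reinterprets can be sketched in a few lines: form a sample autocorrelation matrix, split off the noise subspace, and locate the peaks of the pseudospectrum. The sketch below handles only undamped sinusoids, with an assumed window length and frequency grid; the paper's dual-polynomial view and damped/missing-data extensions are not reproduced.

```python
import numpy as np

# Sketch of the classical MUSIC step the abstract builds on (undamped
# sinusoids only; the window length m and the frequency grid are
# assumptions made for illustration).

rng = np.random.default_rng(0)
N, true_f = 64, np.array([0.12, 0.31])
t = np.arange(N)
x = sum(np.exp(2j * np.pi * f * t) for f in true_f)
x = x + 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

m = 20
H = np.array([x[i:i + m] for i in range(N - m + 1)])  # sliding windows
R = H.conj().T @ H / (N - m + 1)                      # sample autocorrelation
_, V = np.linalg.eigh(R)
E_n = V[:, :-2]                                       # noise subspace (2 sinusoids)

grid = np.linspace(0.0, 0.5, 2000)
a = np.exp(2j * np.pi * np.outer(np.arange(m), grid)) # steering vectors a(f)
pseudo = 1.0 / np.linalg.norm(E_n.conj().T @ a, axis=0) ** 2
local_max = np.where((pseudo[1:-1] > pseudo[:-2]) & (pseudo[1:-1] > pseudo[2:]))[0] + 1
top2 = local_max[np.argsort(pseudo[local_max])[-2:]]
print("estimated frequencies:", np.sort(grid[top2])) # ~ [0.12, 0.31]
```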
  5. Matrices are exceptionally useful in various fields of study as they provide a convenient framework to organize and manipulate data in a structured manner. However, modern matrices can involve billions of elements, making their storage and processing quite demanding in terms of computational resources and memory usage. Although prohibitively large, such matrices are often approximately low rank. We propose an algorithm that exploits this structure to obtain a low-rank decomposition of any matrix A as A ≈ LR, where L and R are the low-rank factors. The total number of elements in L and R can be significantly less than that in A. Furthermore, the entries of L and R are quantized to low-precision formats, compressing A by giving us a low-rank and low-precision factorization. Our algorithm first computes an approximate basis of the range space of A by randomly sketching its columns, followed by a quantization of the vectors constituting this basis. It then computes approximate projections of the columns of A onto this quantized basis. We derive upper bounds on the approximation error of our algorithm, and analyze the impact of target rank and quantization bit-budget. The tradeoff between compression ratio and approximation accuracy allows for flexibility in choosing these parameters based on specific application requirements. We empirically demonstrate the efficacy of our algorithm in image compression, nearest neighbor classification of image and text embeddings, and compressing the layers of LLaMA-7B. Our results illustrate that we can achieve compression ratios as aggressive as one bit per matrix coordinate, all while surpassing or maintaining the performance of traditional compression techniques.
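A minimal sketch of the described pipeline follows: randomized range finding, quantization of the basis factor L, and quantized projections R. The uniform scalar quantizer here is a stand-in assumption; the paper's quantization scheme and error bounds are not reproduced.

```python
import numpy as np

def quantize(M, bits):
    # Uniform scalar quantizer over M's range -- a stand-in assumption for
    # the low-precision formats in the abstract; the paper's scheme may differ.
    lo, hi = M.min(), M.max()
    levels = 2 ** bits - 1
    return lo + np.round((M - lo) / (hi - lo) * levels) * (hi - lo) / levels

def lowrank_lowprec(A, rank, bits, rng):
    # The pipeline described above: (1) randomized range finder,
    # (2) quantize the basis factor L, (3) project A's columns onto it.
    Y = A @ rng.standard_normal((A.shape[1], rank))    # sketch the range of A
    Q, _ = np.linalg.qr(Y)                             # orthonormal basis
    L = quantize(Q, bits)                              # low-precision factor L
    R = quantize(np.linalg.lstsq(L, A, rcond=None)[0], bits)  # projections
    return L, R

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 40)) @ rng.standard_normal((40, 300))  # ~rank 40
L, R = lowrank_lowprec(A, rank=40, bits=8, rng=rng)
print("relative error:", np.linalg.norm(A - L @ R) / np.linalg.norm(A))
```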