skip to main content

Title: The nonconvex geometry of low-rank matrix optimizations with general objective functions
This work considers the minimization of a general convex function f (X) over the cone of positive semi-definite matrices whose optimal solution X* is of low-rank. Standard first-order convex solvers require performing an eigenvalue decomposition in each iteration, severely limiting their scalability. A natural nonconvex reformulation of the problem factors the variable X into the product of a rectangular matrix with fewer columns and its transpose. For a special class of matrix sensing and completion problems with quadratic objective functions, local search algorithms applied to the factored problem have been shown to be much more efficient and, in spite of being nonconvex, to converge to the global optimum. The purpose of this work is to extend this line of study to general convex objective functions f (X) and investigate the geometry of the resulting factored formulations. Specifically, we prove that when f (X) satisfies the restricted well-conditioned assumption, each critical point of the factored problem either corresponds to the optimal solution X* or a strict saddle where the Hessian matrix has a strictly negative eigenvalue. Such a geometric structure of the factored formulation ensures that many local search algorithms can converge to the global optimum with random initializations.  more » « less
Award ID(s):
Author(s) / Creator(s):
Date Published:
Journal Name:
The nonconvex geometry of low-rank matrix optimizations with general objective functions
Page Range / eLocation ID:
1235 to 1239
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper considers the minimization of a general objective function f (X) over the set of non-square n × m matrices where the optimal solution X* is low-rank. To reduce the computational burden, we factorize the variable X into a product of two smaller matrices and optimize over these two matrices instead of X. We analyze the global geometry for a general and yet well-conditioned objective function f (X) whose restricted strong convexity and restricted strong smoothness constants are comparable. In particular, we show that the reformulated objective function has no spurious local minima and obeys the strict saddle property. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) can provably solve the factored problem with global convergence. 
    more » « less
  2. Newly, there has been significant research interest in the exact solution of the AC optimal power flow (AC-OPF) problem. A semideflnite relaxation solves many OPF problems globally. However, the real problem exists in which the semidefinite relaxation fails to yield the global solution. The appropriation of relaxation for AC-OPF depends on the success or unfulflllment of the SDP relaxation. This paper demonstrates a quadratic AC-OPF problem with a single negative eigenvalue in objective function subject to linear and conic constraints. The proposed solution method for AC-OPF model covers the classical AC economic dispatch problem that is known to be NP-hard. In this paper, by combining successive linear conic optimization (SLCO), convex relaxation and line search technique, we present a global algorithm for AC-OPF which can locate a globally optimal solution to the underlying AC-OPF within given tolerance of global optimum solution via solving linear conic optimization problems. The proposed algorithm is examined on modified IEEE 6-bus test system. The promising numerical results are described. 
    more » « less
  3. Chiappa, Silvia ; Calandra, Roberto (Ed.)
    The article considers smooth optimization of functions on Lie groups. By generalizing NAG variational principle in vector space (Wibisono et al., 2016) to general Lie groups, continuous Lie-NAG dynamics which are guaranteed to converge to local optimum are obtained. They correspond to momentum versions of gradient flow on Lie groups. A particular case of SO(𝑛) is then studied in details, with objective functions corresponding to leading Generalized EigenValue problems: the Lie-NAG dynamics are first made explicit in coordinates, and then discretized in structure preserving fashions, resulting in optimization algorithms with faithful energy behavior (due to conformal symplecticity) and exactly remaining on the Lie group. Stochastic gradient versions are also investigated. Numerical experiments on both synthetic data and practical problem (LDA for MNIST) demonstrate the effectiveness of the proposed methods as optimization algorithms (\emph{not} as a classification method). 
    more » « less
  4. Low-rank matrix recovery is a fundamental problem in machine learning with numerous applications. In practice, the problem can be solved by convex optimization namely nuclear norm minimization, or by non-convex optimization as it is well-known that for low-rank matrix problems like matrix sensing and matrix completion, all local optima of the natural non-convex objectives are also globally optimal under certain ideal assumptions. In this paper, we study new approaches for matrix sensing in a semi-random model where an adversary can add any number of arbitrary sensing matrices. More precisely, the problem is to recover a low-rank matrix $X^\star$ from linear measurements $b_i = \langle A_i, X^\star \rangle$, where an unknown subset of the sensing matrices satisfies the Restricted Isometry Property (RIP) and the rest of the $A_i$'s are chosen adversarially. It is known that in the semi-random model, existing non-convex objectives can have bad local optima. To fix this, we present a descent-style algorithm that provably recovers the ground-truth matrix $X^\star$. For the closely-related problem of semi-random matrix completion, prior work [CG18] showed that all bad local optima can be eliminated by reweighting the input data. However, the analogous approach for matrix sensing requires reweighting a set of matrices to satisfy RIP, which is a condition that is NP-hard to check. Instead, we build on the framework proposed in [KLL$^+$23] for semi-random sparse linear regression, where the algorithm in each iteration reweights the input based on the current solution, and then takes a weighted gradient step that is guaranteed to work well locally. Our analysis crucially exploits the connection between sparsity in vector problems and low-rankness in matrix problems, which may have other applications in obtaining robust algorithms for sparse and low-rank problems. 
    more » « less
  5. Tucker decomposition is a popular technique for many data analysis and machine learning applications. Finding a Tucker decomposition is a nonconvex optimization problem. As the scale of the problems increases, local search algorithms such as stochastic gradient descent have become popular in practice. In this paper, we characterize the optimization landscape of the Tucker decomposition problem. In particular, we show that if the tensor has an exact Tucker decomposition, for a standard nonconvex objective of Tucker decomposition, all local minima are also globally optimal. We also give a local search algorithm that can nd an approximate local (and global) optimal solution in polynomial time. 
    more » « less