This paper considers the minimization of a general objective function f (X) over the set of nonsquare n × m matrices where the optimal solution X* is lowrank. To reduce the computational burden, we factorize the variable X into a product of two smaller matrices and optimize over these two matrices instead of X. We analyze the global geometry for a general and yet wellconditioned objective function f (X) whose restricted strong convexity and restricted strong smoothness constants are comparable. In particular, we show that the reformulated objective function has no spurious local minima and obeys the strict saddle property. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) can provably solve the factored problem with global convergence.
more »
« less
The nonconvex geometry of lowrank matrix optimizations with general objective functions
This work considers the minimization of a general convex function f (X) over the cone of positive semidefinite matrices whose optimal solution X* is of lowrank. Standard firstorder convex solvers require performing an eigenvalue decomposition in each iteration, severely limiting their scalability. A natural nonconvex reformulation of the problem factors the variable X into the product of a rectangular matrix with fewer columns and its transpose. For a special class of matrix sensing and completion problems with quadratic objective functions, local search algorithms applied to the factored problem have been shown to be much more efficient and, in spite of being nonconvex, to converge to the global optimum. The purpose of this work is to extend this line of study to general convex objective functions f (X) and investigate the geometry of the resulting factored formulations. Specifically, we prove that when f (X) satisfies the restricted wellconditioned assumption, each critical point of the factored problem either corresponds to the optimal solution X* or a strict saddle where the Hessian matrix has a strictly negative eigenvalue. Such a geometric structure of the factored formulation ensures that many local search algorithms can converge to the global optimum with random initializations.
more »
« less
 Award ID(s):
 1464205
 NSFPAR ID:
 10099581
 Date Published:
 Journal Name:
 The nonconvex geometry of lowrank matrix optimizations with general objective functions
 Page Range / eLocation ID:
 1235 to 1239
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


Newly, there has been significant research interest in the exact solution of the AC optimal power flow (ACOPF) problem. A semideflnite relaxation solves many OPF problems globally. However, the real problem exists in which the semidefinite relaxation fails to yield the global solution. The appropriation of relaxation for ACOPF depends on the success or unfulflllment of the SDP relaxation. This paper demonstrates a quadratic ACOPF problem with a single negative eigenvalue in objective function subject to linear and conic constraints. The proposed solution method for ACOPF model covers the classical AC economic dispatch problem that is known to be NPhard. In this paper, by combining successive linear conic optimization (SLCO), convex relaxation and line search technique, we present a global algorithm for ACOPF which can locate a globally optimal solution to the underlying ACOPF within given tolerance of global optimum solution via solving linear conic optimization problems. The proposed algorithm is examined on modified IEEE 6bus test system. The promising numerical results are described.more » « less

Chiappa, Silvia ; Calandra, Roberto (Ed.)The article considers smooth optimization of functions on Lie groups. By generalizing NAG variational principle in vector space (Wibisono et al., 2016) to general Lie groups, continuous LieNAG dynamics which are guaranteed to converge to local optimum are obtained. They correspond to momentum versions of gradient flow on Lie groups. A particular case of SO(𝑛) is then studied in details, with objective functions corresponding to leading Generalized EigenValue problems: the LieNAG dynamics are first made explicit in coordinates, and then discretized in structure preserving fashions, resulting in optimization algorithms with faithful energy behavior (due to conformal symplecticity) and exactly remaining on the Lie group. Stochastic gradient versions are also investigated. Numerical experiments on both synthetic data and practical problem (LDA for MNIST) demonstrate the effectiveness of the proposed methods as optimization algorithms (\emph{not} as a classification method).more » « less

Lowrank matrix recovery is a fundamental problem in machine learning with numerous applications. In practice, the problem can be solved by convex optimization namely nuclear norm minimization, or by nonconvex optimization as it is wellknown that for lowrank matrix problems like matrix sensing and matrix completion, all local optima of the natural nonconvex objectives are also globally optimal under certain ideal assumptions. In this paper, we study new approaches for matrix sensing in a semirandom model where an adversary can add any number of arbitrary sensing matrices. More precisely, the problem is to recover a lowrank matrix $X^\star$ from linear measurements $b_i = \langle A_i, X^\star \rangle$, where an unknown subset of the sensing matrices satisfies the Restricted Isometry Property (RIP) and the rest of the $A_i$'s are chosen adversarially. It is known that in the semirandom model, existing nonconvex objectives can have bad local optima. To fix this, we present a descentstyle algorithm that provably recovers the groundtruth matrix $X^\star$. For the closelyrelated problem of semirandom matrix completion, prior work [CG18] showed that all bad local optima can be eliminated by reweighting the input data. However, the analogous approach for matrix sensing requires reweighting a set of matrices to satisfy RIP, which is a condition that is NPhard to check. Instead, we build on the framework proposed in [KLL$^+$23] for semirandom sparse linear regression, where the algorithm in each iteration reweights the input based on the current solution, and then takes a weighted gradient step that is guaranteed to work well locally. Our analysis crucially exploits the connection between sparsity in vector problems and lowrankness in matrix problems, which may have other applications in obtaining robust algorithms for sparse and lowrank problems.more » « less

Tucker decomposition is a popular technique for many data analysis and machine learning applications. Finding a Tucker decomposition is a nonconvex optimization problem. As the scale of the problems increases, local search algorithms such as stochastic gradient descent have become popular in practice. In this paper, we characterize the optimization landscape of the Tucker decomposition problem. In particular, we show that if the tensor has an exact Tucker decomposition, for a standard nonconvex objective of Tucker decomposition, all local minima are also globally optimal. We also give a local search algorithm that can nd an approximate local (and global) optimal solution in polynomial time.more » « less