Title: Distributed-Memory Parallel JointNMF
Joint Nonnegative Matrix Factorization (JointNMF) is a hybrid method for mining information from datasets that contain both feature and connection information. We propose distributed-memory parallelizations of three algorithms for solving the JointNMF problem, based on Alternating Nonnegative Least Squares (ANLS), Projected Gradient Descent, and Projected Gauss-Newton. We extend well-known communication-avoiding algorithms from the single-processor-grid case to our coupled case on two processor grids. We demonstrate the scalability of the algorithms on up to 960 cores (40 nodes) with 60% parallel efficiency. The more sophisticated ANLS and Gauss-Newton variants outperform the first-order gradient descent method in reducing the objective on large-scale problems. We perform a topic modelling task on a large corpus of academic papers consisting of over 37 million paper abstracts and nearly a billion citation relationships, demonstrating the utility and scalability of the methods.
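As a concrete reference point, the following is a minimal single-node sketch of the JointNMF objective and one projected-gradient-descent update, assuming the common formulation min over W, H >= 0 of ||X - WH||_F^2 + alpha ||S - H^T H||_F^2 for a feature matrix X and a symmetric connection matrix S. The regularization weight alpha, step size, and toy dimensions are illustrative assumptions; the paper's contribution is distributing such updates across two processor grids, not this serial loop.

import numpy as np

def jointnmf_objective(X, S, W, H, alpha):
    """f(W, H) = ||X - W H||_F^2 + alpha * ||S - H^T H||_F^2 (assumed formulation)."""
    return (np.linalg.norm(X - W @ H, "fro") ** 2
            + alpha * np.linalg.norm(S - H.T @ H, "fro") ** 2)

def pgd_step(X, S, W, H, alpha, step):
    """One projected-gradient-descent update with projection onto the nonnegative orthant."""
    R = W @ H - X                                           # residual of the feature term
    grad_W = 2.0 * R @ H.T
    grad_H = 2.0 * W.T @ R - 4.0 * alpha * H @ (S - H.T @ H)
    W = np.maximum(W - step * grad_W, 0.0)
    H = np.maximum(H - step * grad_H, 0.0)
    return W, H

# toy usage: X is m x n features, S is an n x n symmetric connection matrix
rng = np.random.default_rng(0)
m, n, k = 100, 80, 10
X = rng.random((m, n))
S = 0.5 * (rng.random((n, n)) + rng.random((n, n)).T)
W, H = 0.3 * rng.random((m, k)), 0.3 * rng.random((k, n))
for _ in range(30):                                         # step size must be tuned in practice
    W, H = pgd_step(X, S, W, H, alpha=0.5, step=1e-3)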
Award ID(s):
2106920
PAR ID:
10425100
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the 37th International Conference on Supercomputing
Page Range / eLocation ID:
301 to 312
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper we propose a quasi-Newton algorithm for the celebrated nonnegative matrix factorization (NMF) problem. The proposed algorithm falls into the general framework of Gauss-Newton and Levenberg-Marquardt methods; however, those methods are not able to handle the nonnegativity constraints present in NMF. One of the key contributions of this paper is to apply the alternating direction method of multipliers (ADMM) to obtain the iterative update from this Gauss-Newton-like algorithm. Furthermore, we carefully study the structure of the Jacobian Gramian matrix given by the Gauss-Newton updates and design a way of exactly inverting the matrix with complexity $\mathcal{O}(mnk)$, a significant reduction compared to the naive implementation of complexity $\mathcal{O}((m+n)^3k^3)$. The resulting algorithm, which we call NLS-ADMM, enjoys the fast convergence rate of the quasi-Newton algorithmic framework while maintaining a low per-iteration complexity similar to that of alternating algorithms. Numerical experiments on synthetic data confirm the efficiency of the proposed algorithm.
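The idea of letting ADMM enforce the nonnegativity constraint inside a Newton-type step can be illustrated on the simplest constrained subproblem, a nonnegative least-squares solve. The sketch below is a generic ADMM with an x = z splitting, not the paper's NLS-ADMM; the penalty rho and iteration count are illustrative assumptions.

import numpy as np

def nnls_admm(A, b, rho=1.0, iters=200):
    """Nonnegative least squares  min_{x >= 0} ||Ax - b||^2  via ADMM with splitting x = z."""
    n = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b
    # factor the (regularized) normal matrix once; each x-update is then two triangular solves
    L = np.linalg.cholesky(AtA + rho * np.eye(n))
    x = z = u = np.zeros(n)
    for _ in range(iters):
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))   # x-update: unconstrained least squares
        z = np.maximum(x + u, 0.0)                          # z-update: projection onto x >= 0
        u = u + x - z                                       # dual (scaled multiplier) update
    return z

# toy usage
rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 20)), rng.standard_normal(50)
x = nnls_admm(A, b)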
  2. Full waveform inversion (FWI) and least-squares reverse time migration (LSRTM) are popular imaging techniques that can be solved as PDE-constrained optimization problems. Due to their large-scale nature, gradient- and Hessian-based optimization algorithms are preferred in practice to find the optimizer iteratively. However, a balance between the evaluation cost and the rate of convergence needs to be considered. We propose the use of Anderson acceleration (AA), a popular strategy to speed up the convergence of fixed-point iterations, to accelerate a gradient descent method. We show that AA can achieve fast convergence and provides results competitive with some quasi-Newton methods. Independent of the dimensionality of the unknown parameters, the computational cost of implementing the method can be reduced to an extremely low-dimensional least-squares problem, which makes AA an attractive method for seismic inversion.
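A generic windowed Anderson-acceleration routine makes the "extremely low-dimensional least-squares problem" concrete. The sketch below is not the authors' implementation; the window size, toy quadratic misfit, and step size are illustrative assumptions.

import numpy as np

def anderson_accelerate(g, x0, m=5, iters=50):
    """Anderson acceleration of the fixed-point iteration x <- g(x) with window size m."""
    x = x0
    G, F = [], []                       # histories of g(x_k) and residuals f_k = g(x_k) - x_k
    for _ in range(iters):
        gx = g(x)
        f = gx - x
        G.append(gx); F.append(f)
        if len(F) > m + 1:              # keep at most m differences
            G.pop(0); F.pop(0)
        if len(F) == 1:
            x = gx                      # plain fixed-point step until history exists
            continue
        # the small least-squares problem: its size is the window m, not the parameter dimension
        dF = np.column_stack([F[i + 1] - F[i] for i in range(len(F) - 1)])
        dG = np.column_stack([G[i + 1] - G[i] for i in range(len(G) - 1)])
        gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
        x = gx - dG @ gamma             # accelerated iterate
    return x

# toy usage: accelerate gradient descent on a quadratic misfit, g(x) = x - step * grad(x)
rng = np.random.default_rng(0)
A, b = rng.standard_normal((40, 15)), rng.standard_normal(40)
step = 1.0 / np.linalg.norm(A, 2) ** 2
g = lambda x: x - step * (A.T @ (A @ x - b))
x = anderson_accelerate(g, np.zeros(15))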
  3. Power system state estimation (PSSE) aims at finding the voltage magnitudes and angles at all generation and load buses, using meter readings and other available information. PSSE is often formulated as a nonconvex nonlinear least-squares (NLS) problem, which is traditionally solved by the Gauss-Newton method. However, Gauss-Newton iterations for minimizing nonconvex problems are sensitive to the initialization and can diverge. In this context, we advocate a deep neural network (DNN) based "trainable regularizer" to incorporate prior information for accurate and reliable state estimation. The resulting regularized NLS does not admit a neat closed-form solution. To handle this, a novel end-to-end DNN is constructed by unrolling a Gauss-Newton-type solver that alternates between the least-squares loss and the regularization term. Our DNN architecture can further offer a suite of advantages, e.g., accommodating network topology via a graph-neural-network-based prior. Numerical tests using real load data on the IEEE 118-bus benchmark system showcase the improved estimation performance of the proposed scheme compared with state-of-the-art alternatives. Interestingly, our results suggest that a simple feedforward-network-based prior implicitly exploits the topology information hidden in the data.
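One way to picture an unrolled architecture of this kind is a stack of regularized Gauss-Newton layers in which the prior is a small trainable network. The PyTorch sketch below assumes, purely for illustration, a linearized measurement model z ≈ Hx and a feedforward prior; it is not the paper's DNN, and the layer count, damping weight, and sizes are hypothetical.

import torch
import torch.nn as nn

class UnrolledGNLayer(nn.Module):
    """One unrolled, regularized Gauss-Newton-type step with a small learned prior (illustrative)."""

    def __init__(self, n_state, lam=0.1):
        super().__init__()
        self.lam = lam
        # hypothetical trainable regularizer: a small feedforward prior network
        self.prior = nn.Sequential(nn.Linear(n_state, 64), nn.ReLU(), nn.Linear(64, n_state))

    def forward(self, x, z, H):
        # assumed linearized measurement model z ~= H x (real PSSE measurements are nonlinear)
        r = z - x @ H.T                                   # measurement residual
        rhs = r @ H - self.lam * (x - self.prior(x))      # least-squares term plus learned prior term
        A = H.T @ H + self.lam * torch.eye(H.shape[1])    # damped Gauss-Newton normal matrix
        dx = torch.linalg.solve(A, rhs.T).T               # Gauss-Newton-type direction
        return x + dx

class UnrolledGNEstimator(nn.Module):
    """Stack a few layers so the whole solver can be trained end to end."""

    def __init__(self, n_state, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(UnrolledGNLayer(n_state) for _ in range(n_layers))

    def forward(self, x0, z, H):
        x = x0
        for layer in self.layers:
            x = layer(x, z, H)
        return x

# toy usage with illustrative sizes: 10 measurements, 6 state variables, batch of 4
H = torch.randn(10, 6)
z = torch.randn(4, 10)
x_hat = UnrolledGNEstimator(n_state=6)(torch.zeros(4, 6), z, H)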
  4. State-of-the-art seismic imaging techniques treat inversion tasks such as full-waveform inversion (FWI) and least-squares reverse time migration (LSRTM) as partial differential equation-constrained optimization problems. Due to the large-scale nature, gradient-based optimization algorithms are preferred in practice to update the model iteratively. Higher-order methods converge in fewer iterations but often require higher computational costs, more line-search steps, and bigger memory storage. A balance among these aspects has to be considered. We have conducted an evaluation using Anderson acceleration (AA), a popular strategy to speed up the convergence of fixed-point iterations, to accelerate the steepest-descent algorithm, which we innovatively treat as a fixed-point iteration. Independent of the unknown parameter dimensionality, the computational cost of implementing the method can be reduced to an extremely low-dimensional least-squares problem. The cost can be further reduced by a low-rank update. We determine the theoretical connections and the differences between AA and other well-known optimization methods such as L-BFGS and the restarted generalized minimal residual method and compare their computational cost and memory requirements. Numerical examples of FWI and LSRTM applied to the Marmousi benchmark demonstrate the acceleration effects of AA. Compared with the steepest-descent method, AA can achieve faster convergence and can provide competitive results with some quasi-Newton methods, making it an attractive optimization strategy for seismic inversion.
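To make the fixed-point view explicit (a generic restatement, not taken from the paper): steepest descent $x_{k+1} = x_k - \alpha \nabla f(x_k)$ is the fixed-point iteration $x_{k+1} = g(x_k)$ with $g(x) = x - \alpha \nabla f(x)$, whose fixed points satisfy $\nabla f(x^*) = 0$. Writing the residuals as $f_k = g(x_k) - x_k$, AA with window $m$ solves
$$\gamma^* = \arg\min_{\gamma \in \mathbb{R}^m} \Big\| f_k - \sum_{i=1}^{m} \gamma_i \, (f_{k-m+i} - f_{k-m+i-1}) \Big\|_2$$
and sets
$$x_{k+1} = g(x_k) - \sum_{i=1}^{m} \gamma_i^* \, \big(g(x_{k-m+i}) - g(x_{k-m+i-1})\big).$$
The extra cost per iteration is this $m$-dimensional least-squares solve plus storage of roughly $2m$ history vectors, independent of the number of model parameters, which is the "extremely low-dimensional least-squares problem" the abstract refers to.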