


Title: Distributed-Memory Parallel JointNMF
Abstract: Joint Nonnegative Matrix Factorization (JointNMF) is a hybrid method for mining information from datasets that contain both feature and connection information. We propose distributed-memory parallelizations of three algorithms for solving the JointNMF problem, based on Alternating Nonnegative Least Squares, Projected Gradient Descent, and Projected Gauss-Newton. We extend well-known communication-avoiding algorithms from the single-processor-grid case to our coupled case on two processor grids. We demonstrate the scalability of the algorithms on up to 960 cores (40 nodes) with 60% parallel efficiency. The more sophisticated Alternating Nonnegative Least Squares (ANLS) and Gauss-Newton variants outperform the first-order gradient descent method in reducing the objective on large-scale problems. We perform a topic modeling task on a large corpus of academic papers consisting of over 37 million paper abstracts and nearly a billion citation relationships, demonstrating the utility and scalability of the methods.
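
As a rough illustration of the coupled problem, the sketch below assumes the common JointNMF formulation $\min_{W,H \ge 0} \|X - WH\|_F^2 + \alpha \|S - H^T H\|_F^2$, where $X$ holds the feature information and $S$ the (symmetric) connection information, and applies a plain single-node projected gradient descent update with a fixed step size. The exact formulation, the step size, and the function name are assumptions for illustration only; none of the paper's distributed two-processor-grid, communication-avoiding machinery is shown.

    import numpy as np

    def jointnmf_pgd(X, S, k, alpha=1.0, step=1e-3, iters=200, seed=0):
        """Naive projected gradient descent for the (assumed) JointNMF objective
        ||X - W H||_F^2 + alpha * ||S - H^T H||_F^2 with W, H >= 0."""
        rng = np.random.default_rng(seed)
        m, n = X.shape
        W = rng.random((m, k))
        H = rng.random((k, n))
        for _ in range(iters):
            grad_W = 2.0 * (W @ H - X) @ H.T
            grad_H = 2.0 * W.T @ (W @ H - X) + 4.0 * alpha * H @ (H.T @ H - S)
            W = np.maximum(W - step * grad_W, 0.0)   # project onto the nonnegative orthant
            H = np.maximum(H - step * grad_H, 0.0)
        obj = np.linalg.norm(X - W @ H) ** 2 + alpha * np.linalg.norm(S - H.T @ H) ** 2
        return W, H, obj

The ANLS variant described in the abstract would instead solve nonnegative least-squares subproblems for W and H in alternation, and the Projected Gauss-Newton variant would use curvature information; per the abstract, both reduce the objective faster than this first-order update on large-scale problems.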
Award ID(s):
2106920
NSF-PAR ID:
10425100
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the 37th International Conference on Supercomputing
Page Range / eLocation ID:
301 to 312
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper we propose a quasi-Newton algorithm for the celebrated nonnegative matrix factorization (NMF) problem. The proposed algorithm falls into the general framework of Gauss-Newton and Levenberg-Marquardt methods; however, these methods cannot directly handle the nonnegativity constraints present in NMF. One of the key contributions of this paper is to apply the alternating direction method of multipliers (ADMM) to obtain the iterative update from this Gauss-Newton-like algorithm. Furthermore, we carefully study the structure of the Jacobian Gramian matrix given by the Gauss-Newton updates and design a way of exactly inverting the matrix with complexity $\mathcal{O}(mnk)$, a significant reduction compared to the naive implementation of complexity $\mathcal{O}((m+n)^3 k^3)$. The resulting algorithm, which we call NLS-ADMM, enjoys the fast convergence rate brought by the quasi-Newton algorithmic framework while maintaining low per-iteration complexity similar to that of alternating algorithms. Numerical experiments on synthetic data confirm the efficiency of our proposed algorithm.
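     Since the entry above emphasizes that plain Gauss-Newton/Levenberg-Marquardt updates cannot enforce nonnegativity and that ADMM supplies the constrained step, here is a minimal, generic sketch of ADMM applied to a single nonnegative least-squares subproblem $\min_{H \ge 0} \tfrac{1}{2}\|X - WH\|_F^2$ via the splitting H = Z with Z >= 0. It is not the paper's NLS-ADMM algorithm (in particular, it does not use the structured $\mathcal{O}(mnk)$ inverse of the Jacobian Gramian); the penalty parameter rho and the function name are illustrative assumptions.

        import numpy as np

        def nnls_admm(W, X, rho=1.0, iters=100):
            """ADMM for min_{H >= 0} 0.5 * ||X - W H||_F^2, using the split H = Z, Z >= 0."""
            k = W.shape[1]
            n = X.shape[1]
            Z = np.zeros((k, n))
            U = np.zeros((k, n))                 # scaled dual variable
            G = W.T @ W + rho * np.eye(k)        # small k x k system matrix
            WtX = W.T @ X
            for _ in range(iters):
                H = np.linalg.solve(G, WtX + rho * (Z - U))   # unconstrained H-update
                Z = np.maximum(H + U, 0.0)                    # projection absorbs H >= 0
                U = U + H - Z                                 # dual update
            return Z

     The Z-update is just a projection onto the nonnegative orthant, which is how ADMM absorbs the constraint that an unconstrained Gauss-Newton step cannot handle.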
  2. State-of-the-art seismic imaging techniques treat inversion tasks such as full-waveform inversion (FWI) and least-squares reverse time migration (LSRTM) as partial differential equation-constrained optimization problems. Due to the large-scale nature, gradient-based optimization algorithms are preferred in practice to update the model iteratively. Higher-order methods converge in fewer iterations but often require higher computational costs, more line-search steps, and bigger memory storage. A balance among these aspects has to be considered. We have conducted an evaluation using Anderson acceleration (AA), a popular strategy to speed up the convergence of fixed-point iterations, to accelerate the steepest-descent algorithm, which we innovatively treat as a fixed-point iteration. Independent of the unknown parameter dimensionality, the computational cost of implementing the method can be reduced to an extremely low-dimensional least-squares problem. The cost can be further reduced by a low-rank update. We determine the theoretical connections and the differences between AA and other well-known optimization methods such as L-BFGS and the restarted generalized minimal residual method and compare their computational cost and memory requirements. Numerical examples of FWI and LSRTM applied to the Marmousi benchmark demonstrate the acceleration effects of AA. Compared with the steepest-descent method, AA can achieve faster convergence and can provide competitive results with some quasi-Newton methods, making it an attractive optimization strategy for seismic inversion.
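     The two seismic-inversion entries in this list both treat steepest descent as a fixed-point iteration x_{k+1} = g(x_k) = x_k - step * grad(x_k) and accelerate it with Anderson acceleration, whose per-iteration extra cost is a least-squares problem whose size equals the history window, independent of the number of unknowns. Below is a minimal, generic AA sketch under those assumptions (window size m, undamped mixing, dense NumPy history); it is not the FWI/LSRTM implementation evaluated in those papers, and the function name and defaults are illustrative.

        import numpy as np

        def anderson_gd(grad, x0, step=0.1, m=5, iters=100):
            """Anderson acceleration of the gradient-descent map g(x) = x - step * grad(x),
            viewed as the fixed-point iteration x_{k+1} = g(x_k)."""
            g = lambda x: x - step * grad(x)
            x = np.asarray(x0, dtype=float).copy()
            X_hist, F_hist = [], []               # iterates and residuals f = g(x) - x
            for _ in range(iters):
                gx = g(x)
                f = gx - x
                X_hist.append(x.copy())
                F_hist.append(f.copy())
                if len(F_hist) > m + 1:           # keep a window of m differences
                    X_hist.pop(0)
                    F_hist.pop(0)
                if len(F_hist) == 1:
                    x = gx                        # plain fixed-point step until history builds up
                    continue
                dF = np.column_stack([F_hist[i + 1] - F_hist[i] for i in range(len(F_hist) - 1)])
                dX = np.column_stack([X_hist[i + 1] - X_hist[i] for i in range(len(X_hist) - 1)])
                gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)   # tiny least-squares problem
                x = x + f - (dX + dF) @ gamma                    # AA update (undamped)
            return x

     For a toy quadratic f(x) = 0.5 x^T A x - b^T x, one would call anderson_gd(lambda x: A @ x - b, np.zeros(n)); the least-squares system solved each iteration has at most m columns regardless of n, matching the low-dimensional-cost claim in the abstracts.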
  3. Full waveform inversion (FWI) and least-squares reverse time migration (LSRTM) are popular imaging techniques that can be solved as PDE-constrained optimization problems. Due to the large-scale nature, gradient- and Hessian-based optimization algorithms are preferred in practice to find the optimizer iteratively. However, a balance between the evaluation cost and the rate of convergence needs to be considered. We propose the use of Anderson acceleration (AA), a popular strategy to speed up the convergence of fixed-point iterations, to accelerate a gradient descent method. We show that AA can achieve fast convergence that provides competitive results with some quasi-Newton methods. Independent of the dimensionality of the unknown parameters, the computational cost of implementing the method can be reduced to an extremely low-dimensional least-squares problem, which makes AA an attractive method for seismic inversion.
  4. Multiple-objective optimization (MOO) aims to simultaneously optimize multiple conflicting objectives and has found important applications in machine learning, such as simultaneously minimizing classification and fairness losses. At an optimum, further optimizing one objective will necessarily increase at least one other objective, and decision-makers need to comprehensively explore multiple optima to pinpoint one final solution. We address the efficiency of exploring the Pareto front that contains all optima. First, stochastic multi-gradient descent (SMGD) is slow to converge to the Pareto front with large neural networks and datasets; instead, we explore the Pareto front as a manifold from a few initial optima, based on a predictor-corrector method. Second, for each exploration step, the predictor iteratively solves a large-scale linear system that scales quadratically in the number of model parameters and requires one backpropagation to evaluate a second-order Hessian-vector product per iteration of the solver. We propose a Gauss-Newton approximation that scales linearly and requires only first-order inner products per iteration. Third, we explore different linear system solvers, including the MINRES and conjugate gradient methods, for approximately solving the linear systems. These innovations make the predictor-corrector method efficient for large networks and datasets. Experiments on a fair misinformation detection task show that 1) the predictor-corrector method can find Pareto fronts better than or similar to SMGD with less time, and 2) the proposed first-order method does not harm the quality of the Pareto front identified by the second-order method, while further reducing running time.
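     The last entry replaces an exact Hessian-vector product with a Gauss-Newton approximation and solves the resulting linear system iteratively with conjugate gradients or MINRES. A minimal sketch of that idea for a loss of the form 0.5 * ||r(theta)||^2: the Gauss-Newton matrix is J^T J, so its product with a vector needs only first-order Jacobian products, and CG can solve the (damped) system matrix-free. The explicit Jacobian argument, the damping parameter, and the function name are illustrative assumptions; the paper's predictor-corrector works with network Jacobian-vector products rather than an explicit J.

        import numpy as np

        def gauss_newton_cg_step(J, r, damping=1e-3, cg_iters=50, tol=1e-8):
            """One Gauss-Newton step for a loss 0.5 * ||r(theta)||^2 with Jacobian J = dr/dtheta.
            Solves (J^T J + damping * I) p = -J^T r by conjugate gradients, using only
            first-order matrix-vector products (no explicit Hessian)."""
            def gn_matvec(v):                      # Gauss-Newton Hessian-vector product
                return J.T @ (J @ v) + damping * v
            b = -J.T @ r                           # negative gradient
            p = np.zeros_like(b)
            res = b - gn_matvec(p)
            d = res.copy()
            rs_old = res @ res
            for _ in range(cg_iters):
                Ad = gn_matvec(d)
                alpha = rs_old / (d @ Ad)
                p += alpha * d
                res -= alpha * Ad
                rs_new = res @ res
                if np.sqrt(rs_new) < tol:
                    break
                d = res + (rs_new / rs_old) * d
                rs_old = rs_new
            return p                               # update direction: theta <- theta + p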