NSF PAR Search | NSF Public Access Repository


Randomized Algorithms for Symmetric Nonnegative Matrix Factorization

https://doi.org/10.1137/24M1638355

Hayashi, Koby; Aksoy, Sinan G; Ballard, Grey; Park, Haesun (March 2025, SIAM Journal on Matrix Analysis and Applications)

Symmetric Nonnegative Matrix Factorization (SymNMF) is a technique in data analysis and machine learning that approximates a symmetric matrix with a product of a nonnegative, low-rank matrix and its transpose. To design faster and more scalable algorithms for SymNMF, we develop two randomized algorithms for its computation. The first algorithm uses randomized matrix sketching to compute an initial low-rank approximation to the input matrix and proceeds to rapidly compute a SymNMF of the approximation. The second algorithm uses randomized leverage score sampling to approximately solve constrained least squares problems. Many successful methods for SymNMF rely on (approximately) solving sequences of constrained least squares problems. We prove theoretically that leverage score sampling can approximately solve nonnegative least squares problems to a chosen accuracy with high probability. Additionally, we prove sampling complexity results for previously proposed hybrid sampling techniques that deterministically include high leverage score rows. This hybrid scheme is crucial for obtaining speedups in practice. Finally, we demonstrate that both methods work well in practice by applying them to graph clustering tasks on large real-world data sets. These experiments show that our methods approximately maintain solution quality and achieve significant speedups for both large dense and large sparse problems.
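The leverage score sampling idea in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`leverage_scores`, `sampled_nnls`), the problem sizes, and the sample count are all assumptions chosen for illustration, and the hybrid scheme that deterministically keeps high leverage score rows is not shown. The scores are computed exactly from a thin QR factorization, which is itself as expensive as solving the full problem, so the sketch only demonstrates the sampling and rescaling step.

```python
import numpy as np
from scipy.optimize import nnls


def leverage_scores(A):
    """Row leverage scores: squared row norms of an orthonormal basis of range(A)."""
    Q, _ = np.linalg.qr(A)  # thin QR; assumes A has full column rank
    return np.sum(Q**2, axis=1)


def sampled_nnls(A, b, num_samples, rng):
    """Approximately solve min_{x >= 0} ||Ax - b||_2 using sampled, rescaled rows."""
    scores = leverage_scores(A)
    probs = scores / scores.sum()
    # Sample rows with replacement, proportionally to their leverage scores.
    idx = rng.choice(A.shape[0], size=num_samples, replace=True, p=probs)
    # Rescale so the sampled squared residual matches the full one in expectation.
    scale = 1.0 / np.sqrt(num_samples * probs[idx])
    x, _ = nnls(A[idx] * scale[:, None], b[idx] * scale)
    return x


# Hypothetical test problem: tall matrix, nonnegative ground truth, small noise.
rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 10))
x_true = np.abs(rng.standard_normal(10))
b = A @ x_true + 0.01 * rng.standard_normal(5000)

x_full, _ = nnls(A, b)                                  # full NNLS solve
x_samp = sampled_nnls(A, b, num_samples=500, rng=rng)   # solve on 500 sampled rows
rel_err = np.linalg.norm(x_samp - x_full) / np.linalg.norm(x_full)
```

Here the sampled problem has one tenth of the rows of the full problem; `rel_err` measures how far the sampled solution is from the full NNLS solution.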
Free, publicly-accessible full text available March 31, 2026
On Rank Selection for Nonnegative Matrix Factorization

https://doi.org/10.1109/BigData62323.2024.10825324

Eswar, Srinivas; Hayashi, Koby; Cobb, Benjamin; Kannan, Ramakrishnan; Ballard, Grey; Vuduc, Richard; Park, Haesun (December 2024, IEEE)

Rank selection, i.e., the choice of factorization rank, is the first step in constructing Nonnegative Matrix Factorization (NMF) models. It is a long-standing problem that is not unique to NMF, but arises in most models that attempt to decompose data into its underlying components. Since these models are often used in the unsupervised setting, the rank selection problem is further complicated by the lack of ground truth labels. In this paper, we review and empirically evaluate the most commonly used schemes for NMF rank selection.
Full Text Available
Distributed-Memory Parallel JointNMF

https://doi.org/10.1145/3577193.3593733

Eswar, Srinivas; Cobb, Benjamin; Hayashi, Koby; Kannan, Ramakrishnan; Ballard, Grey; Vuduc, Richard; Park, Haesun (June 2023, Proceedings of the 37th International Conference on Supercomputing)

Joint Nonnegative Matrix Factorization (JointNMF) is a hybrid method for mining information from datasets that contain both feature and connection information. We propose distributed-memory parallelizations of three algorithms for solving the JointNMF problem based on Alternating Nonnegative Least Squares (ANLS), Projected Gradient Descent, and Projected Gauss-Newton. We extend well-known communication-avoiding algorithms designed for a single processor grid to our coupled case on two processor grids. We demonstrate the scalability of the algorithms on up to 960 cores (40 nodes) with 60% parallel efficiency. The more sophisticated ANLS and Gauss-Newton variants outperform the first-order gradient descent method in reducing the objective on large-scale problems. We perform a topic modelling task on a large corpus of academic papers that consists of over 37 million paper abstracts and nearly a billion citation relationships, demonstrating the utility and scalability of the methods.
Full Text Available
PLANC: Parallel Low-rank Approximation with Nonnegativity Constraints

https://doi.org/10.1145/3432185

Eswar, Srinivas; Hayashi, Koby; Ballard, Grey; Kannan, Ramakrishnan; Matheson, Michael A.; Park, Haesun (June 2021, ACM Transactions on Mathematical Software)
We consider the problem of low-rank approximation of massive dense nonnegative tensor data, for example, to discover latent patterns in video and imaging applications. As the size of data sets grows, single workstations are hitting bottlenecks in both computation time and available memory. We propose a distributed-memory parallel computing solution to handle massive data sets, loading the input data across the memories of multiple nodes, and performing efficient and scalable parallel algorithms to compute the low-rank approximation. We present a software package called Parallel Low-rank Approximation with Nonnegativity Constraints, which implements our solution and allows for extension in terms of data (dense or sparse, matrices or tensors of any order), algorithm (e.g., from multiplicative updating techniques to alternating direction method of multipliers), and architecture (we exploit GPUs to accelerate the computation in this work). We describe our parallel distributions and algorithms, which are careful to avoid unnecessary communication and computation, show how to extend the software to include new algorithms and/or constraints, and report efficiency and scalability results for both synthetic and real-world data sets.
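As a point of reference for the "multiplicative updating techniques" mentioned in this abstract, here is a minimal serial sketch of the classical multiplicative update rule for matrix NMF in Python. It is not PLANC's API or its distributed algorithm: the function name `nmf_multiplicative`, its parameters, and the test matrix are assumptions made only for illustration.

```python
import numpy as np


def nmf_multiplicative(X, rank, num_iters=200, eps=1e-12, seed=0):
    """Approximate a nonnegative matrix X by W @ H with W, H >= 0 (Frobenius norm)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))  # nonnegative random initialization
    H = rng.random((rank, n))
    for _ in range(num_iters):
        # Elementwise multiplicative updates keep the factors nonnegative;
        # eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


# Hypothetical data: an exactly rank-3 nonnegative matrix.
rng = np.random.default_rng(1)
X = rng.random((30, 3)) @ rng.random((3, 20))
W, H = nmf_multiplicative(X, rank=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

The dominant costs per iteration are the products `W.T @ X` and `X @ H.T`, which is where distributing the input across nodes, as the abstract describes, would matter for massive data.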
Full Text Available
Distributed-Memory Parallel Symmetric Nonnegative Matrix Factorization

https://doi.org/10.1109/SC41405.2020.00078

Eswar, Srinivas; Hayashi, Koby; Ballard, Grey; Kannan, Ramakrishnan; Vuduc, Richard; Park, Haesun (November 2020, SC20: International Conference for High Performance Computing, Networking, Storage and Analysis)
Full Text Available
Parallel Nonnegative CP Decomposition of Dense Tensors

https://doi.org/10.1109/HiPC.2018.00012

Ballard, Grey; Hayashi, Koby; Kannan, Ramakrishnan (December 2018, 25th IEEE International Conference on High Performance Computing)

The CP tensor decomposition is a low-rank approximation of a tensor. We present a distributed-memory parallel algorithm and implementation of an alternating optimization method for computing a CP decomposition of dense tensors that can enforce nonnegativity of the computed low-rank factors. The principal task is to parallelize the Matricized-Tensor Times Khatri-Rao Product (MTTKRP) bottleneck subcomputation. The algorithm is computation efficient, using dimension trees to avoid redundant computation across MTTKRPs within the alternating method. Our approach is also communication efficient, using a data distribution and parallel algorithm across a multidimensional processor grid that can be tuned to minimize communication. We benchmark our software on synthetic as well as hyperspectral image and neuroscience dynamic functional connectivity data, demonstrating that our algorithm scales well to 100s of nodes (up to 4096 cores) and is faster and more general than the currently available parallel software.
Full Text Available
Shared-memory parallelization of MTTKRP for dense tensors

https://doi.org/10.1145/3178487.3178522

Hayashi, Koby; Ballard, Grey; Jiang, Yujie; Tobia, Michael J. (January 2018, 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming)

The matricized-tensor times Khatri-Rao product (MTTKRP) is the computational bottleneck for algorithms computing CP decompositions of tensors. In this work, we develop shared-memory parallel algorithms for MTTKRP involving dense tensors. The algorithms cast nearly all of the computation as matrix operations in order to use optimized BLAS subroutines, and they avoid reordering tensor entries in memory. We use our parallel implementation to compute a CP decomposition of a neuroimaging data set and achieve a speedup of up to 7.4X over existing parallel software.
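To make the MTTKRP operation concrete, the following Python sketch computes the mode-1 MTTKRP of a dense 3-way tensor in two ways: as a single contraction, and as the textbook product of the mode-1 unfolding with a Khatri-Rao product. It is a serial reference, not the paper's shared-memory algorithm; the function names and tensor sizes are assumptions, and the explicit version copies and reorders tensor entries, which is exactly what the algorithms described above are designed to avoid.

```python
import numpy as np


def mttkrp_mode1(X, B, C):
    """Mode-1 MTTKRP: M[i, r] = sum over j, k of X[i, j, k] * B[j, r] * C[k, r]."""
    return np.einsum("ijk,jr,kr->ir", X, B, C)


def mttkrp_mode1_explicit(X, B, C):
    """Same result via the mode-1 unfolding times the Khatri-Rao product of C and B."""
    I, J, K = X.shape
    R = B.shape[1]
    # Mode-1 unfolding: column index is j + J*k (this reshape copies the data).
    X1 = X.reshape(I, J * K, order="F")
    # Khatri-Rao product: row k*J + j holds the elementwise product C[k, :] * B[j, :].
    KR = (C[:, None, :] * B[None, :, :]).reshape(K * J, R)
    return X1 @ KR


# Hypothetical sizes: a 4 x 5 x 6 tensor and rank-3 factor matrices.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 6))
B = rng.standard_normal((5, 3))
C = rng.standard_normal((6, 3))

M_fast = mttkrp_mode1(X, B, C)
M_ref = mttkrp_mode1_explicit(X, B, C)
```

Both functions return a 4 x 3 matrix. The explicit form shows why the operation can be cast as a matrix multiplication and handed to BLAS, at the cost of forming the Khatri-Rao product and the unfolded tensor.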
Full Text Available
Dynamic functional connectivity and individual differences in emotions during social stress: Stress and Brain Synchrony

https://doi.org/10.1002/hbm.23821

Tobia, Michael J.; Hayashi, Koby; Ballard, Grey; Gotlib, Ian H.; Waugh, Christian E. (December 2017, Human Brain Mapping)
