skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Approximation Algorithms for Orthogonal Non-negative Matrix Factorization
In the non-negative matrix factorization (NMF) problem, the input is an m×n matrix M with non-negative entries and the goal is to factorize it as M ≈ AW. The m × k matrix A and the k × n matrix W are both constrained to have non-negative entries. This is in contrast to singular value decomposition, where the matrices A and W can have negative entries but must satisfy the orthogonality constraint: the columns of A are orthogonal and the rows of W are also orthogonal. The orthogonal non-negative matrix factorization (ONMF) problem imposes both the non-negativity and the orthogonality constraints, and previous work showed that it leads to better performances than NMF on many clustering tasks. We give the first constant-factor approximation algorithm for ONMF when one or both of A and W are subject to the orthogonality constraint. We also show an interesting connection to the correlation clustering problem on bipartite graphs. Our experiments on synthetic and real-world data show that our algorithm achieves similar or smaller errors compared to previous ONMF algorithms while ensuring perfect orthogonality (many previous algorithms do not satisfy the hard orthogonality constraint).  more » « less
Award ID(s):
1908774
PAR ID:
10299301
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proeedings of the International Workshop on Artificial Intelligence and Statistics
Volume:
130
ISSN:
1525-531X
Page Range / eLocation ID:
2728-2736
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Non-negative matrix factorization (NMF), the problem of finding two non-negative low-rank factors whose product approximates an input matrix, is a useful tool for many data mining and scientific applications such as topic modeling in text mining and unmixing in microscopy. In this paper, we focus on scaling algorithms for NMF to very large sparse datasets and massively parallel machines by employing effective algorithms, communication patterns, and partitioning schemes that leverage the sparsity of the input matrix. We consider two previous works developed for related problems,one that uses a fine-grained partitioning strategy using a point-to-point communication pattern and one that uses a Cartesian, or checkerboard, partitioning strategy using a collective-based communication pattern. We show that a combination of the previous approaches balances the demands of the various computations within NMF algorithms and achieves high efficiency and scalability. From the experiments, we see that our proposed strategy runs up to 10x faster than the state of the art on real-world datasets. 
    more » « less
  2. null (Ed.)
    Nonnegative Matrix Factorization (NMF) is broadly used to determine class membership in a variety of clustering applications. From movie recommendations and image clustering to visual feature extractions, NMF has applications to solve a large number of knowledge discovery and data mining problems. Traditional optimization methods, such as the Multiplicative Updating Algorithm (MUA), solves the NMF problem by utilizing an auxiliary function to ensure that the objective monotonically decreases. Although the objective in MUA converges, there exists no proof to show that the learned matrix factors converge as well. Without this rigorous analysis, the clustering performance and stability of the NMF algorithms cannot be guaranteed. To address this knowledge gap, in this article, we study the factor-bounded NMF problem and provide a solution algorithm with proven convergence by rigorous mathematical analysis, which ensures that both the objective and matrix factors converge. In addition, we show the relationship between MUA and our solution followed by an analysis of the convergence of MUA. Experiments on both toy data and real-world datasets validate the correctness of our proposed method and its utility as an effective clustering algorithm. 
    more » « less
  3. QR factorization is a key tool in mathematics, computer science, operations research, and engineering. This paper presents the roundoff-error-free (REF) QR factorization framework comprising integer-preserving versions of the standard and the thin QR factorizations and associated algorithms to compute them. Specifically, the standard REF QR factorization factors a given matrix $$A \in \Z^{m \times n}$$ as $A=QDR$, where $$Q \in \Z^{m \times m}$$ has pairwise orthogonal columns, $$D$$ is a diagonal matrix, and $$R \in \Z^{m \times n}$$ is an upper trapezoidal matrix; notably, the entries of $$Q$$ and $$R$$ are integral, while the entries of $$D$$ are reciprocals of integers. In the thin REF QR factorization, $$Q \in \Z^{m \times n}$$ also has pairwise orthogonal columns, and $$R \in \Z^{n \times n}$$ is also an upper triangular matrix. In contrast to traditional (i.e., floating-point) QR factorizations, every operation used to compute these factors is integral; thus, REF QR is guaranteed to be an exact orthogonal decomposition. Importantly, the bit-length of every entry in the REF QR factorizations (and within the algorithms to compute them) is bounded polynomially. Notable applications of our REF QR factorizations include finding exact least squares or exact basic solutions (i.e., a rational n-dimensional vector $$x$$) to any given full column rank or rank deficient linear system $A x = b$, respectively. In addition, our exact factorizations can be used as a subroutine within exact and/or high-precision quadratic programming. Altogether, REF QR provides a framework to obtain exact orthogonal factorizations of any rational matrix (as any rational/decimal matrix can be easily transformed into an integral matrix). 
    more » « less
  4. This article reports the study of algorithms for non-negative matrix factorization (NMF) in various applications involving smoothly varying data such as time or temperature series diffraction data on a dense grid of points. Utilizing the continual nature of the data, a fast two-stage algorithm is developed for highly efficient and accurate NMF. In the first stage, an alternating non-negative least-squares framework is used in combination with the active set method with a warm-start strategy for the solution of subproblems. In the second stage, an interior point method is adopted to accelerate the local convergence. The convergence of the proposed algorithm is proved. The new algorithm is compared with some existing algorithms in benchmark tests using both real-world data and synthetic data. The results demonstrate the advantage of the algorithm in finding high-precision solutions. 
    more » « less
  5. Non-negative matrix factorization (NMF) is a highly celebrated algorithm for matrix decomposition that guarantees non-negative factors. The underlying optimization problem is computationally intractable, yet in practice, gradient-descent-based methods often find good solutions. In this paper, we revisit the NMF optimization problem and analyze its loss landscape in non-worst-case settings. It has recently been observed that gradients in deep networks tend to point towards the final minimizer throughout the optimization procedure. We show that a similar property holds (with high probability) for NMF, provably in a non-worst case model with a planted solution, and empirically across an extensive suite of real-world NMF problems. Our analysis predicts that this property becomes more likely with growing number of parameters, and experiments suggest that a similar trend might also hold for deep neural networks---turning increasing dataset sizes and model sizes into a blessing from an optimization perspective. 
    more » « less