K-means clustering is a widely used machine learning method for identifying patterns in large datasets. Recently, semidefinite programming (SDP) relaxations have been proposed for solving the K-means optimization problem, which enjoy strong statistical optimality guarantees. However, the prohibitive cost of implementing an SDP solver renders these guarantees inaccessible to practical datasets. In contrast, nonnegative matrix factorization (NMF) is a simple clustering algorithm widely used by machine learning practitioners, but it lacks a solid statistical underpinning and theoretical guarantees. In this paper, we consider an NMF-like algorithm that solves a nonnegative low-rank restriction of the SDP-relaxed K-means formulation using a nonconvex Burer--Monteiro factorization approach. The resulting algorithm is as simple and scalable as state-of-the-art NMF algorithms while also enjoying the same strong statistical optimality guarantees as the SDP. In our experiments, we observe that our algorithm achieves significantly smaller mis-clustering errors compared to the existing state-of-the-art while maintaining scalability.
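As a rough illustration of the Burer--Monteiro idea described above: with affinity matrix A = XXᵀ, the SDP variable Z is replaced by a nonnegative low-rank factorization Z = UUᵀ and optimized directly over U. The sketch below is a minimal caricature of this approach, not the paper's actual algorithm: the SDP constraints Z1 = 1 and tr(Z) = k are only crudely approximated by a heuristic row normalization, and the step size is illustrative.

```python
import numpy as np

def bm_kmeans(X, k, n_iter=500, lr=1e-3, seed=0):
    """Projected gradient ascent on f(U) = <XX^T, U U^T> over U >= 0,
    a caricature of the Burer-Monteiro factorization Z = U U^T of the
    K-means SDP. The constraints Z1 = 1 and tr(Z) = k are only
    approximated here by a heuristic row normalization of U."""
    rng = np.random.default_rng(seed)
    A = X @ X.T                                   # pairwise affinity matrix
    U = np.abs(rng.standard_normal((X.shape[0], k)))
    for _ in range(n_iter):
        U += lr * 2 * A @ U                       # gradient of <A, U U^T> w.r.t. U
        U = np.maximum(U, 0.0)                    # keep the low-rank factor nonnegative
        U /= np.maximum(U.sum(axis=1, keepdims=True), 1e-12)
    return U.argmax(axis=1)                       # read cluster labels off the factor
```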
Analysis of student essays in an introductory physics course using natural language processing
We analyzed essays written on various topics in an introductory physics course using two unsupervised machine learning algorithms. One was Latent Dirichlet Allocation (LDA), an algorithm for extracting abstract topics from a collection of text documents. The other was Non-negative Matrix Factorization (NMF), which is used for similar purposes but also in other domains such as image recognition. We applied the two algorithms to a dataset consisting of N = 683 student essays. Although there are some important built-in differences between LDA and NMF, they by and large found similar topics in our data. This offers instructors a promising and productive way of accessing useful information about their students' written work, especially in large-enrollment classes.
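A minimal sketch of this kind of topic-model comparison using scikit-learn; the placeholder essays, topic count, and vectorizer settings are illustrative assumptions, not the authors' actual pipeline.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation, NMF

essays = [                      # placeholder documents, not the real dataset
    "energy is conserved as the ball falls and speeds up",
    "the net force on the cart determines its acceleration",
    "potential energy converts to kinetic energy near the ground",
    "friction is a contact force that opposes the cart's motion",
]
n_topics = 2                    # illustrative; chosen per dataset in practice

# LDA models raw term counts; NMF is commonly paired with TF-IDF weights.
counts = CountVectorizer(stop_words="english").fit_transform(essays)
tfidf_vectorizer = TfidfVectorizer(stop_words="english")
tfidf = tfidf_vectorizer.fit_transform(essays)

lda = LatentDirichletAllocation(n_components=n_topics, random_state=0).fit(counts)
nmf = NMF(n_components=n_topics, random_state=0).fit(tfidf)

# Inspect the top words of each NMF topic for comparison with LDA's topics.
terms = tfidf_vectorizer.get_feature_names_out()
for i, component in enumerate(nmf.components_):
    top = component.argsort()[::-1][:5]
    print(f"topic {i}:", [terms[j] for j in top])
```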
- Award ID(s): 2300645
- PAR ID: 10610778
- Publisher / Repository: American Association of Physics Teachers
- Date Published:
- Page Range / eLocation ID: 58 to 63
- Format(s): Medium: X
- Location: Sacramento, CA
- Sponsoring Org: National Science Foundation
More Like this
In this paper we propose a quasi-Newton algorithm for the celebrated nonnegative matrix factorization (NMF) problem. The proposed algorithm falls into the general framework of Gauss-Newton and Levenberg-Marquardt methods; however, these methods cannot directly handle the nonnegativity constraints present in NMF. One of the key contributions of this paper is to apply the alternating direction method of multipliers (ADMM) to obtain the iterative update from this Gauss-Newton-like algorithm. Furthermore, we carefully study the structure of the Jacobian Gramian matrix given by the Gauss-Newton updates and design a way of exactly inverting this matrix with complexity $\mathcal{O}(mnk)$, a significant reduction from the $\mathcal{O}((m+n)^3k^3)$ complexity of a naive implementation. The resulting algorithm, which we call NLS-ADMM, enjoys the fast convergence rate brought by the quasi-Newton algorithmic framework while maintaining a low per-iteration complexity similar to that of alternating algorithms. Numerical experiments on synthetic data confirm the efficiency of our proposed algorithm.
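Since this abstract centers on using ADMM to impose nonnegativity inside a Newton-type step, a minimal sketch of that splitting idea on a single nonnegative least-squares subproblem may help. This is an illustration of the constraint handling only, under my own simplified formulation, not the paper's full NLS-ADMM algorithm.

```python
import numpy as np

def nnls_admm(W, V, rho=1.0, n_iter=100):
    """ADMM for min_{H >= 0} 0.5 * ||V - W H||_F^2, via the splitting
    H = Z with Z constrained to the nonnegative orthant. Illustrates
    the constraint handling only, not the full NLS-ADMM scheme."""
    k, n = W.shape[1], V.shape[1]
    Z = np.zeros((k, n))                 # nonnegative copy of H
    U = np.zeros((k, n))                 # scaled dual variable
    G = W.T @ W + rho * np.eye(k)        # penalized Gramian; factor once
    L = np.linalg.cholesky(G)
    WtV = W.T @ V
    for _ in range(n_iter):
        rhs = WtV + rho * (Z - U)        # H-update: ridge-regularized LS
        H = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        Z = np.maximum(H + U, 0.0)       # Z-update: project onto H >= 0
        U += H - Z                       # dual ascent on the consensus H = Z
    return Z
```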
Nonnegative Matrix Factorization (NMF) is broadly used to determine class membership in a variety of clustering applications. From movie recommendations and image clustering to visual feature extraction, NMF can be applied to a large number of knowledge discovery and data mining problems. Traditional optimization methods, such as the Multiplicative Updating Algorithm (MUA), solve the NMF problem by utilizing an auxiliary function that ensures the objective decreases monotonically. Although the objective in MUA converges, there exists no proof that the learned matrix factors converge as well. Without this rigorous analysis, the clustering performance and stability of NMF algorithms cannot be guaranteed. To address this knowledge gap, in this article we study the factor-bounded NMF problem and provide a solution algorithm with convergence proven by rigorous mathematical analysis, which ensures that both the objective and the matrix factors converge. In addition, we show the relationship between MUA and our solution, followed by an analysis of the convergence of MUA. Experiments on both toy data and real-world datasets validate the correctness of our proposed method and its utility as an effective clustering algorithm.
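For reference, the classic multiplicative updates (Lee--Seung) that MUA-style analyses build on look like the following. This is a plain illustrative baseline without the factor bounds the article introduces to obtain provable factor convergence.

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=200, eps=1e-10, seed=0):
    """Classic multiplicative updates for min_{W,H >= 0} ||V - W H||_F^2.
    Each factor is multiplied by a ratio that equals 1 at a stationary
    point, so the objective is non-increasing; note this alone does not
    prove that W and H themselves converge, which is the article's point."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```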
Linear discriminant analysis (LDA) is widely used for dimensionality reduction in supervised learning settings. The traditional LDA objective minimizes a ratio of squared Euclidean distances, which may not perform optimally on noisy datasets. Multiple robust LDA objectives have been proposed to address this problem, but their implementations have two major limitations. First, their mean calculations use the squared ℓ2-norm distance to center the data, which is not valid when the objective does not use the Euclidean distance. Second, there is no generalized optimization algorithm for solving the different robust LDA objectives. In addition, most existing algorithms can only guarantee a locally optimal solution rather than a globally optimal one. In this paper, we review multiple robust loss functions and propose a new, generalized robust objective for LDA. Moreover, to better remove the mean from the data, our objective centers the data optimally through learning. As an important algorithmic contribution, we derive an efficient iterative algorithm to optimize the resulting non-smooth and non-convex objective function. We theoretically prove that our solution algorithm guarantees that both the objective and the solution sequences converge to globally optimal solutions at a sub-linear convergence rate. The experimental results demonstrate the effectiveness of our new method, achieving significant improvements over the competing methods.
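For context, a minimal sketch of the classical (non-robust) LDA projection that such robust objectives generalize: build within- and between-class scatter matrices and take the top generalized eigenvectors. This is a textbook baseline, not the paper's robust objective or algorithm; the small ridge term is an assumption added for numerical stability.

```python
import numpy as np
from scipy.linalg import eigh

def lda_project(X, y, dim):
    """Classical LDA: project onto the top generalized eigenvectors of
    the between-class scatter Sb relative to the within-class scatter Sw.
    The small ridge on Sw is an added assumption for numerical stability."""
    d = X.shape[1]
    mu = X.mean(axis=0)
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    vals, vecs = eigh(Sb, Sw + 1e-8 * np.eye(d))  # generalized eigenproblem
    order = np.argsort(vals)[::-1][:dim]          # largest ratios first
    return X @ vecs[:, order]
```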