Sparse tensor factorization is a popular tool in multi-way data analysis and is used in applications such as cybersecurity, recommender systems, and social network analysis. In many of these applications, the tensor is not known a priori and instead arrives in a streaming fashion for a potentially unbounded amount of time. Existing approaches for streaming sparse tensors are not practical for unbounded streaming because they rely on maintaining the full factorization of the data, which grows linearly with time. In this work, we present CP-stream, an algorithm for streaming factorization in the model of the canonical polyadic decomposition which does not grow linearly in time or space, and is thus practical for long-term streaming. Additionally, CP-stream incorporates user-specified constraints such as non-negativity which aid in the stability and interpretability of the factorization. An evaluation of CP-stream demonstrates that it converges faster than state-of-the-art streaming algorithms while achieving lower reconstruction error by an order of magnitude. We also evaluate it on real-world sparse datasets and demonstrate its usability in both network traffic analysis and discussion tracking. Our evaluation uses exclusively public datasets and our source code is released to the public as part of SPLATT, an open source high-performance tensor factorization toolkit.
more »
« less
FACTORIZATION LENGTH DISTRIBUTION FOR AFFINE SEMIGROUPS III: MODULAR EQUIDISTRIBUTION FOR NUMERICAL SEMIGROUPS WITH ARBITRARILY MANY GENERATORS
Abstract For numerical semigroups with a specified list of (not necessarily minimal) generators, we describe the asymptotic distribution of factorization lengths with respect to an arbitrary modulus. In particular, we prove that the factorization lengths are equidistributed across all congruence classes that are not trivially ruled out by modular considerations.
more »
« less
- Award ID(s):
- 1800123
- PAR ID:
- 10340356
- Date Published:
- Journal Name:
- Journal of the Australian Mathematical Society
- Volume:
- 113
- Issue:
- 1
- ISSN:
- 1446-7887
- Page Range / eLocation ID:
- 21 to 35
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Several recent papers have examined a rational polyhedronPmwhose integer points are in bijection with the numerical semigroups (cofinite, additively closed subsets of the non-negative integers) containingm. A combinatorial description of the faces ofPmwas recently introduced, one that can be obtained from the divisibility posets of the numerical semigroups a given face contains. In this paper, we study the faces ofPmcontaining arithmetical numerical semigroups and those containing certain glued numerical semigroups, as an initial step towards better understanding the full face structure ofPm. In most cases, such faces only contain semigroups from these families, yielding a tight connection to the geometry ofPm.more » « less
-
The factorization method was introduced by Schrödinger in 1940. Its use in bound-state problems is widely known, including in supersymmetric quantum mechanics; one can create a factorization chain, which simultaneously solves a sequence of auxiliary Hamiltonians that share common eigenvalues with their adjacent Hamiltonians in the chain, except for the lowest eigenvalue. In this work, we generalize the factorization method to continuum energy eigenstates. Here, one does not generically have a factorization chain—instead all energies are solved using a “single-shot factorization”, enabled by writing the superpotential in a form that includes the logarithmic derivative of a confluent hypergeometric function. The single-shot factorization approach is an alternative to the conventional method of “deriving a differential equation and looking up its solution”, but it does require some working knowledge of confluent hypergeometric functions. This can also be viewed as a method for solving the Ricatti equation needed to construct the superpotential.more » « less
-
Rank selection, i.e. the choice of factorization rank, is the first step in constructing Nonnegative Matrix Factorization (NMF) models. It is a long-standing problem which is not unique to NMF, but arises in most models which attempt to decompose data into its underlying components. Since these models are often used in the unsupervised setting, the rank selection problem is further complicated by the lack of ground truth labels. In this paper, we review and empirically evaluate the most commonly used schemes for NMF rank selection.more » « less
An official website of the United States government

